Hi all,
I'm trying to build a Lojban speech recognizer called tersku. Instead of building an acoustic model from scratch (which would need a lot of manpower and take a long time), the idea is to take the English acoustic model (which is pretty mature) and adapt it to Lojban sounds.
A running prototype can be found at
https://git.null.tl/tersku.git (use
git://git.null.tl/tersku.git to clone). The prototype uses an unmodified version of CMU's generic English acoustic model, with only the dictionary and grammar needed to parse the text "le tanxe be le birka cu cpana le tanxe be le botpi". To use it, record yourself reading that text, convert the recording to WAV format, and replace the /resources/org/lojban/tersku/recording.wav file with it. The program will output the best "hypothesis" for the text.
The program does not work very well yet, so there is a lot of work left to do and I would appreciate your help. Below are some details on what needs to be done.

About the Program
tersku uses CMU's Sphinx speech recognition engine. You can find Sphinx's tutorials and documentation at
http://cmusphinx.sourceforge.net.

Adapt the Acoustic Model
The adaptation requires some 16 kHz single-channel WAV recordings. Help would be appreciated if someone can create a collection of Lojban phrase recordings. Note that such a collection would benefit the whole Lojban community, not just the speech recognition program :)
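
For anyone contributing recordings, here is a minimal sketch of how a file could be checked against the two requirements above (16 kHz, single channel) before it is submitted. It uses only Python's standard wave module; the function name check_recording is made up for this example and is not part of tersku.

```python
import wave


def check_recording(path, rate=16000, channels=1):
    """Return a list of problems found in a WAV file (empty list = looks fine).

    Hypothetical helper, not part of tersku. It only checks the sample rate
    and the channel count mentioned above, nothing else.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != rate:
            problems.append(
                "sample rate is %d Hz, expected %d Hz" % (wav.getframerate(), rate)
            )
        if wav.getnchannels() != channels:
            problems.append(
                "file has %d channel(s), expected %d" % (wav.getnchannels(), channels)
            )
    return problems
```

Calling check_recording("recording.wav") returns an empty list when both properties match, and otherwise one message per mismatch.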

Finish the Dictionary
The dictionary in the prototype is located at resources/org/lojban/tersku/jbo-1.dict. Because we are trying to adapt the English acoustic model, all the phones are represented in Arpabet (
https://en.wikipedia.org/wiki/Arpabet). We will need to a) confirm which Arpabet symbol represents which Lojban sound, and b) write a program that generates an entry for every word in the form "[lojban word] [arpabet symbols]". This probably depends on how the adaptation of the acoustic model goes.
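
To make step b) concrete, here is a rough sketch of what such a generator could look like. Please treat the letter-to-Arpabet table as a placeholder: every entry is my unconfirmed guess (in particular, English has no real equivalent of Lojban "x", so HH is only a stand-in), and confirming the table is exactly step a). The names LETTERS, DIPHTHONGS, to_arpabet and dict_line are invented for this example.

```python
# ASSUMPTION: all mappings below are unconfirmed guesses, not a vetted table.
DIPHTHONGS = {"ai": "AY", "au": "AW", "ei": "EY", "oi": "OY"}
LETTERS = {
    "a": "AA", "e": "EH", "i": "IY", "o": "OW", "u": "UW", "y": "AH",
    "b": "B", "c": "SH", "d": "D", "f": "F", "g": "G", "j": "ZH",
    "k": "K", "l": "L", "m": "M", "n": "N", "p": "P", "r": "R",
    "s": "S", "t": "T", "v": "V", "z": "Z",
    "x": "HH",  # placeholder: no close English phone
    "'": "HH",
}


def to_arpabet(word):
    """Convert one Lojban word into a list of Arpabet symbols."""
    phones = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in DIPHTHONGS:
            # Try the two-letter diphthongs before single letters.
            phones.append(DIPHTHONGS[pair])
            i += 2
            continue
        ch = word[i]
        if ch in LETTERS:
            phones.append(LETTERS[ch])
        elif ch not in ".,":
            # Pauses and syllable separators are skipped; anything else is an error.
            raise ValueError("unexpected character %r in %r" % (ch, word))
        i += 1
    return phones


def dict_line(word):
    """Build one dictionary entry: '[lojban word] [arpabet symbols]'."""
    return word + " " + " ".join(to_arpabet(word))
```

With this guessed table, dict_line("tanxe") gives "tanxe T AA N HH EH". Because Lojban spelling is phonemic, a per-letter table like this should be enough to cover a whole word list once the mapping itself has been confirmed.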

Finish the Grammar
The grammar needs to be written in JSGF format (
http://cmusphinx.sourceforge.net/wiki/tutoriallm). This hasn't been started yet, so help is needed here!

Correct Me!
There must be mistakes and errors both in the code and in the recognition details (I'm new to speech recognition!).
Feel free to reach me at this email address or by opening a task at
https://phabricator.null.tl. I'm really looking forward to a Lojban speech recognition tool, because it should be one of the features of Lojban :)
Wei
mu'o mi'e la sorpa'as