From: Pierre Abbat
To: lojban@yahoogroups.com
Subject: Re: [lojban] speech recognition
Date: Fri, 20 Dec 2002 20:51:02 -0500
Message-Id: <0212202051020H.17068@neofelis>

On Friday 20 December 2002 20:10, eerotorri wrote:
> Hi (yes, I'm new to this list),
>
> I'm investigating if lojban would be the thing that I should invest
> some of my time. I'm planning to use the fine speech recognition
> package of
>
> http://www.speech.cs.cmu.edu/sphinx/
>
> teach it to understand lojban and hook this up to CMUCL to parse
> things and produce a nice, extensible and robust command language for
> computers.(Why, perhaps you could call me a geek :-)
>
> Now, I'd like to hear your opinion if this is something that already
> exists in some form or if there would be an easier way without
> learning lojban and instead do X
>
> Other thing is that I saw in some document a note that lojban should
> be easy to recognize with the speech recognition algorithms. Has this
> been tested or has the set of phonemes been somehow selected from
> statistical information to be optimally distributed in the recognition
> space?
>
> If my question makes no sense to you it might be because I'm not
> actually a specialist in languages nor in speech recognition
> technology but rather a programmer.

There has been some work toward synthesis of Lojban speech. You may want to
look at the list of phones used for that.

I am currently revising and attempting to prove the validity of the valfendi
algorithm. So far I have a program that lexes cmene and cmavo; I still have
brivla to work on, and there are some complicated rules. One small phoneme
change can turn a validly lexed word into gibberish or another valid word.
Some examples:

/varKIClafLO'i/ is lexed as either an error {varkicla *flo'i} or the correct
{varkiclaflo'i} "hovercraft" (I'm not sure which, since I haven't programmed
that part of the algorithm yet, but there is some provision for guessing
secondary stress).

/varKIClafLOxi/ is lexed as the meaningless {varkicla floxi}.

/noltroNI'u/ and /noltruNI'u/ are both valid words. Both appear in Alice in
Wonderland. The Duchess is called the first and the Queen the second.

phma
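
To make the three word classes concrete, here is a minimal sketch in Python. It is not the valfendi algorithm and not the lexer described above. It assumes the input is already cut into single words and applies only the coarsest morphology rules: a cmene ends in a consonant, a brivla ends in a vowel and has a consonant pair within its first five letters (y and apostrophe not counted), and anything else is treated as a cmavo. The names classify and differ_by_one are made up for this illustration, and the finer brivla rules (permissible clusters, stress, lujvo versus cmavo-plus-gismu ambiguity) are ignored.

```python
VOWELS = set("aeiou")
CONSONANTS = set("bcdfgjklmnprstvxz")
ALLOWED = VOWELS | CONSONANTS | {"y", "'"}


def classify(word: str) -> str:
    """Rough class of one already-segmented Lojban word (illustrative only)."""
    w = word.lower().strip(".")
    if not w or any(ch not in ALLOWED for ch in w):
        return "invalid"
    if w[-1] in CONSONANTS:
        return "cmene"
    # y and apostrophe are not counted as letters when looking for the pair.
    letters = [ch for ch in w if ch in VOWELS or ch in CONSONANTS]
    head = letters[:5]
    has_pair = any(
        a in CONSONANTS and b in CONSONANTS for a, b in zip(head, head[1:])
    )
    return "brivla" if has_pair else "cmavo"


def differ_by_one(a: str, b: str) -> bool:
    """True if two equal-length strings differ in exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


if __name__ == "__main__":
    for w in ["varkiclaflo'i", "varkicla", "floxi", "noltroni'u", "noltruni'u",
              "lojban", "le", "lo'i"]:
        print(w, classify(w))
    print(differ_by_one("varkiclaflo'i", "varkiclafloxi"))
    print(differ_by_one("noltroni'u", "noltruni'u"))
```

Under this rough rule {varkiclaflo'i}, {varkicla}, {floxi}, {noltroni'u} and {noltruni'u} all come out brivla-shaped, so the shape test cannot tell the one-word reading from the two-word reading. That choice depends on stress and on the segmentation rules, and each pair of examples above differs by a single symbol.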