From phma@webjockey.net Fri Dec 20 17:51:05 2002
Return-Path: <phma@ixazon.dynip.com>
X-Sender: phma@ixazon.dynip.com
X-Apparently-To: lojban@yahoogroups.com
Received: (EGP: mail-8_2_3_0); 21 Dec 2002 01:51:05 -0000
Received: (qmail 75906 invoked from network); 21 Dec 2002 01:51:05 -0000
Received: from unknown (66.218.66.218)
  by m11.grp.scd.yahoo.com with QMQP; 21 Dec 2002 01:51:05 -0000
Received: from unknown (HELO neofelis.ixazon.lan) (208.150.110.21)
  by mta3.grp.scd.yahoo.com with SMTP; 21 Dec 2002 01:51:04 -0000
Received: by neofelis.ixazon.lan (Postfix, from userid 500)
  id E88DD3C477; Fri, 20 Dec 2002 20:51:03 -0500 (EST)
Content-Type: text/plain;
  charset="iso-8859-1"
To: lojban@yahoogroups.com
Subject: Re: [lojban] speech recognition
Date: Fri, 20 Dec 2002 20:51:02 -0500
X-Mailer: KMail [version 1.2]
References: <au0f1k+ro3s@eGroups.com>
In-Reply-To: <au0f1k+ro3s@eGroups.com>
X-Spamtrap: fesmri@ixazon.dynip.com
MIME-Version: 1.0
Message-Id: <0212202051020H.17068@neofelis>
Content-Transfer-Encoding: 8bit
Sender: phma@ixazon.dynip.com
From: Pierre Abbat <phma@webjockey.net>
X-Yahoo-Group-Post: member; u=92712300

On Friday 20 December 2002 20:10, eerotorri wrote:
> Hi (yes, I'm new to this list),
>
> I'm investigating if lojban would be the thing that I should invest
> some of my time. I'm planning to use the fine speech recognition
> package of
>
> http://www.speech.cs.cmu.edu/sphinx/
>
> teach it to understand lojban and hook this up to CMUCL to parse
> things and produce a nice, extensible and robust command language for
> computers.(Why, perhaps you could call me a geek :-)
>
> Now, I'd like to hear your opinion if this is something that already
> exists in some form or if there would be an easier way without
> learning lojban and instead do X
>
> Other thing is that I saw in some document a note that lojban should
> be easy to recognize with the speech recognition algorithms. Has this
> been tested or has the set of phonemes been somehow selected from
> statistical information to be optimally distributed in the recognition
> space?
>
> If my question makes no sense to you it might be because I'm not
> actually a specialist in languages nor in speech recognition
> technology but rather a programmer.

There has been some work toward synthesis of Lojban speech. You may want to 
look at the list of phones used for that.

I am currently revising and attempting to prove the validity of the valfendi 
algorithm. So far I have a program that lexes cmene and cmavo; I still have 
brivla to work on, and there are some complicated rules. One small phoneme 
change can turn a validly lexed word into gibberish or another valid word. 
Some examples:

/varKIClafLO'i/ is lexed as either an error {varkicla *flo'i} or the correct 
{varkiclaflo'i} "hovercraft" (I'm not sure which, since I haven't programmed 
that part of the algorithm yet, but there is some provision for guessing 
secondary stress). /varKIClafLOxi/ is lexed as the meaningless {varkicla 
floxi}.
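
To illustrate why {floxi} presumably gets through as a word while {flo'i}
alone does not, here is a minimal sketch of a gismu-shape test. It is not
the valfendi algorithm; the names cv_pattern and has_gismu_shape are
made up for this example, and the check ignores the rules about which
consonant pairs are permissible.

```python
VOWELS = set("aeiou")
CONSONANTS = set("bcdfgjklmnprstvxz")


def cv_pattern(word):
    """Map each letter to C or V; leave anything else (such as ') as is."""
    out = []
    for ch in word:
        if ch in VOWELS:
            out.append("V")
        elif ch in CONSONANTS:
            out.append("C")
        else:
            out.append(ch)
    return "".join(out)


def has_gismu_shape(word):
    """True if the word has one of the two five-letter gismu shapes.

    Simplification: permissible initial and medial consonant pairs
    are not checked, so this accepts some strings a real lexer rejects.
    """
    return cv_pattern(word) in ("CVCCV", "CCVCV")


for w in ("floxi", "flo'i", "varkicla"):
    print(w, cv_pattern(w), has_gismu_shape(w))
```

A word can have a valid shape and still mean nothing, which is the
situation with {floxi}.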

/noltroNI'u/ and /noltruNI'u/ are both valid words, and both appear in the
Lojban translation of Alice in Wonderland: the Duchess is called the first
and the Queen the second.
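
A quick way to see how close these pairs are is to compare them letter by
letter. This is only a sketch for equal-length spellings, with a helper
name (differing_positions) invented for the example; a recognizer would
work on phones, not letters.

```python
def differing_positions(a, b):
    """Return the indices at which two equal-length spellings differ."""
    if len(a) != len(b):
        raise ValueError("spellings must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


pairs = [
    ("noltroni'u", "noltruni'u"),
    ("varkiclaflo'i", "varkiclafloxi"),
]
for a, b in pairs:
    for i in differing_positions(a, b):
        print(f"{a} / {b}: position {i}, {a[i]!r} vs {b[i]!r}")
```

Each pair differs in exactly one position, so a single misheard phoneme
is enough to move from one lexing to the other.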

phma

