From: Arnt Richard Johansen (arntrich@stud.ntnu.no)
Date: Fri, 16 Feb 2001 14:47:33 +0100
To: lojban@yahoogroups.com
Subject: Re: [lojban] speech synthesizer

[Word-based Open Source Speech Synthesis]

>Isn't this somewhat, well, not the way to do it, regarding Lojban's
>phonology/morphology? I mean, it would be much easier just to program
>simple syllable-making functions, and let the program do the (amazing
>difficult) task of splitting the words into syllables.

I shall be the first one to admit that a project like this is futile. However, doing it The Right Way(tm) is beyond our capabilities as well.

Splitting Lojban *text* into syllables is trivial. What is highly difficult, however, is segmenting recorded speech in such a way that it sounds reasonably natural when it is pieced back together.
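
To make the "trivial" half concrete, here is a minimal Python sketch of a text syllabifier. It is an illustration only, not code from the project under discussion. The function name `syllabify` and the splitting rule are assumptions: only the four falling diphthongs (ai, ei, oi, au) are treated as single nuclei, the apostrophe is treated like a consonant, and in a consonant cluster between two vowels only the last consonant is moved to the following syllable. Real Lojban allows other valid splits, and names with syllabic consonants are not handled.

```python
VOWELS = set("aeiouy")
# Assumption: only the four falling diphthongs count as a single nucleus.
DIPHTHONGS = {"ai", "ei", "oi", "au"}


def syllabify(word):
    """Split one Lojban word (plain text) into a list of syllables."""
    word = word.lower()

    # Step 1: locate every syllable nucleus (a vowel or a falling diphthong).
    nuclei = []
    i = 0
    while i < len(word):
        if word[i] in VOWELS:
            end = i + 2 if word[i:i + 2] in DIPHTHONGS else i + 1
            nuclei.append((i, end))
            i = end
        else:
            i += 1

    if not nuclei:
        return [word]

    # Step 2: cut between consecutive nuclei. If consonants separate them,
    # the last consonant starts the next syllable; the rest stay behind.
    syllables = []
    start = 0
    for (_, end), (next_start, _) in zip(nuclei, nuclei[1:]):
        cut = next_start - 1 if next_start > end else next_start
        syllables.append(word[start:cut])
        start = cut
    syllables.append(word[start:])
    return syllables


if __name__ == "__main__":
    for w in ("kantu", "sevzi", "mi", "nei", "lojban"):
        print(w, syllabify(w))
```

Under these assumptions, `syllabify("kantu")` returns `['kan', 'tu']` and `syllabify("nei")` returns `['nei']`. The text side really is this small. The hard part is the audio.
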
This has nothing to do with how complex a language is, but with the way the phones (speech sounds) blend into each other in any language. You can't take the "n" of "kantu", the "e" of "sevzi", and the "i" of "mi", splice them together, and end up with "nei". In normal speech, the word "nei" consists of continuously changing frequencies, and you can't really tell where the "n" ends and the "e" begins, or where the "e" ends and the "i" begins.

Anyone interested in the topic of high-quality speech synthesis might want to take a look at http://tcts.fpms.ac.be/synthesis/mbrola.html.

>I mean, you can't
>teach the program all lujvo, or all fu'ivla, or even all cmavo
>combinations.

If I'm very bored, I just might! :)

-- 
Arnt Richard Johansen     | - Why do people speak English on the Internet?
http://people.fix.no/arj/ | - Have you heard of the "lowest common denominator"?
arj@fix.no                |