
[lojban-beginners] Re: More lojban audio



coi cizra

Yes, they could be analysed and used as the starting point for a text-to-speech system. This would entail chopping them up into several hundred pieces and doing some magic in Praat on each one. But there really isn't much point in doing that, because most of what we would need has already been written in one form or another, so we'd just be reinventing the wheel.

Speech, like many other problems, can be handled in two ways. There's the straight-through approach, where we write one colossus of a program that takes a document in and spits out natural-sounding speech. This is very unportable, however, and usually difficult to adjust. If we write a speech engine of this nature for English (for example), we have to start all over when we want one for Spanish or Lojban or whatever.

The other way to tackle the problem is called the modular approach. We write parts that do different things. There's a module that takes the written orthography of language X and converts it into some 1:1 phonetic representation like SAMPA or IPA or whatever. Then we have a module that analyses the structure of the sentences and clauses and produces the tonality (intonation) for the sentence. We could actually do without this module; we would just get much more mechanical-sounding output. Finally, we make a module that takes in IPA/SAMPA and a tonality and produces a sound file of the pronunciation.
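
To make the split concrete, here is a rough Python sketch of how the three modules could plug together. None of this comes from an existing engine: the names (Utterance, flat_prosody, falling_prosody, render, build_pipeline) are made up for illustration, and render is only a stand-in for a real back end that would write a sound file.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Utterance:
    phonemes: List[str]   # output of the orthography module
    pitch: List[float]    # one relative pitch value per phoneme


def flat_prosody(phonemes: List[str]) -> List[float]:
    """No intonation at all: the 'mechanical sounding' fallback."""
    return [1.0] * len(phonemes)


def falling_prosody(phonemes: List[str]) -> List[float]:
    """Toy intonation: pitch drops linearly by 20% over the utterance."""
    n = len(phonemes)
    if n <= 1:
        return [1.0] * n
    return [1.0 - 0.2 * i / (n - 1) for i in range(n)]


def render(utt: Utterance) -> List[Tuple[str, float]]:
    """Placeholder for the synthesis module (hypothetical).

    A real back end would turn this into audio; here we only pair
    each phoneme with its pitch so the data flow is visible.
    """
    return list(zip(utt.phonemes, utt.pitch))


def build_pipeline(
    to_phonemes: Callable[[str], List[str]],
    prosody: Callable[[List[str]], List[float]] = flat_prosody,
) -> Callable[[str], List[Tuple[str, float]]]:
    """Chain the modules; only to_phonemes is language-specific."""
    def run(text: str) -> List[Tuple[str, float]]:
        phonemes = to_phonemes(text)
        return render(Utterance(phonemes, prosody(phonemes)))
    return run
```

Swapping languages means passing a different to_phonemes; leaving out the intonation module just means falling back to flat_prosody.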

In the modular method, all that would need to be written is an orthography module for Lojban. Since Lojban doesn't formally define any tonality, we could use pretty much anything, although Lojban speakers probably just use the one from their first language.
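
Because Lojban spelling is so close to phonemic, the orthography module can be little more than a lookup table. The following is a minimal Python sketch under some loud assumptions: the function name lojban_to_ipa and the table are my own, spaces are simply dropped, capital letters (stress marking) are folded to lower case, and diphthongs, glides and stress are not handled at all.

```python
# Letters whose IPA symbol differs from the Lojban letter.
# Every other letter of the alphabet is passed through unchanged.
LOJBAN_TO_IPA = {
    "'": "h",   # apostrophe is an [h]-like sound
    ".": "ʔ",   # period is a pause / glottal stop
    ",": ".",   # comma is a syllable break
    "c": "ʃ",
    "j": "ʒ",
    "y": "ə",
    "e": "ɛ",
}

# The Lojban alphabet has no h, q or w.
LOJBAN_LETTERS = set("abcdefgijklmnoprstuvxyz',.")


def lojban_to_ipa(text):
    """Convert Lojban text to a list of IPA symbols, one per letter.

    Simplification: no diphthongs, no stress, whitespace is skipped.
    """
    phonemes = []
    for ch in text.lower():
        if ch.isspace():
            continue
        if ch not in LOJBAN_LETTERS:
            raise ValueError("not a Lojban letter: %r" % ch)
        phonemes.append(LOJBAN_TO_IPA.get(ch, ch))
    return phonemes


print("".join(lojban_to_ipa("coi cizra")))
```

The list of symbols this returns is what would be handed on to the intonation and synthesis modules of whatever existing engine is used.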

In short, there are many existing text-to-speech solutions already available. We are much better served making use of them than attempting to make a new one from the ground up for Lojban. Text to speech is easy -- the Commodore 64 had a text-to-speech program on it in the 80's. Quality natural text to speech is much harder.

mu'o mi'e .aleks.

On Jun 10, 2006, at 6:21 AM, elmo@haqq.pri.ee wrote:

> On 20:20 Fri 09 Jun, Alex Martini wrote:
> > Just finished recording a really long sample of the basic sounds of
> > Lojban, about 25 minutes in length. It is currently uploading to my
> Can these sounds be used in speech synthesis? If I'm not wrong, Festival
> and friends just concatenate digraphs (disounds?).
> cizra
> --
> GPG public key: http://ttu.masendav.net/~t040673/pubkey