From: "Stephen Pollei"
To: "Jonathan Duddington"
Cc: lojban-beginners@lojban.org
Date: Tue, 26 Aug 2008 08:20:53 -0700
Subject: [lojban-beginners] Re: espeak text to speech for lojban
In-Reply-To: <4fd4ef447bjonsd@jsd.clara.co.uk>

On 8/26/08, Jonathan Duddington wrote:
> On 26 Aug, Stephen Pollei wrote:
> > for jbo_list the fix was easy I simply deleted most of it.
> >
> > was all that was maybe needed. lojban should be a phonetic language
> > so all the other things probably was just causing issues.
>
> You need the entries in jbo_list for the consonant letters "b", "t",
> etc. If text contains a single consonant letter (as a one-letter
> word), it needs an added vowel to pronounce it. Otherwise (with some
> letters) you hear only a click.

Sure, but Lojban is supposed to be phonetic and AVI (audio-visual
isomorphic). If a writer didn't put a "y" after "b" or a "bu" after "a",
then it's not eSpeak's task to do so.

> > for jbo_rules I simply added pausing rules for things that begin with
> > vowels. dj and tc shouldn't need special rules.
>
> Translating "tc" to phonemes [t] + [S] is intelligible, but I expect
> that the single phoneme [tS] sounds better. Try it, and compare.

Yes, I didn't test it at all. I will have to have others test it as
well.

> > gi'V shouldn't need
> > extra help . l and r should need the extra stuff they have and n
> > doesn't change based on if g or k follows it. the pausing rules and
> > word boundary rules for lojban are probably complex enough that a
> > special front end should be created to split words and insert
> > mandatory pauses(".") where needed.
>
> Can you suggest a set of rules?

I can refer you to some rules; I'm in a hurry right now. I did notice
that adding a pause after words that end in a consonant is a great
improvement.

http://www.lojban.org/tiki/tiki-download_wiki_attachment.php?attId=195
  The Hills Are Alive With The Sounds Of Lojban
http://www.lojban.org/tiki/tiki-download_wiki_attachment.php?attId=194
  The Shape Of Words To Come: Lojban Morphology

Both are chapters from
http://www.lojban.org/tiki/tiki-index.php?page=The+Lojban+Reference+Grammar&bl

Also, overnight I thought that a pause after vocatives, and in a few
other places where cmene (names) show up, would help even if it
sometimes puts pauses where they aren't needed. Putting a pause after
la, lai, doi, coi, and co'o might be useful.

> > Also stress in lojban is based on brivla Vs cmavo Vs. cmene ,
>
> Sorry, but I'm not familiar with most of the Lojban technical terms
> such as "brivla".

Yes, it's in the word morphology chapter. Basically, unless a word has
at least two syllables, has a consonant cluster within its first 5
letters, and ends in a vowel, it's unstressed by default. However,
capital vowels (AEIOU) mark stressed vowels. I wouldn't worry about it
for now. Also, dotside says all cmene should have a pause before them.
Don't worry too much about that for now.

> > and if a capital letter is put in there.
> > special front end should probably do stress markings as well.
>
> > Truth be told I didn't compile or test any of these changes and I
> > wanted opinion from some who are more knowledgeable than me to comment
> > and test.
>
> Experiment. Make changes, compile the data, and listen to how it
> sounds.

Yes, I'm not done. I'll see how much I can do without putting in a
preprocessor or changing your C++ code. I will also need some help from
a few people who know Lojban better than me. I think I have a few ideas
though. It's still very much a WIP.

> For spoken language we need to pause in suitable places. For eSpeak
> English, this includes at commas, brackets, quotation marks, and before
> some words (usually conjunctions). Also some common function words
> should be unstressed so that the speech flows in a more natural way.

Yes, in Lojban only the period is a pause, and the language has its own
peculiar stress rules. A comma can be used to show syllable boundaries.
Brackets, quotes, etc. have words, not pauses, associated with them.

> Lojban doesn't use commas, but it should be possible to identify
> clauses, and therefore put pauses in the appropriate places.
> The rule for "gi'" + vowel is an attempt to identify an end-of-clause.

Yes, that might be helpful, even if the Lojban rules don't strictly
need them. I will ask the Lojban experts for their opinion on that.
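
The pause ideas floated in this message could be tried in a small front end before any eSpeak rules are touched. What follows is only a rough sketch under stated assumptions, not eSpeak code: the name `insert_pauses`, the whitespace tokenizing, and the choice to treat "y" as a vowel are all made up for illustration. It marks a pause (".") before words that begin with a vowel, after words that end in a consonant, and after la, lai, doi, coi, and co'o.

```python
# Hypothetical pause-marking front end; nothing here is part of eSpeak.
VOWELS = set("aeiouy")                  # assumption: "y" counts as a vowel here
CONSONANTS = set("bcdfgjklmnprstvxz")   # the Lojban consonant letters
PAUSE_AFTER = {"la", "lai", "doi", "coi", "co'o"}  # words named in this message


def insert_pauses(text: str) -> str:
    """Return text with '.' added where the rules above ask for a pause."""
    pieces = []
    for word in text.split():
        core = word.strip(".")          # drop pauses the writer already put in
        if not core:
            continue                    # a bare "." carries no word
        lower = core.lower()            # capitals mark stress, so compare in lowercase
        before = word.startswith(".") or lower[0] in VOWELS
        after = (
            word.endswith(".")
            or lower[-1] in CONSONANTS
            or lower in PAUSE_AFTER
        )
        pieces.append(("." if before else "") + core + ("." if after else ""))
    return " ".join(pieces)


if __name__ == "__main__":
    print(insert_pauses("la alis cu klama"))  # la. .alis. cu klama
    print(insert_pauses("coi djan"))          # coi. djan.
```

Pauses the writer already typed are kept, so running the function on its own output changes nothing. It can put two pauses next to each other ("la. .alis."), which is redundant but harmless for testing whether the extra pauses improve the speech.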