From: Chris Hammerschmidt
Date: Thu, 17 Jul 2008 18:27:01 +0200
To: lojban-list@lojban.org
Subject: [lojban] Re: Lojban Speech Recognition semester-project
Message-ID: <487F72D5.8010308@gmx.net>

Coi

I'm not sure if you are aware of http://jbobac.lojban.org/ which has some
examples with transcripts. A series of words (with transcripts) can be found
at http://allalone.org/cizra/ which is intended as a pronunciation guide.

mu'o mi'e .laxris

Nico Möller wrote:
> Random sentences are quite ok, we ourselves recorded some sentences from
> Alice, but send us whatever you have got, as long as we get a transcript of
> what was uttered it would be totally sufficient.
>
> I know that uncompressed audio files are quite big, but hey it's only 16bit
> mono and of course you can compress them using zip, 7z or whatever you like
> ;). I think then it should be no problem to send them via mail. Or you can
> use some free filehosting on the web and send us the links. Just be
> creative... If none of these methods is suitable just send them
> in a format (mp3, etc.) we can convert back into wavs...
>
> Thanks a lot for your help,
> Nico
>
> On Thu, Jul 17, 2008 at 12:36 PM, james riley wrote:
>
>> Random sentences okay or should they be part of a bigger prose? I could
>> churn out loads tomorrow (unless something happens), but I'm afk today to
>> help out at my uni. My pronunciation needs practise, but is mostly okay.
>> Also, wav is very big, how do you want us to send you loads of recordings in
>> wav?
>>
>> 2008/7/16 Nico Möller :
>>
>>> Hi guys,
>>> We have got a request and hopefully some of you are willing to help us. We
>>> are currently studying cognitive science at the University of Osnabrueck and
>>> participating in a course called "practical natural language processing",
>>> which is some kind of semester project in linguistics. Our group decided to
>>> deal with some speech recognition and because lojban has such nice phonetic
>>> features we chose it as our target language. Unfortunately we discovered
>>> that there is very little (usable) lojban audio data on the web, but we
>>> actually need huge amounts of it to feed our training algorithms. It would
>>> be really cool if some of you could actually send us some audio data we can
>>> work with. If you do so, please provide it in the following format:
>>>
>>> - 16bit mono, 16 kHz
>>> - preferably raw or wav data files
>>> - one sentence per audio file
>>> - a transcript text file containing one sentence per line + the name of
>>> the audio file in which the sentence was uttered
>>>
>>> Everybody who sends us applicable data will be mentioned by name in our
>>> final term paper, which will be published at the end of this month (you see,
>>> we really need the data quickly).
>>>
>>> Thanks a lot for your effort,
>>> Nico & Thorben

--
e'osai ko sarji la lojban. http://lojban.org
Please! Support Lojban.

To unsubscribe from this list, send mail to lojban-list-request@lojban.org
with the subject unsubscribe, or go to http://www.lojban.org/lsg2/, or if
you're really stuck, send mail to secretary@lojban.org for help.