Date: Sun, 7 Mar 2010 07:52:46 -0700
Message-ID: <702226df1003070652x78ae9ce0u5e6d59e806b94ef7@mail.gmail.com>
Subject: [lojban] Re: More espeak words, please.
From: Jonathan Jones
To: lojban-list@lojban.org

Okay, then it must be one of the sumti items. By my count there are 4876 items, which includes the gismu plus one sumti for each place of each gismu. For example, {klama} accounts for six items: {klama}, {lo klama}, {lo se klama}, {lo te klama}, {lo ve klama}, and {lo xe klama}.

Subtracting the 1342 gismu from that list leaves 3,534 gismu sumti. The "gismu_places-espeak.zip" file Dag uploaded to http://www.lojban.org/tiki/valsi%20Sound%20Files, which contains only gismu sumti, has 3,527 items, which means either 7 of the items are missing audio, or my list has 7 extra gismu sumti. I checked my list of the 4876 items, and it has exactly 1342 gismu, so the error definitely lies somewhere in the gismu sumti. My mistake.

Timos, would you be so kind as to run that comparison again, but this time against this list instead of the gismu list?

On Sun, Mar 7, 2010 at 6:53 AM, Timo Paulssen <timonator@perpetuum-immobile.de> wrote:

> moorkids@juno.com wrote:
> > If you have a list of just gismu you could separate them into different word
> > documents by first letter. Do a word count for each letter group and compare
> > it with a word count of just that letter gismu from a complete list (like the
> > one here: http://en.wiktionary.org/wiki/Index:Lojban/gismu ). Then narrow
> > down which gismu is missing alphabetically. (There's probably a much faster
> > way to do this but I don't know it)
>
> list all the gismu sound files without .mp3:
>  ls | grep '^.....\.mp3' | sed -e 's/.mp3//' > existing.txt
> list all the gismu from the gismu list (could've been done easier, i think)
>  egrep '^ [a-z]{5}' /usr/share/lojban/gismu.txt | sed -e 's/^ //' -e 's/ .*//'\
>  > allgismu.txt
> compare the lists:
>  diff -n existing.txt allgismu.txt
>
> no output. apparently every gismu is there:
>
> count the number of lines in every textfile:
>  wc -l *txt
>
>  1342 allgismu.txt
>  1342 existing.txt
>  2684 total
>
> mu'o mi'e timos

--
mu'o mi'e .aionys.

.i.a'o.e'e ko klama le bende pe denpa bu
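
For anyone repeating the check on the gismu sumti, the arithmetic and the list comparison could be sketched in shell roughly as follows. This is only a sketch: the function name missing_from and the file names expected.txt and existing.txt are placeholders, not anything from the zip file, and it assumes both lists hold one item per line with identical spelling.

```shell
#!/bin/sh
# Sketch only: names and the one-item-per-line layout are assumptions.

# Print every line of the first file that does not occur in the second.
missing_from() {
    a=$(mktemp) && b=$(mktemp) || return 1
    sort -u "$1" > "$a"      # comm needs sorted input
    sort -u "$2" > "$b"
    comm -23 "$a" "$b"       # keep only lines unique to the first file
    rm -f "$a" "$b"
}

# Counts quoted in the message: all items, the gismu, and the items in the zip.
expected_sumti=$((4876 - 1342))
shortfall=$((expected_sumti - 3527))
echo "expected gismu sumti: $expected_sumti, unaccounted for: $shortfall"
```

Calling `missing_from expected.txt existing.txt` would list items that have no sound file, and swapping the two arguments would list sound files that have no entry in the list, which covers both explanations for the difference of 7.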