
Re: [lojban] Re: More espeak words, please.



moorkids@juno.com wrote:
> If you have a list of just gismu you could separate them into different word
> documents by first letter.  Do a word count for each letter group and compare
> it with a word count of just that letter gismu from a complete list (like the
> one here: http://en.wiktionary.org/wiki/Index:Lojban/gismu ).  Then narrow
> down which gismu is missing alphabetically.  (There's probably a much faster
> way to do this but I don't know it)

list all the gismu sound files without .mp3:
  ls | grep '^.....\.mp3$' | sed -e 's/\.mp3$//' > existing.txt
list all the gismu from the gismu list (could've been done easier, i think)
  egrep '^ [a-z]{5}' /usr/share/lojban/gismu.txt | sed -e 's/^ //' -e 's/ .*//'\
  > allgismu.txt
compare the lists:
  diff -n existing.txt allgismu.txt

no output, so apparently every gismu is there.

as a cross-check, count the number of lines in each text file:
  wc -l *txt

 1342 allgismu.txt
 1342 existing.txt
 2684 total
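
if the two lists ever did differ, something like the following might name the missing gismu directly instead of showing a diff. this is only a sketch: missing_from is a made-up helper name, and the two sample files are tiny invented lists, not the real existing.txt and allgismu.txt. it assumes comm, sort and mktemp are available.

```shell
# hypothetical helper: print the lines of the second list that are
# absent from the first list
missing_from() {
  have_sorted=$(mktemp)
  want_sorted=$(mktemp)
  # comm needs sorted input; LC_ALL=C keeps the sort order predictable
  LC_ALL=C sort -u "$1" > "$have_sorted"
  LC_ALL=C sort -u "$2" > "$want_sorted"
  # -1 hides lines only in the first file, -3 hides lines in both,
  # so only the lines unique to the second file are printed
  LC_ALL=C comm -13 "$have_sorted" "$want_sorted"
  rm -f "$have_sorted" "$want_sorted"
}

# invented sample data, just to show the behaviour
printf 'bacru\nbadna\n' > sample_existing.txt
printf 'bacru\nbadna\nbadri\n' > sample_all.txt

missing_from sample_existing.txt sample_all.txt   # prints: badri
```

with the real files the call would presumably be "missing_from existing.txt allgismu.txt", which should print nothing when every gismu has a sound file.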

mu'o mi'e timos
