
Re: [lojban] Request for a full frequency list of all lojbanic words for an Android app.



A quick search found this little gem for counting word frequencies (it reads text on stdin):

tr -cs A-Za-z\' '\n' | tr A-Z a-z | sort | uniq -c | sort -k1,1nr -k2

Running it against the corpus gives the attached frequencies. However, don't use this frequency list as it stands: it includes many English words, abbreviations and authors' names. Ideally one would first clean the corpus of non-Lojban text and then run the script on the result.
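
For anyone who wants to try the cleaning step, here is a rough Python sketch of the same counting with one crude filter added. It is not what produced the attached list. The only Lojban-specific fact it relies on is that h, q and w are not letters of the standard Lojban alphabet, so any token containing one of them is dropped. The function names and the command-line handling are made up for this example, and plenty of English words (such as "do" or "me") will still pass the filter.

```python
import re
import sys
from collections import Counter

# h, q and w do not occur in standard Lojban orthography, so a token that
# contains one of them cannot be a Lojban word. This is an assumption-level
# heuristic: it removes some English noise, not all of it.
NON_LOJBAN_LETTERS = set("hqw")


def word_frequencies(text):
    """Count lowercased tokens made of ASCII letters and apostrophes,
    skipping tokens that contain a non-Lojban letter."""
    tokens = (t.lower() for t in re.findall(r"[A-Za-z']+", text))
    return Counter(t for t in tokens if not NON_LOJBAN_LETTERS & set(t))


def format_frequencies(counts):
    """Order by descending count, then alphabetically, roughly like
    `sort -k1,1nr -k2` in the shell pipeline."""
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [f"{n} {word}" for word, n in ordered]


# Hypothetical usage: python3 freq.py corpus.txt
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], encoding="utf-8", errors="replace") as f:
        print("\n".join(format_frequencies(word_frequencies(f.read()))))
```

A real cleanup would probably need more than this, for example checking each token against the gismu and cmavo lists, but the letter filter is cheap and already removes a fair amount of English.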

-- Ross

On 16 April 2013 17:36, la gleki <gleki.is.my.name@gmail.com> wrote:


On Monday, April 15, 2013 11:51:25 PM UTC+4, Robin Powell wrote:
On Fri, Apr 12, 2013 at 07:57:17AM -0700, la gleki wrote:
> peeps, i need ur help.
> we are gonna have Swype/Swipe feature for MultiLing android keyboard. I
> need a list of all lojbanic words + frequency of each.
> i know of a gismu frequency list. But it seems that not all gismu are there
> (less than 1342). What about cmavo, fu'ivla?
>
> Of course, most rare words can be given the lowest rating but what are the
> most frequent words?
> Can we rerun the algorithm to count all the occurrences of all words?

http://users.digitalkingdom.org/~rlpowell/hobbies/lojban/flashcards/?C=M;O=D
-- the _freq lists should have everything.

It should be pretty easy to regenerate this stuff with the latest
from http://corpus.lojban.org/ , but I am (as usual) not
volunteering.

Is there a script that can generate such lists?
 

-Robin

--
You received this message because you are subscribed to the Google Groups "lojban" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lojban+unsubscribe@googlegroups.com.
To post to this group, send email to lojban@googlegroups.com.
Visit this group at http://groups.google.com/group/lojban?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.
 
 


Attachment: freq.bz2
Description: BZip2 compressed data