[Wikichanges] Wiki page Word frequency lists changed by gleki
The page Word frequency lists was changed by gleki at 10:49 UTC
You can view the page by following this link:
http://www.lojban.org/tiki/Word%20frequency%20lists
You can view a diff back to the previous version by following this link:
http://www.lojban.org/tiki/tiki-pagehistory.php?page=Word%20frequency%20lists&compare=1&oldver=9&newver=10
***********************************************************
The changes in this version appear below, followed by the current full page text.
***********************************************************
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
@@ -Lines: 1-4 changed to +Lines: 1-4 @@
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- ! Main lists (all words including cmavo clusters)<br />*[http://www.lojban.org/tiki/tiki-download_wiki_attachment.php?attId=934|Full list]
+ ! Main lists<br />*[http://www.lojban.org/tiki/tiki-download_wiki_attachment.php?attId=934|Full list (all words including cmavo clusters)]
*((Word Frequency Lists: gismu))
!How to generate lists yourself
***********************************************************
The new page content follows below.
***********************************************************
! Main lists
*[http://www.lojban.org/tiki/tiki-download_wiki_attachment.php?attId=934|Full list (all words including cmavo clusters)]
*((Word Frequency Lists: gismu))
!How to generate lists yourself
* See [https://groups.google.com/d/topic/lojban/KTPslnix3mQ/discussion|discussion] for details
* [http://www.lojban.org/corpus/corpus.txt.bz2|The Lojbanic corpus as a bzip2-compressed text file (.txt.bz2)].
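A minimal sketch of how a frequency list could be produced from the corpus linked above. This is not necessarily the method described in the linked discussion. It assumes the corpus has been downloaded to a local file (the name corpus.txt.bz2 is taken from the URL), that it is bzip2-compressed plain text, and that a "word" can be approximated as a run of lowercase letters and apostrophes. The function names are hypothetical, and lowercasing discards the capital letters that Lojban uses to mark stress.

```python
import bz2
import re
from collections import Counter

# Assumption: a word starts with a letter and may contain apostrophes.
# Periods (pauses), digits and other punctuation are treated as separators.
WORD_RE = re.compile(r"[a-z][a-z']*")


def count_words(lines):
    """Count word occurrences in an iterable of text lines."""
    counts = Counter()
    for line in lines:
        # Lowercasing is a simplification: it merges stress-marked spellings.
        counts.update(WORD_RE.findall(line.lower()))
    return counts


def count_corpus(path):
    """Count words in a bzip2-compressed text file (hypothetical helper)."""
    with bz2.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return count_words(handle)


def format_list(counts):
    """Return 'count<TAB>word' lines, most frequent first, ties alphabetical."""
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [f"{count}\t{word}" for word, count in ordered]


# Usage (assumes corpus.txt.bz2 is in the current directory):
#     for entry in format_list(count_corpus("corpus.txt.bz2")):
#         print(entry)
```

This counts single words only. Splitting or grouping cmavo clusters, as in the full list above, would need a morphological parser and is outside this sketch.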
! Older stuff
* Older word frequencies can be found [http://www.lojban.org/files/roadmap.html#draft-dictionary_working|here]
* {file name=line-templates-by-frequency.txt showdesc=1} This is a sorted list of "sentence templates" excerpted from IRC. It shows which sequences of selma'o/word types are most common.
!! ((Robin Lee Powell))'s lists
[http://teddyb.org/~rlpowell/hobbies/lojban/flashcards/big_list|gismu and cmavo frequency-ordered word list], based on Lojban IRC, Alice, and a few other large texts. There is also a [http://teddyb.org/~rlpowell/hobbies/lojban/flashcards/|large selection of intermediary files], including pure frequency lists.
!! Rob Speer's lists
The following is about Rob Speer's frequency lists, which have
fallen off the 'net. Some of them have been recovered and attached
here.
These are the word frequency lists as of 2003/4/30, stored on a separate server.
These frequency lists are drawn from a corpus containing the contents of the lojban.org/texts directory, most of this Wiki's ((texts in Lojban)), as many ((IRC)) logs as I could find, the texts on ((CVS)), and a large portion of the ((jbosnu)) archives. I spent some time weeding out most of the English text, and tried to avoid picking up metalinguistic discussion (a word frequency list based on the main mailing list showed that ((lujvo)) is one of the most commonly used words).
* {file name=freq_gismu.txt showdesc=1}
* {file name=freq_cmavo21.txt showdesc=1}
* BROKEN LINK: [http://takeneggs.com/lojban/compounds.txt|cmavo compounds]
* BROKEN LINK: [http://takeneggs.com/lojban/lujvo_freq.txt|lujvo] (updated 2003/7/12; non-lujvo removed; malformed almost-lujvo marked with *)
* BROKEN LINK: [http://takeneggs.com/lojban/fuhivla_freq.txt|fu'ivla]
* BROKEN LINK: [http://takeneggs.com/lojban/cmene_freq.txt|cmene]
mi'e ((rab.spir))
_______________________________________________
Wikichanges mailing list
Wikichanges@lojban.org
http://mail.lojban.org/mailman/listinfo/wikichanges