From: Bob LeChevalier
Date: Sat, 01 Nov 2008 20:28:42 -0400
To: lojban-list@lojban.org
Subject: [lojban] Re: Sources for luj1999?
Message-ID: <490CF43A.9030204@lojban.org>
References: <20081101214236.GI2447@nvg.org>
In-Reply-To: <20081101214236.GI2447@nvg.org>

Arnt Richard Johansen wrote:
> http://www.lojban.org/publications/draft-dictionary/Working/luj1999.ZIP
>
> This file contains lujvo that have been automatically excerpted from texts, semi-automatically converted into their canonical forms. It also contains frequency counts of these words.
>
> What I would like to know is which source texts have been used, and if they are available somewhere.
>
> To take a specific example, consider this line:
>
> (2) cevyspe god+married canonical form=ceispe
>
> This apparently means that the word "cevyspe" was used two times in the corpus. But a web search turns up nothing for "cevyspe", save an older word frequency list:
>
> http://www.lojban.org/publications/wordlists/frequencies2.txt
>
> What do I need to have to make sure that I have the context for every word that occurs in luj1999.zip?

You can ask lojbab, who MAY be able to find it in his archives, since he was the one who created the list %^)

In this case, the word was actually used only once, but it appeared in two different Lojban List messages reporting the results of an early "phone game".

These are the two message headers:

From bennetto-jack@CS.YALE.EDU Thu Sep 5 17:05:22 1991
Date: Thu, 5 Sep 91 16:42:03 EDT
From: Jack Bennetto
To: lojbab@grebyn.com
Subject: lojbanic telephone

Date: Wed, 13 Nov 1991 20:22:39 EST
Sender: Lojban list
From: Jack Bennetto
Subject: lojbanic telephone
X-To: lojban@cuvmb.cc.columbia.edu
To: Bob LeChevalier

Here is the usage:

> Sentence 4:
>
> Each day Sister Margaret asked who had been able to do the
> homework.
>
> ca ra le pu djedi la margaret. poi cevyspe cu te preti
> loi (?) nu ma snada danfu fo le zdatelcli
>
> Yesterday, Sister Margaret asked, "What are the secrets to successful
> housekeeping?"

It appears that "cevyspe" is intended to be the titular address for a nun. Possibly not the most obvious meaning to someone unfamiliar with the culture.

lojbab
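
For anyone who wants to check other entries the same way, the entry quoted above, "(2) cevyspe god+married canonical form=ceispe", can be split into its parts mechanically. The following is only a minimal sketch: the field layout (count in parentheses, the lujvo, a gloss, then an optional "canonical form=" field, all separated by whitespace) is an assumption inferred from that single line, and the names `LujvoEntry` and `parse_entry` are hypothetical, not part of any existing tool. Note also that, as in the cevyspe case, the count may reflect repeated copies of one text rather than independent uses.

```python
import re
from typing import NamedTuple, Optional


class LujvoEntry(NamedTuple):
    """One entry of the list (hypothetical structure, for illustration)."""
    count: int                 # frequency count shown in parentheses
    lujvo: str                 # the word as it appeared in the text
    gloss: str                 # component gloss, e.g. "god+married"
    canonical: Optional[str]   # canonical form, if the line gives one


# Assumed layout: "(COUNT) LUJVO GLOSS canonical form=CANONICAL"
# The real file may use different spacing or extra fields.
ENTRY_RE = re.compile(
    r"^\((?P<count>\d+)\)\s+(?P<lujvo>\S+)\s+(?P<gloss>.*?)"
    r"(?:\s+canonical form=(?P<canonical>\S+))?\s*$"
)


def parse_entry(line: str) -> Optional[LujvoEntry]:
    """Parse one line; return None if it does not match the assumed layout."""
    m = ENTRY_RE.match(line.strip())
    if m is None:
        return None
    return LujvoEntry(
        count=int(m["count"]),
        lujvo=m["lujvo"],
        gloss=m["gloss"].strip(),
        canonical=m["canonical"],
    )


entry = parse_entry("(2) cevyspe god+married canonical form=ceispe")
print(entry)
```

Lines that do not fit the assumed layout come back as None rather than raising, so a pass over the whole file can collect them for manual inspection.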