From jay.kominek@colorado.edu Tue Apr 23 19:32:31 2002
From: Jay Kominek
Date: Tue, 23 Apr 2002 20:32:27 -0600 (MDT)
To: lojban@yahoogroups.com
Subject: Re: [lojban] cmavo frequency list
In-Reply-To: <20020424002708.GA3992@twcny.rr.com>

On Tue, 23 Apr 2002, Rob Speer wrote:

> I seem to remember that there is so far no accurate list of the
> frequencies with which each cmavo is used.

Wee

> So I wrote a script which would search Lojban text for cmavo, even in
> compounds, and count up the frequency for each one.

Out of curiosity, are you using jbofi'e or vlatai or something along
those lines to handle the lexing? And have you considered trying to
include the IRC channel logs?

> Another script found the 121 cmavo which were not used anywhere. Some of
> these were expected (lau) while others were quite surprising that they
> have gone unused (ro'e). And of course most of the MEX words are in
> there, but they are important nonetheless.

I'd like to point out (for what little it is worth) that I've used the
following:

ke'e ko'o ci'i mo'a ro'e ro'o

- Jay Kominek
  Plus ça change, plus c'est la même chose