From: Michal Wallace
Date: Fri, 13 Jul 2001 23:19:43 -0500 (CDT)
To: lojban list
Subject: columns 158-164

coi rodo

I'm looking at the gismu list, and I notice two columns of codes right after the English definitions and before the cross-references. What do these mean?

The first one almost looks like some sort of grouping: blanu, xunre, and narju all share the code 1a.

The second one: I thought I heard something about word frequency? I just wrote a little program to sort the list by that number.
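For anyone who wants to try the same thing, here is a minimal sketch of such a sort. It is not the original program: the function names and the column slices are hypothetical, and the sample records use a made-up fixed-width layout rather than the real offsets in the gismu list, so the slices would need adjusting to the actual file.

```python
def parse_line(line, word=slice(0, 5), code=slice(6, 9), freq=slice(10, 13)):
    """Cut one fixed-width record into (word, code, frequency).

    The default slices are assumptions for the toy layout below,
    not the real column positions in the gismu list.
    """
    digits = line[freq].strip()
    # Treat a blank or non-numeric frequency field as zero.
    return line[word], line[code].strip(), int(digits) if digits.isdigit() else 0


def sort_by_frequency(lines, **columns):
    """Return records ordered from the highest to the lowest number."""
    records = [parse_line(line, **columns) for line in lines if line.strip()]
    return sorted(records, key=lambda record: record[2], reverse=True)


# Toy records reusing values quoted in this message, in a made-up layout.
sample = [
    "citka 5c  320",
    "gluta ao    0",
    "cusku 1h  872",
    "klama 1g1 399",
]
ranked = sort_by_frequency(sample)
for record in ranked:
    print(record)
```

Passing the slices as keyword arguments keeps the parsing separate from the column positions, so the same code can be pointed at whichever columns turn out to hold the two codes.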
The top comes out like:

    ('cusku', 'express ', '1h ', '872')
    ('tanru', 'phrase compoun', '1b ', '776')
    ('prenu', 'person ', '1k ', '632')
    ('gismu', 'root word ', '1b ', '554')
    ('djica', 'desire ', '3l ', '500')
    ('lujvo', 'affix compound', '1b ', '428')
    ('diklo', 'local ', '5d ', '426')
    ('klama', 'come ', '1g1', '399')
    ('bacru', 'utter ', '1h ', '386')
    ('djuno', 'know ', '1h ', '375')
    ('sumti', 'argument ', '1b2', '373')
    ('drata', 'other ', '2g ', '351')
    ('kumfa', 'room ', '2k ', '346')
    ('tavla', 'talk ', '1h ', '338')
    ('nanmu', 'man ', '1k ', '332')
    ('cmalu', 'small ', '1e ', '326')
    ('citka', 'eat ', '5c ', '320')
    ('barda', 'big ', '1e ', '318')

I find it hard to believe tanru is a more common word than citka or barda, but these do seem to be "simple" lojban words.. But then again, the other end came out like:

    ('gluta', 'glove ', 'ao ', ' 0')
    ('pambe', 'pump ', 'a ', ' 0')
    ('kanji', 'calculate ', '7e ', ' 0')
    ('barja', 'bar ', 'ap ', ' 0')
    ('sigja', 'cigar ', 'a ', ' 0')
    ('xatsi', '1E-18 ', 'ae ', ' 0')
    ('petso', '1E15 ', 'ae ', ' 0')
    ('fanri', 'factory ', '8c ', ' 0')
    ('barna', 'mark ', 'a ', ' 0')
    ('tsina', 'stage ', '5g ', ' 0')

Which definitely seem less common (or more culture-specific).

Am I reading these two codes right? Where did they come from?

Cheers,

- Michal

----------------------------------------------------------------------
let me host you! http://www.sabren.com
me: http://www.sabren.net
----------------------------------------------------------------------