From: Robert LeChevalier
Reply-To: lojbab@lojban.org
Date: Tue, 20 May 2003 22:48:33 -0400
To: lojban-list@lojban.org
Subject: [lojban] Re: Lojban wordlists
In-Reply-To: <871xytcpv6.fsf@gmx.net>

At 10:41 PM 5/20/03 +0200, Arne Koewing wrote:
>As I'm new to this list let me introduce myself:
>I study computer science at the university of Oldenburg
>in north Germany (so I believe my English isn't best ;-),
>I've just started learning lojban and in parallel i will try out
>lojban as an knowledge representation language (part of).
>I want to write a parser for the wordlists as a first step.
>
>Does anybody has a description of the lojban wordlists ?
>(gismu.txt,...)
>
>my observations for gismu.txt:
>+-----------+-----+------+
>|type       |start|length|
>+-----------+-----+------+
>|gismu      |0    |5     |
>+-----------+-----+------+
>|rafsi1     |6    |3     |
>+-----------+-----+------+
>|rafsi2     |10   |3     |
>+-----------+-----+------+
>|rafsi3     |14   |3-4   |
>+-----------+-----+------+
>|english    |19   |      |
>+-----------+-----+------+
>|"clue"     |40   |      |
>+-----------+-----+------+
>|description|61   |      |
>+-----------+-----+------+
>|???        |158  |2     |

This field dates from the original Lojban textbook: it is the lesson
number and subgroup in which that word was to be introduced. Words were
assigned to lessons 1-9 and to a lesson subgroup based on semantics.
Unassigned words were in lesson "a", with or without a semantic
subgrouping. Cowan's rewrite of the first 6 lessons of the draft
textbook into 22, which are found on www.lojban.org, eliminated the
tracking of words to lessons, but the gismu list was never republished
after that point.

>+-----------+-----+------+
>|???        |160  |4     |

This number is the frequency of usage of the word in my then-corpus
(1991 or 1992, I think), which included English-language texts about
Lojban, so that some words like gismu and sumti were very high. The
idea was that people might want to learn first the words that they were
more likely to run into in discussion or usage of Lojban.

Both of these fields were kept in fixed columns to allow sorting on
them, possibly for use in LogFlash, but they were also presumed to be
useful for other purposes.

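Since the question was about writing a parser, here is a minimal sketch of how the fixed columns could be sliced in Python. The start offsets are copied from the quoted table (including its last row, the free-text info at 168) and are one reader's observations, not a published format, so check them against the actual gismu.txt before relying on them. The field names "lesson" and "frequency" and the helper make_line are labels invented for this illustration.

```python
# Start offsets (0-based) taken from the quoted table; unverified
# against the real file. "lesson" and "frequency" are illustrative
# names for the two fields explained above.
COLUMNS = [
    ("gismu", 0), ("rafsi1", 6), ("rafsi2", 10), ("rafsi3", 14),
    ("english", 19), ("clue", 40), ("description", 61),
    ("lesson", 158), ("frequency", 160), ("info", 168),
]


def parse_gismu_line(line):
    """Split one fixed-width line into a dict of stripped fields."""
    line = line.rstrip("\n")
    # Each field runs up to the start of the next one; the last field
    # runs to the end of the line.
    ends = [start for _, start in COLUMNS[1:]] + [None]
    record = {}
    for (name, start), end in zip(COLUMNS, ends):
        record[name] = line[start:end].strip()
    # The frequency is a count; use None when it is blank or not numeric.
    freq = record["frequency"]
    record["frequency"] = int(freq) if freq.isdigit() else None
    return record


def make_line(**fields):
    """Build a synthetic fixed-width line for testing (not real data)."""
    line = ""
    for name, start in COLUMNS:
        line = line.ljust(start) + fields.get(name, "")
    return line


if __name__ == "__main__":
    # The lesson code, frequency and info below are made up.
    sample = make_line(gismu="klama", rafsi1="kla", english="come",
                       lesson="1a", frequency="42", info="made-up note")
    print(parse_gismu_line(sample))
```

Slicing up to the next column's start, rather than hard-coding lengths, sidesteps the uncertain "3-4" length noted for rafsi3 and tolerates short lines, where missing fields simply come back empty.
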
>+-----------+-----+------+
>|info...    |168  |      |
>+-----------+-----+------+

Included in this info are all the "cf's" which are crosslinks from a
Roget-like analysis of semantic groupings of the gismu, I think
originally done by Veijo Vilva, though I modified it heavily.

--
lojbab                                  lojbab@lojban.org
Bob LeChevalier, President, The Logical Language Group, Inc.
2904 Beau Lane, Fairfax VA 22031-1303 USA        703-385-0273
Artificial language Loglan/Lojban:  http://www.lojban.org