Message-Id: <m0qRYjI-000023C@xiron.pc.helsinki.fi>
Date:         Sat, 23 Jul 1994 00:29:59 -0400
Reply-To:     Logical Language Group <lojbab@ACCESS.DIGEX.NET>
Sender:       Lojban list <LOJBAN%CUVMB.bitnet@FINHUTC.hut.fi>
From:         Logical Language Group <lojbab@ACCESS.DIGEX.NET>
Subject:      Re: lojban thesaurus query
To:           Veijo Vilva <veion@XIRON.PC.HELSINKI.FI>
Content-Length: 1851
Lines: 34

JW> That's what the English-Lojban section of the dictionary
JW> is going to be, isn't it?  In effect a cross reference that
JW> catches every use of an English word in describing a place
JW> in a Lojban predicate, and lists them by English word.

Indeed, although we are editing the file generated by the automatic routines
that produced the current files.  One would not want 1491 entries for "is"
(rather useless), nor the slightly more useful 438 entries for "in" (but we
can't drop them all - we need nenri among others), 166 for 'with' (kansa),
107 for 'material'  (most are probably somewhat relevant, but a combined
entry would be worth more than a 107-line repetition of complete place
structures as the auto-process would generate), or 52 for 'reflects'  (now
we start getting into tougher judgement calls).

I am right now editing the first English keyword file, with 10000 lines/entries
corresponding to all usages with no more than 20 occurrences in the auto-process
file.  The more voluminous other entries will be weeded vigorously and combined
as needed to keep them short enough to be used.  Since LogFest (when I started)
I have done around 1840 lines of 9500 total, and am doing more than 500 lines
per day and accelerating as I get used to my system.  I am hoping to have this
first list up on the ftp site for review/usage by the end of the month, and it
will probably be close to its current (1.9 Meg) length.
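
For anyone curious how such a keyword file comes about, here is a rough sketch
of the idea in Python.  It is not the actual auto-process routines: the function
names, the tokenizing rule, and the three abbreviated place structures are all
assumptions made up for illustration.  It indexes every English word found in a
place structure, then splits the keywords into those at or under an occurrence
limit (20 for the first file) and the voluminous rest that need hand weeding.

```python
import re
from collections import defaultdict

# Hypothetical sample data: abbreviated glosses, for illustration only.
PLACE_STRUCTURES = {
    "nenri": "x1 is in/inside/within x2",
    "kansa": "x1 is with/accompanies x2 in state/condition/enterprise x3",
    "klama": "x1 comes/goes to destination x2 from origin x3 via route x4",
}


def build_index(place_structures):
    """Map each English word to the Lojban words whose place structure uses it."""
    index = defaultdict(list)
    for lojban_word, gloss in place_structures.items():
        # Drop the place labels (x1, x2, ...) before picking out English words.
        text = re.sub(r"\bx\d+\b", " ", gloss.lower())
        tokens = re.findall(r"[a-z']+", text)
        # dict.fromkeys removes repeats within one gloss but keeps word order.
        for token in dict.fromkeys(tokens):
            index[token].append(lojban_word)
    return dict(index)


def split_by_frequency(index, limit=20):
    """Separate keywords with at most `limit` entries from the more frequent ones."""
    rare = {k: v for k, v in index.items() if len(v) <= limit}
    frequent = {k: v for k, v in index.items() if len(v) > limit}
    return rare, frequent


if __name__ == "__main__":
    index = build_index(PLACE_STRUCTURES)
    # A limit of 1 is used here only because the sample data is tiny.
    rare, frequent = split_by_frequency(index, limit=1)
    for keyword in sorted(frequent):
        print(keyword, "->", ", ".join(frequent[keyword]))
```

With real data the frequent group is where words like "is" and "in" would land,
and those are the entries that get combined or pruned by hand.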

The cmavo list is somewhat smaller (600K raw) and the lujvo list E-entries are
very large (4.4Meg unedited, covering some 3000 Lojban lujvo; this will likely
be cut to 2-3Meg).

I'll let this serve as an ad hoc dictionary progress report, and also let
people know the scope of the dictionary, by the size of these files that
we are playing with.

Should be a Good Thing, from the way it is shaping up so far .o'acai

lojbab