Return-Path:
Message-Id: <9112031855.AA19229@relay1.UU.NET>
Date: Tue Dec 3 16:54:49 1991
Reply-To: Logical Language Group
Sender: Lojban list
From: Logical Language Group
Subject: Lojban gismu etymologies - anyone interested?
X-To: lojban@cuvmb.cc.columbia.edu
X-Cc: doug@netcom.com
To: John Cowan, Eric Raymond, Eric Tiedemann
Status: RO
X-From-Space-Date: Tue Dec 3 16:54:49 1991
X-From-Space-Address: cbmvax!uunet!CUVMA.BITNET!LOJBAN

I have just finished the 2nd and nastiest step in building a record of the etymologies of the Lojban gismu. This consisted of going through some 50 megabytes of output files to determine the runs that actually generated the words, and the source data, in Lojbanized form, for each of the runs.

The resulting file is the first one that is really usable for tracking etymologies. Its main limitation is that the 6 source language entries for each word are in Lojbanized form. You would therefore probably need to know the language in question to backtrack and figure out the actual Chinese or Russian word used, and you would also need to recognize some (probably fairly obvious) conventions we used in Lojbanizing, like dropping some declension suffixes to get the important part of the root.

I've worked about 2 years off and on to get this far. The last step, getting actual source words into the file, will probably take several more years unless a local volunteer takes it on, since you have to go into the one-of-a-kind raw data notebook and hand enter all the words.

On the other hand, I want to make the new etymology list available if anyone thinks they can use it. On disk, the step 1 file is 180K, consisting of all the gismu that were selected, as well as choices rejected because of word conflicts or lack of rafsi, and words that we dropped from the gismu list after doing the data runs. The step 2 file, with the etymology for each of the words that was actually chosen, is 280K. (Both files compress significantly.)
Printed, the files would be some 40 pages and 80 pages respectively, and we'd have to charge 10c/page or more. There will probably have to be some demand shown from the community if these files are to be made available on PLS. If not, we could still send the data on floppies, but I can't afford to mail these things individually to people who request them. So let me know if you are interested.

lojbab@grebyn.com