Return-Path: <@FINHUTC.HUT.FI:LOJBAN@CUVMB.BITNET>
Received: from FINHUTC.hut.fi by xiron.pc.helsinki.fi with smtp (Linux Smail3.1.28.1 #1) id m0qqYd7-00005XC; Fri, 30 Sep 94 05:27 EET
Message-Id:
Received: from FINHUTC.HUT.FI by FINHUTC.hut.fi (IBM VM SMTP V2R2) with BSMTP id 2283; Fri, 30 Sep 94 05:27:41 EET
Received: from SEARN.SUNET.SE (NJE origin MAILER@SEARN) by FINHUTC.HUT.FI (LMail V1.1d/1.7f) with BSMTP id 2280; Fri, 30 Sep 1994 05:27:41 +0200
Received: from SEARN.SUNET.SE (NJE origin LISTSERV@SEARN) by SEARN.SUNET.SE (LMail V1.2a/1.8a) with BSMTP id 4284; Fri, 30 Sep 1994 04:24:48 +0100
Date: Thu, 29 Sep 1994 23:25:26 -0400
Reply-To: Logical Language Group
Sender: Lojban list
From: Logical Language Group
Subject: Re: The lujvo-making algorithm
X-To: rauch@CS.YALE.EDU
X-cc: lojban@cuvmb.cc.columbia.edu
To: Veijo Vilva
Content-Length: 2303
Lines: 40

ER>Is a lujvo that uses the three-letter rafsi whenever possible always the
ER>best lujvo? I'm thinking specifically of cases where the only three-letter
ER>rafsi has as many syllables (two) as the four- or five-letter one. The two
ER>qualities I think you want to maximize in a lujvo are (1) shortness,
ER>measured in number of syllables, and (2) recognizability, that is,
ER>similarity to the corresponding tanru - in that order. (This is when you're
ER>not writing poetry or otherwise paying much attention to the sound.)
ER>
ER>
ER>Should sa'urmi'e (sarcu minde) not be sarcyminde, for example? vi'ecpe
ER>(vitke cpedu) vitkycpe? You get more similarity without increasing the time
ER>it takes to say them (or by increasing it a tiny amount).
ER>
ER>Then there's the issue of simple abstraction. nunklama seems to be
ER>preferred over nunkla when not compounded with something else, why? I think
ER>there's a good reason. kamblanu is better than kambla, even though you'd
ER>save two phonemes and a syllable to boot.
The policy is, of course, that "sa'urmi'e" and "sarcyminde", as well as "nunkla" and "nunklama", are (each pair) the same word - just spelled and pronounced differently. Which word-form you choose is thus a kind of dialect choice.

The dictionary will have each lujvo in the "best" form as determined by the lujvo scoring algorithm. If this is a shortened form, such as "sa'urmi'e" or "nunkla", then the definition will appear with that entry, and there will be a second entry for the fully unreduced form ("sarcyminde" and "nunklama") pointing to the short form but giving no definition. Thus, if you see, or invent, a Lojban word, you can know that there is a way to find it in the dictionary if it is there - even if it is there in a different form - by breaking it apart into rafsi and using them to look up the unreduced form, which will either have the definition or point you to the shorter word-form that has the definition.

I do not at this point intend to get into subjective judgements such as you have suggested, where "nunklama" might be 'better' than "nunkla". Thus far we have mostly native English speakers, and I suspect that decisions as to priorities would be biased. The current algorithm simply looks for the shortest word in all cases.

lojbab
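
The lookup path described above (any word-form -> its components -> the unreduced entry -> the entry holding the definition) can be sketched roughly as follows. This is only an illustration under stated assumptions: the tables are hand-built from the words in this thread, the names `COMPONENTS`, `UNREDUCED`, `DICTIONARY`, `lookup` and `shortest_form` are hypothetical, the definitions are placeholders, and "shortest" is measured here by plain character count, which may not match what the real scoring algorithm does.

```python
from typing import Optional, Sequence, Tuple

# Hand-built toy table (assumption): word-form -> the tanru components its
# rafsi stand for. A real tool would split the word into rafsi itself.
COMPONENTS = {
    "sa'urmi'e": ("sarcu", "minde"),
    "sarcyminde": ("sarcu", "minde"),
    "nunkla": ("nu", "klama"),
    "nunklama": ("nu", "klama"),
}

# Fully unreduced form for each component sequence (toy data).
UNREDUCED = {
    ("sarcu", "minde"): "sarcyminde",
    ("nu", "klama"): "nunklama",
}

# Toy dictionary: the "best" form carries the definition, the unreduced
# form only points to it. Definition texts are placeholders, not real entries.
DICTIONARY = {
    "sa'urmi'e": {"definition": "(definition goes here)"},
    "sarcyminde": {"see": "sa'urmi'e"},
    "nunkla": {"definition": "(definition goes here)"},
    "nunklama": {"see": "nunkla"},
}


def lookup(word: str) -> Optional[Tuple[str, str]]:
    """Return (headword, definition) for any known form of a lujvo, else None."""
    components = COMPONENTS.get(word)
    if components is None:
        return None
    unreduced = UNREDUCED.get(components)
    entry = DICTIONARY.get(unreduced)
    if entry is None:
        return None
    # The unreduced entry either holds the definition or points to the short form.
    headword = entry.get("see", unreduced)
    return headword, DICTIONARY[headword]["definition"]


def shortest_form(candidates: Sequence[str]) -> str:
    """Pick the shortest candidate by character count (a simplifying assumption)."""
    return min(candidates, key=len)


print(lookup("nunklama"))
print(shortest_form(["sarcyminde", "sa'urmi'e"]))
```

Whichever form you start from, both spellings of a pair resolve to the same headword, which is the point of keeping a pointer entry for the unreduced form.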