From lojbab Thu Sep 29 23:25:34 1994
Received: from access4.digex.net by nfs1.digex.net with SMTP id AA15096
  (5.67b8/IDA-1.5 for ); Thu, 29 Sep 1994 23:25:31 -0400
Received: by access4.digex.net id AA02005
  (5.67b8/IDA-1.5 for lojbab); Thu, 29 Sep 1994 23:25:26 -0400
Date: Thu, 29 Sep 1994 23:25:26 -0400
From: Logical Language Group
Message-Id: <199409300325.AA02005@access4.digex.net>
To: rauch@CS.YALE.EDU
Subject: Re: The lujvo-making algorithm
Cc: lojbab@access.digex.net, lojban@cuvmb.cc.columbia.edu
Status: RO

ER>Is a lujvo that uses the three-letter rafsi whenever possible always the
ER>best lujvo? I'm thinking specifically of cases where the only three-letter
ER>rafsi has as many syllables (two) as the four- or five-letter one. The two
ER>qualities I think you want to maximize in a lujvo are (1) shortness,
ER>measured in number of syllables, and (2) recognizability, that is,
ER>similarity to the corresponding tanru - in that order. (This is when you're
ER>not writing poetry or otherwise paying much attention to the sound.)
ER>
ER>
ER>Should sa'urmi'e (sarcu minde) not be sarcyminde, for example? vi'ecpe
ER>(vitke cpedu) vitkycpe? You get more similarity without increasing the time
ER>it takes to say them (or by increasing it a tiny amount).
ER>
ER>Then there's the issue of simple abstraction. nunklama seems to be
ER>preferred over nunkla when not compounded with something else, why? I think
ER>there's a good reason. kamblanu is better than kambla, even though you'd
ER>save two phonemes and a syllable to boot.

The policy is, of course, that "sa'urmi'e" and "sarcyminde", as well as
"nunkla" and "nunklama" are (each pair) the same identical word - just
spelled/pronounced differently. Which word-form you choose is thus a kind
of dialect-choice. The dictionary will have each lujvo in the "best" form
as determined by the lujvo scoring algorithm.
If this is a shortened form, such as "sa'urmi'e" or "nunkla", then the
definition will appear with this entry, but there will be a second entry
for the fully unreduced form ("sarcyminde" and "nunklama") pointing to the
short form, but giving no definition, etc. Thus, if you see, or invent, a
Lojban word, you can know that there is a way to find it in the dictionary
if it is there - even if it is there in a different form - by breaking it
apart into rafsi and using that to look up the unreduced form, which will
either have the definition or point you to the shorter word-form that has
the definition.

I do not at this point intend to get into subjective judgements such as
you have suggested, where "nunklama" might be 'better' than "nunkla". Thus
far we have mostly native English speakers, and I suspect that decisions as
to priorities would be biased. The current algorithm simply looks for the
shortest word in all cases.

lojbab
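
The two-entry dictionary scheme described above can be sketched in Python. This is only an illustration under assumptions: the ENTRIES table, the lookup function, and the placeholder definition strings are hypothetical and are not the actual dictionary format. Splitting an arbitrary word into rafsi and rebuilding its unreduced form is assumed to have been done by hand already.

```python
# Hypothetical dictionary table (an assumption, not the real dictionary format).
# Each key is a lujvo form. The value is either
#   ("def", text)  - this form carries the definition, or
#   ("see", form)  - this form only points at the form that carries it.
ENTRIES = {
    "sa'urmi'e": ("def", "<definition placeholder for sarcu minde>"),
    "sarcyminde": ("see", "sa'urmi'e"),
    "nunkla": ("def", "<definition placeholder for nunkla/nunklama>"),
    "nunklama": ("see", "nunkla"),
}


def lookup(form):
    """Return (defining_form, definition), or None if the form is not listed.

    A pointer entry is followed once, to the form that holds the definition.
    """
    entry = ENTRIES.get(form)
    if entry is None:
        return None
    kind, value = entry
    if kind == "see":
        target = value
        target_entry = ENTRIES.get(target)
        if target_entry is None or target_entry[0] != "def":
            return None  # dangling or chained pointer: treat as not found
        return (target, target_entry[1])
    return (form, value)


if __name__ == "__main__":
    for word in ("sarcyminde", "sa'urmi'e", "nunklama", "kamblanu"):
        print(word, "->", lookup(word))
```

Looking up either spelling of a pair reaches the same definition, which is the point of treating the two forms as one word.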