Message-Id: <m0qqYd7-00005XC@xiron.pc.helsinki.fi>
Date:         Thu, 29 Sep 1994 23:25:26 -0400
Reply-To:     Logical Language Group <lojbab@ACCESS.DIGEX.NET>
Sender:       Lojban list <LOJBAN%CUVMB.bitnet@FINHUTC.hut.fi>
From:         Logical Language Group <lojbab@ACCESS.DIGEX.NET>
Subject:      Re: The lujvo-making algorithm
To:           Veijo Vilva <veion@XIRON.PC.HELSINKI.FI>
Content-Length: 2303
Lines: 40

ER>Is a lujvo that uses the three-letter rafsi whenever possible always the
ER>best lujvo? I'm thinking specifically of cases where the only three-letter
ER>rafsi has as many syllables (two) as the four- or five-letter one. The two
ER>qualities I think you want to maximize in a lujvo are (1) shortness,
ER>measured in number of syllables, and (2) recognizability, that is,
ER>similarity to the corresponding tanru - in that order. (This is when you're
ER>not writing poetry or otherwise paying much attention to the sound.)
ER>
ER>Should sa'urmi'e (sarcu minde) not be sarcyminde, for example? vi'ecpe
ER>(vitke cpedu) vitkycpe? You get more similarity without increasing the time
ER>it takes to say them (or by increasing it a tiny amount).
ER>
ER>Then there's the issue of simple abstraction. nunklama seems to be
ER>preferred over nunkla when not compounded with something else, why? I think
ER>there's a good reason. kamblanu is better than kambla, even though you'd
ER>save two phonemes and a syllable to boot.

The policy is, of course, that "sa'urmi'e" and "sarcyminde", as well as "nunkla"
and "nunklama", are (each pair) the same word - just spelled/pronounced
differently.  Which word-form you choose is thus a kind of dialect choice.

The dictionary will have each lujvo in the "best" form as determined by the
lujvo scoring algorithm.  If this is a shortened form, such as "sa'urmi'e"
or "nunkla", then the definition will appear with this entry, but there will
be a second entry for the fully unreduced form (sarcyminde and nunklama)
pointing to the short form, but giving no definition, etc.  Thus, if you see,
or invent, a Lojban word, you can know that there is a way to find it in the
dictionary if it is there - even if it is there in a different form - by
breaking it apart into rafsi and using that to look up the unreduced form,
which will either have the definition or point you to the shorter wordform
that has the definition.
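
As a rough illustration of the lookup scheme just described, here is a minimal
sketch.  The entry layout, the names DICTIONARY and lookup, and the placeholder
definition strings are all hypothetical, not the actual dictionary format, and
the step of breaking a word into rafsi to build the unreduced form is assumed
to have been done already by the reader.

```python
# Hypothetical layout: each key is a word-form.  An entry either carries
# the definition itself or points ("see") to the form that carries it.
DICTIONARY = {
    "sa'urmi'e": {"definition": "<definition text for sarcu + minde>"},
    "sarcyminde": {"see": "sa'urmi'e"},
    "nunkla": {"definition": "<definition text for nu + klama>"},
    "nunklama": {"see": "nunkla"},
}


def lookup(form, dictionary=DICTIONARY):
    """Return (defining_form, definition), or None if the form is absent."""
    entry = dictionary.get(form)
    if entry is None:
        return None
    if "see" in entry:
        # Unreduced entry with no definition: follow the pointer once.
        target = entry["see"]
        return target, dictionary[target]["definition"]
    return form, entry["definition"]


if __name__ == "__main__":
    print(lookup("sarcyminde"))  # resolved through the pointer entry
    print(lookup("nunkla"))      # short form holds the definition itself
```

Either spelling of a pair ends up at the same definition, which is the point of
keeping the second, definition-less entry.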

I do not at this point intend to get into subjective judgements such as you
have suggested, where "nunklama" might be 'better' than "nunkla".  Thus far we
have mostly native English speakers, and I suspect that decisions as to
priorities would be biased.  The current algorithm simply looks for the
shortest word in all cases.
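
A toy sketch of "pick the shortest form" follows.  It assumes that length is
counted in written characters, apostrophes included; that measure and the name
pick_shortest are assumptions made for illustration, and this is not the
actual scoring algorithm.

```python
def pick_shortest(forms):
    """Return the candidate with the fewest written characters.

    Assumption: length means character count, apostrophes included.
    Ties go to whichever candidate appears first in the list.
    """
    return min(forms, key=len)


if __name__ == "__main__":
    print(pick_shortest(["sarcyminde", "sa'urmi'e"]))
    print(pick_shortest(["nunklama", "nunkla"]))
```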

lojbab