Message-Id: <m0tFYzl-0000ZTC@xiron.pc.helsinki.fi>
Date:         Tue, 14 Nov 1995 22:57:03 -0500
Reply-To:     Logical Language Group <lojbab@ACCESS.DIGEX.NET>
Sender:       Lojban list <LOJBAN@CUVMB.BITNET>
From:         Logical Language Group <lojbab@ACCESS.DIGEX.NET>
Subject:      a better gismu algorithm?
To:           Veijo Vilva <veion@XIRON.PC.HELSINKI.FI>
Content-Length: 3435
Lines: 66

Mark Vines:
>> I have things that I would do over if it were back in 1987, (or better,
>> if we were starting over today, because remaking the gismu would take
>> a fraction of the time today that it took then due to the increased
>> computer power, and the algorithm I would like to use would be far more
>> compute intensive).
>
>What algorithm would you like to use, Lojbab, "if we were starting over
>today"?

Well, first, I would like to reconduct the tests that JCB says he
conducted in selecting his algorithm, verifying that recognizability is
not simply a hypothesis.

The changes I would make would be more specific in how they tuned the
Lojbanization to each language.  Arabic should give little value to vowel
matches or consonant match reversals.  English would accept any
unstressed vowel where English uses a schwa.  I have observed that
English speakers DO get some recognizability out of consonant reversals.

Initial and final consonants have higher recognition value; a single
initial consonant match should be worth a lot for recognition - more
than any 2-character match in the middle.
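
A minimal sketch of that weighting in Python, purely illustrative.  The
function names and the numeric weights are assumptions, chosen only so
that one edge-consonant match (3.0) outranks one interior 2-character
match (2.0).  Because gismu end in a vowel, "final consonant" is taken
here to mean the last consonant in each word, which is also an
assumption.

```python
VOWELS = set("aeiou")


def last_consonant(word):
    """Return the last non-vowel letter of word, or None if there is none."""
    for ch in reversed(word):
        if ch not in VOWELS:
            return ch
    return None


def positional_score(candidate, source, edge_weight=3.0, pair_weight=2.0):
    """Score candidate against source, favouring edge consonants.

    edge_weight and pair_weight are hypothetical values, not tuned ones.
    """
    score = 0.0
    # Initial consonant match.
    if candidate and source and candidate[0] == source[0] \
            and candidate[0] not in VOWELS:
        score += edge_weight
    # Final consonant match (last consonant of each word).
    final = last_consonant(candidate)
    if final is not None and final == last_consonant(source):
        score += edge_weight
    # Shared 2-character sequences, ignoring the first and last letters.
    inner_c, inner_s = candidate[1:-1], source[1:-1]
    pairs_c = {inner_c[i:i + 2] for i in range(len(inner_c) - 1)}
    pairs_s = {inner_s[i:i + 2] for i in range(len(inner_s) - 1)}
    score += pair_weight * len(pairs_c & pairs_s)
    return score
```

With these made-up weights, "prenu" against "person" scores on both the
initial "p" and the final "n" even though no interior pair matches.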

The algorithm should give considerable value to substituting a voiced
for an unvoiced consonant and vice versa if it results in a match.
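
One way this could be sketched in Python: fold each voiced consonant
onto its unvoiced partner before comparing, and give a voicing-only
match a large fraction of the value of an exact match.  The pairs use
Lojban orthography (j/c being the voiced/unvoiced pair); the 0.75
figure and the function names are assumptions for illustration only.

```python
# Voiced -> unvoiced partners in Lojban orthography.
VOICING_PAIRS = {"b": "p", "d": "t", "g": "k", "v": "f", "z": "s", "j": "c"}


def fold_voicing(word):
    """Replace every voiced consonant with its unvoiced partner."""
    return "".join(VOICING_PAIRS.get(ch, ch) for ch in word)


def letter_match_value(a, b, exact=1.0, voicing_only=0.75):
    """Value of matching letter a against letter b.

    voicing_only is a hypothetical weight standing in for
    "considerable value"; it is not a tested figure.
    """
    if a == b:
        return exact
    if fold_voicing(a) == fold_voicing(b):
        return voicing_only
    return 0.0
```

So "b" against "p" still earns most of the credit of an exact match,
while "b" against "k" earns nothing.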

I would also be inclined to map the Chinese phonology less strictly
according to IPA.  As it is, we have few 'o's in Lojban primarily
because the 'o's in Pinyin romanization seldom actually are realized as
'o', but rather as schwa.  But I think mapping to 'o' would be better
than to 'a', which is what we did because we mapped schwa to 'a' for
other languages.

For Russian, I would map to spelling rather than pronunciation.  That
Russian devoices final consonants is irrelevant to Lojban, because when
an ending is put on, the voicing is restored, and Lojban gismu do not
end with a consonant.
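
A small Python illustration of the difference, using a romanized
"gorod" (city) as the example.  The devoicing table is deliberately
simplified and the function name is made up; the point is only that a
pronunciation-based source form loses the "d" that the spelling, and
any suffixed form such as "goroda", keeps.

```python
# Simplified word-final devoicing over romanized Russian (an assumption:
# real Russian devoicing covers more consonants and clusters).
DEVOICE = {"b": "p", "d": "t", "g": "k", "v": "f", "z": "s"}


def devoice_final(word):
    """Return word with a word-final voiced consonant devoiced."""
    if word and word[-1] in DEVOICE:
        return word[:-1] + DEVOICE[word[-1]]
    return word


spelling_form = "gorod"                     # what mapping to spelling would use
spoken_form = devoice_final(spelling_form)  # what mapping to pronunciation would use
suffixed_form = devoice_final("goroda")     # ending added: voicing is kept
```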

Clearly I don't have a simple, clear-cut algorithm.  Just a realization
that simple phoneme matches are not the best basis for recognition.

>An important reason why Esperanto was so much more successful than
>Volapuk at recruiting learners is that Esperanto uses words which are
>mnemonic in many European languages -- with the occasional tidbit of
>Arabic or Japanese.  Lojban, like Loglan before it, uses the same
>principle on a global scale.  Hypothetically, that would give both Lojban
>& Loglan (or any globally sourced conlang) the potential to be more
>trans-culturally successful at recruitment than Esperanto.

But it remains primarily a propaganda stunt, though it DOES result in
words that seem pleasing to the ear due to the right irregular
distribution of sounds.  TLI Loglan was actually considered MORE mnemonic
by the people who looked at it, than Lojban is.  Why?  Because English
then had the dominant weight, more than Chinese, and the older Loglan
gismu thus had higher resemblances to English.  If we were selling this
principle to English speakers, and English did not have the remnant of
the British Empire to make its population count so high, then Lojban
would resemble a Hindi/Chinese meld, with some Spanish influence.  As it
is, because English incorporates so many Romance roots, it gets
reinforced by Spanish and still has an undue effect on the final words.
Look at the etymological matches between Lojban and Hindi to find out
how close English SHOULD have been without that Spanish influence (which
of course helps Lojban in Spanish resemblance as well).

lojbab