Date: Tue, 14 Nov 1995 22:57:03 -0500
Reply-To: Logical Language Group
Sender: Lojban list
From: Logical Language Group
Subject: a better gismu algorithm?
X-To: lojban@cuvmb.cc.columbia.edu
To: Veijo Vilva

Mark Vines:
>> I have things that I would do over if it were back in 1987, (or better,
>> if we were starting over today, because remaking the gismu would take
>> a fraction of the time today that it took then due to the increased
>> computer power, and the algorithm I would like to use would be far more
>> compute intensive).
>
>What algorithm would you like to use, Lojbab, "if we were starting over
>today"?

Well, first, I would like to reconduct the tests that JCB says he
conducted in selecting his algorithm, verifying that recognizability is
not simply a hypothesis.

The changes I would make would be more specific in how they tuned the
Lojbanization to each language. Arabic should give little value to
vowel matches or consonant match reversals. English would accept any
unstressed vowel where English uses a schwa. I have observed that
English speakers DO get some recognizability out of consonant
reversals. Initial and final consonants have higher recognition value;
a single initial consonant match should be worth a lot for recognition,
more than any 2-character match in the middle. The algorithm should
give considerable value to substituting a voiced for an unvoiced
consonant and vice versa if it results in a match.

I would also be inclined to map the Chinese phonology less strictly
according to IPA. As it is, we have few 'o's in Lojban primarily
because the 'o's in Pinyin romanization seldom actually are realized as
'o', but rather as schwa.
But I think mapping to 'o' would be better than to 'a', which is what
we did because we mapped schwa to 'a' for other languages.

For Russian, I would map to spelling rather than pronunciation. That
Russian devoices final consonants is irrelevant to Lojban, because when
an ending is put on, the voicing is restored, and Lojban gismu do not
end with a consonant.

Clearly I don't have a simple, clear-cut algorithm. Just a realization
that simple phoneme matches are not the best basis for recognition.

>An important reason why Esperanto was so much more successful than
>Volapuk at recruiting learners is that Esperanto uses words which are
>mnemonic in many European languages -- with the occasional tidbit of
>Arabic or Japanese. Lojban, like Loglan before it, uses the same
>principle on a global scale. Hypothetically, that would give both Lojban
>& Loglan (or any globally sourced conlang) the potential to be more
>trans-culturally successful at recruitment than Esperanto.

But it remains primarily a propaganda stunt, though it DOES result in
words that seem pleasing to the ear due to the right irregular
distribution of sounds. TLI Loglan was actually considered MORE
mnemonic by the people who looked at it than Lojban is. Why? Because
English then had the dominant weight, more than Chinese, and the older
Loglan gismu thus had higher resemblances to English.

If we were selling this principle to English speakers, and English did
not have the remnant of the British empire to make its population count
so high, the Lojban would resemble a Hindi/Chinese meld, with some
Spanish influence. As it is, because English incorporates so many
Romance roots, it gets reinforced by Spanish and still has an undue
effect on the final words. Look at the etymological matches between
Lojban and Hindi to find out how close English SHOULD have been without
that Spanish influence (which of course helps Lojban in Spanish
resemblance as well).

lojbab
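
A minimal sketch of how the per-language tuning described above might
look in code. It is not the algorithm actually used to make the gismu,
and the post itself says no clear-cut algorithm exists; every name here
(LanguageProfile, score, the ENGLISH and ARABIC profiles) and every
numeric weight is an assumption made up for illustration. The only
constraints taken from the post are that an initial consonant match
outweighs any 2-character match in the middle, that a voiced/unvoiced
swap still earns credit, and that vowel matches and consonant reversals
count for less in Arabic than in English. Treating "final consonant" as
the last consonant in each word is also an assumption.

```python
from dataclasses import dataclass

VOWELS = set("aeiou")

# Lojban's voiced/unvoiced consonant pairs, stored in both directions.
VOICING = {"p": "b", "t": "d", "k": "g", "f": "v", "s": "z", "c": "j"}
VOICING.update({voiced: unvoiced for unvoiced, voiced in list(VOICING.items())})

# Placeholder weights (assumptions, not values from the post).
INITIAL_WEIGHT = 3.0   # a single initial consonant match
FINAL_WEIGHT = 2.0     # a match on the last consonant of each word
MID_PAIR_WEIGHT = 1.0  # a 2-letter match in the middle, kept below INITIAL_WEIGHT
VOICING_CREDIT = 0.5   # fraction of credit kept when only the voicing differs


@dataclass(frozen=True)
class LanguageProfile:
    vowel_weight: float     # multiplier for middle matches that involve a vowel
    reversal_weight: float  # credit for two adjacent consonants matched in reverse


# Hypothetical profiles: reversals help English speakers, and Arabic
# gives little value to vowel matches or reversals.
ENGLISH = LanguageProfile(vowel_weight=1.0, reversal_weight=0.5)
ARABIC = LanguageProfile(vowel_weight=0.25, reversal_weight=0.0)


def similarity(a, b):
    """1.0 for the same letter, partial credit for a voicing swap, else 0.0."""
    if a == b:
        return 1.0
    if VOICING.get(a) == b:
        return VOICING_CREDIT
    return 0.0


def consonants(word):
    return [ch for ch in word if ch not in VOWELS]


def score(candidate, source, profile):
    """Score a candidate word against a Lojbanized source word.

    Both arguments are assumed to be non-empty lowercase strings.
    """
    total = 0.0
    c_cons, s_cons = consonants(candidate), consonants(source)

    # Initial consonant: the single most valuable match.
    if candidate[0] not in VOWELS and source[0] not in VOWELS:
        total += INITIAL_WEIGHT * similarity(candidate[0], source[0])

    # Final consonant (last consonant of each word).
    if c_cons and s_cons:
        total += FINAL_WEIGHT * similarity(c_cons[-1], s_cons[-1])

    # 2-letter matches past the first letter, discounted per language
    # when the pair contains a vowel.
    for i in range(1, len(candidate) - 1):
        pair = candidate[i:i + 2]
        if pair in source[1:]:
            factor = profile.vowel_weight if VOWELS & set(pair) else 1.0
            total += MID_PAIR_WEIGHT * factor

    # Adjacent consonants that appear in the source in reversed order.
    source_pairs = set(zip(s_cons, s_cons[1:]))
    for x, y in zip(c_cons, c_cons[1:]):
        if x != y and (y, x) in source_pairs:
            total += profile.reversal_weight

    return total
```

Calling score("tabmo", "dabom", ENGLISH) and then the same pair with
ARABIC (both strings are toy inputs, not real etymologies) shows the
same candidate ranking lower under the Arabic profile, because the one
middle match involves a vowel. A real reimplementation would also need
the per-language Lojbanization rules (schwa handling for English and
Chinese, spelling-based mapping for Russian), which this sketch leaves
out.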