From: "Bob LeChevalier (lojbab)" <lojbab@lojban.org>
Date: Fri, 01 Sep 2000 20:54:01 -0400
Subject: Re: [lojban] Re: vowel counts

At 04:44 PM 09/01/2000 +0200, Daniel Gudlat wrote:
>coi rodo
>.i la jildicnen cu cusku di'e
> > I wrote a short perl script to count the vowels in the gismu in the
> > official word list and it came up with this count:
> >
> > ending i = 448
> > ending a = 335
> > ending u = 251
> > ending e = 158
> > ending o = 150
> >
> > midvowel a = 510
> > midvowel i = 353
> > midvowel u = 201
> > midvowel e = 186
> > midvowel o = 92
> >
> > total a = 845
> > total i = 801
> > total u = 452
> > total e = 344
> > total o = 242
> >
> > 'a' and 'i' win by a fair margin over the others... i wonder why that
> > is.
>
>Several possible reasons come to mind:
>
>a) vowel distribution in the source languages: I don't know anything
>about the vowel distribution in Chinese, Hindi or Russian, but Arabic only
>has a, i, u, AFAIK. So this would tend to temper the English prevalence
>of e quite a bit, I imagine.

Ah, but the English prevalence of "e" is in spelling, and not in sound.
Remember that the Lojban "e" maps only the SHORT e of "bet".
The long "e" of "meet" is mapped to Lojban "i", and the schwa of "the" is
mapped to "y", though in gismu making it was mapped to "a".

>b) maximal separation of sounds: As far as vowels are concerned, a and i
>(and u) are maximally separated and thus make for easier word
>recognition in noisy environments. So this may have been a design
>choice.

Not a conscious factor.

>c) Diphthongs: ai, ei, oi, and au are the lojban standard diphthongs and
>strongly favor i and a.

Not directly relevant, but close. In making gismu, we rewrote source
language words using Lojban phonemes, and those 4 diphthongs are far more
common than others. More importantly, pretty much all diphthongs have an
"i" or "u", heightening those sound frequencies.

>Any other takers?

One other factor is that when we mapped Chinese to Lojban, I used a table
found in a Chinese government publication on IPA mappings of the Chinese
sounds. Relatively few Chinese sounds map to something in Lojban that
contains an "o", so "o" in particular is underrepresented in the language.
Instead, it mapped to schwa, which we were habitually mapping to "a" at
that point. The schwa mapping to "a" enhanced that letter's frequency and
nearly killed "o" as a vowel, since most Lojban words get one if not both
vowels from the Chinese. (English also does not use a long "o" sound all
that often.)

If we were doing the whole thing over again, we might make different rules
on how to map sounds and spellings, and build gismu so as to heighten
contrasts between sounds in the source language to maximize
recognizability in written text, rather than merely mimicking
pronunciation norms to maximize recognition of spoken sounds. But such a
remake would be unthinkable at this point other than for intellectual
curiosity (and anyone with the time for it probably has a million other
things more worthwhile to do; even with very fast computers, this would be
a long and tedious job).
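
For anyone who wants to reproduce counts like the ones quoted earlier in
this message, here is a minimal Python sketch. It is not the Perl script
mentioned in the quote, which was not posted; the function name
count_gismu_vowels and the sample word list are made up for illustration.
It assumes every gismu is five letters long, shaped CVCCV or CCVCV, so
each one has exactly one vowel before the final vowel.

```python
from collections import Counter

VOWELS = "aeiou"

def count_gismu_vowels(words):
    """Tally final and non-final ("mid") vowels over five-letter gismu.

    Hypothetical helper: assumes each gismu is CVCCV or CCVCV, so it has
    one final vowel and exactly one other vowel. Anything else is skipped.
    """
    ending, mid = Counter(), Counter()
    for word in words:
        w = word.strip().lower()
        # Skip entries that are not gismu-shaped.
        if len(w) != 5 or w[-1] not in VOWELS:
            continue
        inner = [c for c in w[:-1] if c in VOWELS]
        if len(inner) != 1:
            continue
        ending[w[-1]] += 1
        mid[inner[0]] += 1
    # Adding two Counters sums the per-vowel tallies.
    return ending, mid, ending + mid

# Tiny illustrative sample, not the official word list.
sample = ["klama", "gismu", "prenu", "blanu", "cukta"]
ending, mid, total = count_gismu_vowels(sample)
for label, counts in (("ending", ending), ("midvowel", mid), ("total", total)):
    for vowel, n in counts.most_common():
        print(f"{label} {vowel} = {n}")
```

To run it over a real list, read one word per line from a file and pass
the lines in place of the sample.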

lojbab
--
lojbab                                             lojbab@lojban.org
Bob LeChevalier, President, The Logical Language Group, Inc.
2904 Beau Lane, Fairfax VA 22031-1303 USA                    703-385-0273
Artificial language Loglan/Lojban: http://www.lojban.org