From: Nick NICHOLAS <nicholas@uci.edu>
Date: Wed, 8 Aug 2001 12:40:13 -0700 (PDT)
Subject: unicode Digest V1 #134 (fwd)

With cultural fu'ivla and transliteration in the air again, I found the following on the unicode mailing list timely.

--
Date: Tue, 7 Aug 2001 17:24:06 -0700 (PDT)
From: Kenneth Whistler
Subject: Re: Names of languages each expressed in their own language

William Wolverington suggested:

> I wonder if there already exists, or could we devise, a list of the names of
> languages each expressed in their own language please.
>
> It would be helpful if the Unicode Consortium might kindly include such a
> list on its website, as that would then give the list considerable
> provenance for accuracy.

While it would be nice to have such a list easily available, I am sure that the Unicode Consortium is not the right body to develop it, nor the right website to post it.
Perhaps a very short subset of the potential list might be a useful adjunct to the website of one of the major internationalization/localization services companies, listing those languages that might be most likely to be of widespread commercial significance for translation, and hence for language menu choices.

[...]

However, the problem is enormously complex. In addition to the 6800+ living languages Peter mentions, there are also all the major and minor extinct languages, each of which has at least some technical name and maybe many other names in many other languages, some of which themselves are extinct, of course. Among some of those we know, for example, would be things like phrygios (written in Greek of course), which would be the name in Classical Greek (extinct) of Phrygian (also extinct).

Then, no one can really tell "dialect" apart from "language", so you end up with all the dialect names as well, and would have to sift through that mass to figure out what to list.

Then there is the problem of just what a "language name" is in the first place. This is anthropologically and sociologically tricky. Many small aboriginal groups may not have had a "name" for their language in the same sense that taxonomizing Europeans tended to favor. What I speak may just be known as the "speech of the people" or some such, as opposed to the "speech of hot-springs-village" and the "speech of river-fork-village" and so on, referring to groups around you by their village names or other geographic references. Are those "language names"?

Often such terms or pieces of them get picked up by an anthropologist and are then asserted to be the "name" of the language. Example: Wintu, a language in North Central California: the word "wintu" just means "a person" in Wintu, and wasn't used in the way we would use "English" for designating a language, although you could translate "what people speak" using it.
In other instances a language name picked up by somebody is actually somebody else's pejorative name for some other group. The name sticks, but it isn't what the group itself might use for itself (its autonym).

In any case, as for nearly everything having to do with *language* classification, as opposed to *character* classification, this whole area is a black hole of effort that is essentially outside the charter of the Unicode Consortium, in my opinion.

--Ken