From nicholas@uci.edu Wed Aug 08 12:40:19 2001
Return-Path: <nicholas@uci.edu>
X-Sender: nicholas@uci.edu
X-Apparently-To: lojban@yahoogroups.com
Received: (EGP: mail-7_2_0); 8 Aug 2001 19:40:19 -0000
Received: (qmail 77160 invoked from network); 8 Aug 2001 19:40:14 -0000
Received: from unknown (10.1.10.142) by l7.egroups.com with QMQP; 8 Aug 2001 19:40:13 -0000
Received: from unknown (HELO e4e.oac.uci.edu) (128.200.222.10) by mta3 with SMTP; 8 Aug 2001 19:40:13 -0000
Received: from localhost (nicholas@localhost) by e4e.oac.uci.edu (8.9.3/8.9.3) with ESMTP id MAA17483; Wed, 8 Aug 2001 12:40:13 -0700 (PDT)
X-Authentication-Warning: e4e.oac.uci.edu: nicholas owned process doing -bs
Date: Wed, 8 Aug 2001 12:40:13 -0700 (PDT)
X-Sender: <nicholas@e4e.oac.uci.edu>
To: <lojban@yahoogroups.com>
Cc: Nick NICHOLAS <nicholas@uci.edu>
Subject: unicode Digest V1 #134 (fwd)
Message-ID: <Pine.GSO.4.30.0108081236180.13874-100000@e4e.oac.uci.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=iso
Content-Transfer-Encoding: 8BIT
From: Nick NICHOLAS <nicholas@uci.edu>


With cultural fu'ivla and transliteration in the air again, I found the
following on the unicode mailing list timely.

--

Date: Tue, 7 Aug 2001 17:24:06 -0700 (PDT)
From: Kenneth Whistler <kenw@sybase.com>
Subject: Re: Names of languages each expressed in their own language

William Wolverington suggested:

> I wonder if there already exists, or could we devise, a list of the names of
> languages each expressed in their own language please.
>
> It would be helpful if the Unicode Consortium might kindly include such a
> list on its website, as that would then give the list considerable
> provenance for accuracy.

While it would be nice to have such a list easily available, I am sure
that the Unicode Consortium is not the right body to develop it, nor
the right website to post it.

Perhaps a very short subset of the potential list might be a useful
adjunct to the website of one of the major internationalization/
localization services companies, listing those languages that might
be most likely to be of widespread commercial significance for
translation, and hence for language menu choices.

[...]

However, the problem is enormously complex. In addition to the 6800+
living languages Peter mentions, there are also all the major and
minor extinct languages, each of which has at least some technical name
and maybe many other names in many other languages, some of which themselves
are extinct, of course. Among some of those we know, for example, would
be things like phrygios (written in Greek of course), which would be
the name in Classical Greek (extinct) of Phrygian (also extinct).

Then, no one can really tell "dialect" apart from "language", so you
end up with all the dialect names as well, and would have to sift
through that mass to figure out what to list.

Then there is the problem of just what a "language name" is in the
first place. This is anthropologically and sociologically tricky.
Many small aboriginal groups may not have had a "name" for their
language in the same sense that taxonomizing Europeans tended to
favor. What I speak may just be known as the "speech of the people"
or some such, and opposed to the "speech of hot-springs-village"
and the "speech of river-fork-village" and so on, referring to
groups around you by their village names or other geographic
references. Are those "language names"? Often such terms or pieces
of them get picked up by an anthropologist and are then asserted
to be the "name" of the language. Example: Wintu, a language
in North Central California: the word "wintu" just means "a person"
in Wintu, and wasn't used in the way we would use "English" for
designating a language, although you could translate "what people
speak" using it. In other instances a language name picked up by
somebody is actually somebody else's pejorative name for some
other group. The name sticks, but it isn't what the group itself
might use for itself (its autonym).

In any case, as for nearly everything having to do with *language*
classification, as opposed to *character* classification, this whole
area is a black hole of effort that is essentially outside the
charter of the Unicode Consortium, in my opinion.

--Ken



