From marob!hombre!uunet!cbmvax!snark!lojbab Mon Dec 10 07:47:25 1990
Received: by magpie.MASA.COM (smail2.5)
	id AA17303; 10 Dec 90 07:47:23 EST (Mon)
Received: by marob.uucp (/\=-/\ Smail3.1.18.1 #18.1)
	id <m0iimjk-0002SJC@marob.uucp>; Mon, 10 Dec 90 07:36 EST
Received: by hombre.MASA.COM (smail2.5)
	id AA06222; 10 Dec 90 07:17:14 EST (Mon)
Received: from cbmvax.UUCP by uunet.UU.NET (5.61/1.14) with UUCP 
	id AA24448; Sat, 1 Dec 90 22:43:35 -0500
Received: by cbmvax.cbm.commodore.com (5.57/UUCP-Project/Commodore Jan 13 1990)
	id AA27690; Sat, 1 Dec 90 22:35:18 EST
Received: by snark.thyrsus.com (/\=-/\ Smail3.1.18.1 #18.14)
	id <m0ifk1s-00025fC@snark.thyrsus.com>; Sat, 1 Dec 90 22:06 EST
Received: by snark.thyrsus.com (/\=-/\ Smail3.1.18.1 #18.14)
	id <m0icv2S-0001NxC@snark.thyrsus.com>; Sat, 24 Nov 90 03:15 EST
Message-Id: <m0icv2S-0001NxC@snark.thyrsus.com>
Date: Sat, 24 Nov 90 03:15 EST
From: marob!uunet!cbmvax!snark.thyrsus.com!lojbab (Bob LeChevalier)
To: cowan@marob.masa.com, eubanks@uhunix.uhcc.hawaii.edu,
        jeannec@well.sf.ca.us
Subject: 2nd try at Prothero discussion re xmit
Status: RO

Enclosed are the three messages containing the essence of the discussion
of the last month.  Jeanne apparently got none of these, and Brian perhaps
only some of them.  John Cowan, who will normally handle requests for back
messages, could not retrieve these since it had been too long, so I am sending
him a copy in case the topic comes up, and to include in a file on Lojban
and Esperanto we will be making available by ftp file server.

Date: Fri, 12 Oct 90 12:09:27 -0700
From: Jeff Prothero <cbmvax!uunet!milton.u.washington.edu!jsp>
To: lojban-list@snark.thyrsus.com
Subject: Book review (long)

I've been poking through the Linguistics section of the campus
library, and found a book which might interest other Loglanists:

 Trends in Linguistics
 Studies and Monographs 42:

   Interlinguistics
 Aspects of the Science of Planned Languages

  Klaus Schubert (Ed.)

 Mouton de Gruyter 1989 ISBN 3-11-011910-2

The book is 350 pages, in print, and costs $45 in Seattle.

"This book ... is an invitation to all those interested in languages
and linguistics to make themselves acquainted with some recent streams
of scientific discussion in the field of planned languages."

The book is a collection of fifteen recent papers in interlinguistics.
For folks who (like me) haven't been following the field, the
bibliographies provide an up-to-date set of pointers into the
literature, plus some overviews of it.  I think the table of contents
gives an adequate idea of the scope and focus of the book:

--------------------------------------------------------------------

Table of contents:

Part I: Introductions

Andre Martinet: The proof of the pudding
Klaus Schubert: Interlinguistics - its aims, its achievements,
                and its place in language science.

Part II: Planned Languages in Linguistics
Aleksandr D Dulicenko: Ethnic language and planned language.
Detlev Blanke: Planned languages - a survey of some of the
               main problems.
Sergej N Kuznecov: Interlinguistics: a branch of applied linguistics?

Part III: Language Design and Language Change

Dan Maxwell: Principles for constructing Planned Languages
Francois Lo Jacomo: Optimization in language planning
Claude Piron: A few notes on the evolution of Esperanto

Part IV: Sociolinguistics and Psycholinguistics

Jonathan Pool - Bernard Grofman: Linguistic artificiality
              and cognitive competence
Claude Piron: Who are the speakers of Esperanto
Tazio Carlevaro: Planned auxiliary language and communicative
                 competence.

Part V: The Language of Literature
Manuel Halvelik: Planning nonstandard language
Pierre Janton: If Shakespeare had written in Esperanto

Part VI: Grammar
Probal Dasgupta: Degree words in Esperanto and categories
                 in Universal Grammar
Klaus Schubert: An unplanned development in planned languages.

Part VII: Terminology and Computational Lexicography

Wera Blanke: Terminological standardization - its roots and
             fruits in planned languages
Rudiger Eichholz: Terminics in the interethnic language
Victor Sadler: Knowledge-driven terminography for machine translation

Index

--------------------------------------------------------------------

I'm not a linguist, and won't attempt to review the book from a
linguistics point of view, but I will highlight some points of
particular interest to Loglanists:

First, there is some mention of Loglan (and the thousand-odd other
artificial language projects to date), but the bulk of the focus is on
Esperanto, for the simple reason that 99.9% of fluent planned-language
users speak Esperanto, and a similar percentage of the written-text
corpus from the planned language community is in Esperanto.  (Any
Loglanists who cannot tolerate mention of That Language are invited to
stop reading at this point. :-)

Second, I (and perhaps most Loglanists) was unaware of the Distributed
Language Translation project, which seems to be of considerable
potential interest to Loglanists.  Quoting copyrighted material
without permission:

"Distributed Language Translation is the name of a long-term research
and development project carried out by the BSO software house in
Utrecht with funding from the Netherlands Ministry of Economic
Affairs.  For the present seven year period (1985-1991) it has a
budget of 17 million guilders... Although much larger in size than
earlier attempts, DLT started off as just another project of the
second stage, using Esperanto as its intermediate language.  Esperanto
had been judged suitable for this purpose because of its highly
regular syntax and morphology and because its agglutinative nature
promised an especially efficient possibility of morpheme-based coding
of messages for network transmission.  During the course of the first
years of the large-scale practical development, however, the role of
Esperanto in the DLT system increased substantially.  The intermediate
language took over more and more processes originally designed to be
carried out either in the source or in the target languages of the
multilingual system.  When I consider the DLT system to be one step
more highly developed than the earlier implementations involving
Esperanto, it is because the increase in the role of Esperanto was due
to intrinsic qualities of Esperanto as a planned language.  In other
words, Esperanto is in DLT no longer treated as any other language
(which incidentally has a somewhat more computer-friendly grammar than
other languages), but it is now used in DLT for a large part of the
overall translation process _because of its special features as a
planned language_.  Some facets of this complex application are
discussed by Sadler (in this volume.)

"The functions fulfilled in DLT by means of Esperanto are numerous.
Generally speaking one can say that since the insight about the
usefulness of a planned language's particular features for
natural-language processing, the whole DLT system design has tended to
move into the Esperanto part of the system all functions that are not
specific for particular source or target languages.  These are all
semantic and pragmatic processes of meaning disambiguation, word
choice, detection of semantic deixis and reference relations, etc.
So-called knowledge of the world has been stored in a lexical
knowledge bank and is consulted by a word expert system.  All these
applications of Artificial Intelligence are in DLT carried out
entirely in Esperanto.  Let it be said explicitly: Esperanto does not
serve as a programming language (DLT is implemented in Prolog and C),
but as a human language which renders the full content of the source
text being translated with all its nuances, disambiguates it and
conveys it to the second translation step to a target language."

Obviously, the existence of significant amounts of fully
disambiguated, machine-processable Esperanto text in such a
translation system opens up the possibility of wholesale mechanical
translation into Loglan.  This would be, obviously, particularly easy
if the (currently poorly-defined) semantics of the Loglan affix system
were brought into line with the existing semantics of the Esperanto
affix system.  In this case, bidirectional mechanical translation
between the two languages might become quite easy, possibly producing
sort of an "instant literature" for the Loglanist.

Building a simple correspondence between Esperanto and Loglan affixes
is not as far-fetched an idea as it might first seem.  Esperanto, like
Loglan, uses a single root-stock of affixes which may be arbitrarily
concatenated to form compound words.  Where Loglan assigns *two* forms
to (most) concepts, a pred and an affix, Esperanto uniformly assigns
only a single affix (cutting the learning load in half!), but this
poses no particular intertranslation problem.  Loglan affixes are
designed to be uniquely resolvable, and Esperanto affixes are not, but
this problem has evidently already been solved, hence again poses no
particular problem to bidirectional translation.  Again, Loglan has a
(putatively) unambiguous grammar which Esperanto lacks, but this
problem has apparently already been satisfactorily resolved at the
Esperanto end.

 ----------------------------------------------------------------

Elsewhere on the affix front, Loglan has a set of affixes, but has
barely begun the enormous task of building the compound-word
vocabulary.  Loglan could learn from Esperanto on (at least) two
levels.

Most obviously, bringing the Loglan affix system into semantic
correspondence with the Esperanto affix system would open the door to
wholesale borrowing of Esperanto compound metaphors, capitalising on
the planned language community's multimegamanyear investment.  Unless
there are sound engineering concerns to the contrary (I see none),
there seems no reason to idly re-invent a wheel of this magnitude.
This ain't a DOD project, folks :-) There will be language bigots on
both sides opposed in principle to any cooperation, of course...

Less obviously, Loglan may be able to benefit from the design
knowledge gained from a century's experience with, and linguistic
study of, the Esperanto affix system.  Klaus Schubert's paper "An
unplanned development in planned languages: A study of word grammar"
is suggestive.  Zamenhof, like Jim Brown, paid no particular attention
to word formation in his original design, simply providing a uniform
stock of primitives which could be concatenated at will to create new
words.

Despite this lack of conscious planning, linguistic study of word
formation in Esperanto (started by Rene de Saussure -- not to be
confused with Ferdinand de Saussure -- and continued by Sergej Kuznecov
and others) shows that this simple *syntactic* combination rule has
supported the development of a systematic set of *semantic* combination
rules.
These (unwritten and unconscious but nevertheless universal) semantic
combination rules allow the Esperantist, when faced with an unfamiliar
compound word, to not only decompose it into (usually) familiar
primitives, but also to somewhat systematically deduce the meaning of
the word.  Recent decades have apparently seen increasingly free use
of these facilities.

I won't attempt a summary of these semantic rules here, but will try
to give the flavor.  Even though the primitive stock *syntactically*
forms a single neutral pool, it appears that prims are *semantically*
treated in word combination by Esperantists as being divided into
noun, verb and modifier (combined adverb/adjective) classes, which
combine with distinctively different rules.  This distinction provides
one dimension for sorting prims.

A second, orthogonal dimension sorts prims into the categories
independent morpheme, declension morpheme, ending (these first three
correspond roughly to Loglan's "little words"), affixoid, affix and
root (these final three correspond to the Loglan affix set).  These
affix types combine according to a word-compounding grammar which
allows the listener to distinguish (among other things) those
compounds whose meaning is directly deducible from the meaning of the
component prims, from those compounds whose meaning is metaphorical
and must be learned.

If Loglan were to borrow the Esperanto compound vocabulary wholesale,
it would of course, willy nilly, inherit these semantic regularities
as well.  Otherwise, it might be well to study these regularities and
consciously incorporate them in the Loglan vocabulary.

 -- Jeff  jsp@milton.u.washington.edu


The first response, from lojbab:

A response to issues raised in Jeff Prothero's book review of a book
on Interlinguistics, dated 12 Oct 1990.
(Contact uunet!milton.u.washington.edu!jsp if you didn't get and want this
review.)

1. Of the authors, Detlev Blanke is on our mailing list, but probably too
recently to have based anything he wrote on our material.

2. Jeff's description of the Netherlands translation project is good; we were
certainly aware of it.  Unfortunately, all descriptions of it were too short
and copyrighted, so I have had nothing authoritative that I could include in
JL.  I'll try to put something together for next issue.

3. The Netherlands project is based on Esperanto - but with a caveat.  It uses
a formalized 'written' Esperanto form that may be slightly different from
spoken forms, but most importantly has disambiguating information encoded in
the way the language is written.  For example, grouping of modifiers (our
'pretty little girls school' problem) is solved by using extra SPACES to
disambiguate which terms modify which.

4. Esperanto's affix system is similarly ambiguous, though not as bad as 1975
Loglan was.  I've been given a few examples.  Some handy ones are 'romano'
which is either a novel (root + no affix) or Roman (root Romo = Rome plus
affix -an-) and 'banano' which is either 'banana' or 'bather' (from 'bano'
= bath + -an- again).  I've been told there are others.  This type of
ambiguity presents no problem to a machine translator, which can store hyphens
to separate affixes etc.

5. I have not investigated Esperanto's affix system thoroughly, but it is not
compatible with Lojban's.  (We did ensure at one point that we had gismu, and
therefore rafsi, corresponding to each of the affixes, though.)  Simply put,
Lojban has rafsi for EACH of its gismu.  Esperanto has only a couple of dozen,
and a MUCH larger root set.  Some Esperanto affixes have several Lojban
equivalents.  For example, we now have "na'e", "no'e" and "to'e" for scalar
negation of various sorts to correspond to Esperanto's "mal-".  Note that
Jeff did not mention the large root set in his comments.  Most of these roots
are combined by concatenation, like German.  But apparently as often as not
a new root is coined rather than formed by concatenation, since Esperanto has
no stigma attached to borrowing.  But it is not true that Lojban has two forms
while
Esperanto only has one.

6. The Esperanto affix/semantic system is probably even more poorly defined
than Lojban's.  As Jeff said, it is largely intuitive; this means independent
of a rule system.  However, there are rules; this was mentioned a few times in
the recent JL debates between Don Harlow, Athelstan and myself.  A guy named
Kalocsay apparently wrote up the rules early in this century; they are some
40-50 pages long and most Esperantists never read them, much less learn them.
They also are apparently rather freely violated in actual usage; they were
descriptive of the known language, not prescriptive.  By the way, I suspect
that Lojban's compounding semantics is actually better-defined than it seems.
I just don't know enough about semantic theory to attempt to write it up.
Jim Carter wrote a paper several years ago, which we can probably offer for
distribution (or he can), on the semantics of compound place structures.  We
haven't adopted what he has said whole-hog, but it certainly has been
influential.

7. We will probably make extensive use of Esperanto dictionaries when we start
our buildup of the Lojban lujvo vocabulary.  We thus will not reinvent the
wheel in totality. BUT, we cannot do this freely for a large number of reasons.

a) our root set is different than theirs.  Some of their compounds will thus
not work.  The same is true of old Loglan words.  We've been held up on
translating Jim Carter's Akira story (the one he uses in all his guaspi
examples) from old Loglan to Lojban by this need to retranslate all the
compounds (which he used extensively and in ways inconsistent with our
current, better defined semantics).

b) as mentioned above, our affixes are not in 1-to-1 correspondence.

c) their compounds undoubtedly have a strong European bias.  I doubt if it
is as bad as Jim Brown's (who built the compound for 'to man a ship' from the
metaphor 'man-do'; i.e. 'to do as a man to'.  He also did 'kill' as 'dead-make'
where 'make' is the concept 'to make ... from materials ...'  Sounds more like
Frankenstein to me, folks.)  But I suspect Esperanto has a few zingers in
there.  Indeed, I understand the Ido people criticized Esperanto most
significantly for its illogical word building, though I don't have details.
Perhaps Bruce Gilson (new to the list) could explain with examples??? And the
Esperantists among us will almost certainly have counters (Oboy Oboy!!! A
lively discussion!  Let's not get violent though.)  I also intend to draw
heavily from Chinese, which has a more Lojbanic tanru 'metaphor' system
since it doesn't distinguish between nouns, verbs, and adjectives.  Esperanto
tries to get around this by allowing relatively free conversion between these
categories, but the root concepts are taken from European languages that
more rigidly categorize words, and their compounds probably reflect European
semantics.

d) Most importantly, Esperanto words are not gismu.  They do not have place
structures.  Lojban words do, and the affix semantics and compound semantics
must be consistent with those place structures.  We've covered this in previous
discussions in the guise of warning against 'figurative' metaphors that are
inconsistent with the place structures.

e) Nope.  Most important is another reason.  Lojban is its own language.
It should not be an encoded Esperanto any more than it should be an encoded
English.  I suspect that just like English words, Esperanto words sometimes
have diverse multiple context-dependent meanings (though again perhaps less
severely than English).  We want to minimize this occurrence in Lojban if not
prevent it (we may not succeed, but we can try - the rule that every word
created must have a place structure is a good start.)

The bottom line is that each Esperanto word must be checked for validity, just
like any other lujvo proposal, but must also be translated into its closest
equivalent Lojban tanru as well, and have a place structure written, etc.  The
bulk of dictionary writing is this other work.  I can and have made new tanru/
lujvo (without working out the place structures) at the rate of several per
MINUTE for related concepts.  Coranth D'Gryphon posted a couple hundred
proposals to this list last December (that no one commented on), which he made
based on English definitions.  We have perhaps 200 PAGES of word proposals to
go through.  Nearly all of these have no place structures defined or are
defined haphazardly.

Lojban also has a multi-man-year investment behind it, though not 'mega'.  No,
Jeff, we aren't a DOD project, but in terms of people working on it and time
spent, we've far exceeded many such projects.  And word-building, whether for
better or worse, has received the greatest portion of that effort, since that
is all most people have felt competent to work on.  (Incidentally, the
Netherlands project IS a government sponsored project, if not defense-related.
If we had several million dollars, I think we'd be well along the way to a
translator ourselves.  Sheldon Linker has claimed that he could do a Lojban
conversing program with heuristic 'understanding' a la HAL 9000 in 5 man-years.
This is, in my mind, of comparable difficulty to a heuristic translation
program.  Any comments out there from those who know more than I do on this
subject?)

--lojbab


And following is a response from Mike Urban, a noted Esperantist as well as
Lojbanist:

Date: Mon, 29 Oct 90 11:19:12 -0800 (PST)
From: Michael Urban <cbmvax!uunet!monty!rand.org!urban>
Subject: Re: response to J. Prothero book review and comments of 12 Oct 90

While I am a dyed-in-the-wool Esperantist, I agree that attempting to modify or
extend Lojban in imitation of various features of Esperanto would be a mistake
(I also lose patience with reformers who want to Lojbanify aspects of
Esperanto).

Esperanto's `affix system is ambiguous' to the extent that the language itself
is indeed lexically ambiguous.  Not only `affixes' but roots themselves are
combinable, and so it is possible to come up with endless puns like the
`ban-ano' ones you mentioned (`literaturo' might be a tower of letters, i.e., a
`litera turo').  Without the careful, but somewhat restrictive, phonological
rules that Loglan or Lojban provides, this kind of collision is inevitable.
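
The collision can be shown mechanically.  What follows is only a minimal
sketch in Python, not anyone's actual parser: the tiny lexicon, the function
names, and the rule that a bare vowel may link two elements are all
assumptions made up for illustration.  It lists every way a word can be cut
into known pieces, and the puns above each come back with two readings.

```python
# Toy lexicon: an assumption for illustration, not a real Esperanto dictionary.
ROOTS = {
    "roman": "novel", "rom": "Rome",
    "banan": "banana", "ban": "bath",
    "literatur": "literature", "liter": "letter", "tur": "tower",
}
SUFFIXES = {"an": "member/inhabitant"}
ENDINGS = {"o": "noun", "a": "adjective"}


def split_stem(stem, first=True):
    """Yield every list of known morphemes that exactly covers `stem`.

    The first piece must be a root; later pieces may be roots or suffixes.
    A single ending vowel may sit between two pieces (hypothetical rule,
    used here to model readings such as liter-a-tur-o).
    """
    if not stem:
        if not first:
            yield []
        return
    for i in range(1, len(stem) + 1):
        piece, rest = stem[:i], stem[i:]
        if piece in ROOTS or (not first and piece in SUFFIXES):
            # Plain concatenation: the next morpheme starts immediately.
            for tail in split_stem(rest, False):
                yield [piece] + tail
            # Optional linking vowel, only if more material follows it.
            if rest[:1] in ENDINGS and len(rest) > 1:
                for tail in split_stem(rest[1:], False):
                    yield [piece, rest[0]] + tail


def segmentations(word):
    """Return all hyphenated readings of `word` under the toy lexicon."""
    word = word.lower()
    if len(word) < 2 or word[-1] not in ENDINGS:
        return []
    stem, ending = word[:-1], word[-1]
    return ["-".join(parts + [ending]) for parts in split_stem(stem)]


if __name__ == "__main__":
    for w in ("romano", "banano", "literaturo"):
        print(w, "->", segmentations(w))
```

Under these assumptions each word yields exactly two cuts (for example
rom-an-o and roman-o), which is the whole point: nothing in the spelling
itself says which one was meant, and a stored hyphen does.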

The borrowing of words in Esperanto (`neologisms') instead of using a compound
form is a controversial topic.  Claude Piron, in his recent book, `La Bona
Lingvo', argues (quite convincingly, I think) that the tendency of *some*
esperantists to use neologisms, usually from French, English, or Greek, is
partly based on pedanticism, partly based on Eurocentrism (``you mean,
*everyone* doesn't know what `monotona' means?''), partly a Francophone desire
to have a separate word for everything, and largely a failure to really Think
IN Esperanto, rather than translating.  In any case, the distinction in
Esperanto between affixes and root words has always been a thin one (Zamenhof
mentioned that you can do anything with an affix that you can do with a root),
and has been getting even thinner in recent years.  Combining by concatenation
is every bit as intrinsic to the language as the use of suffixes.

You asked about Ido and Esperanto.  While I have not looked at Ido in a number
of years, I recall that the main gripe of the Idists was not that Esperanto was
too European--indeed, one of their reforms was to discard Esperanto's rather
a-priori `correlative' system of relative pronouns (which works rather as if we
used `whus' instead of `how' for parallelism with `what/that, where/there') in
favor of a more latinate -- but unsystematic -- assortment of words.  If
anything, Idists tended towards a more Eurocentric (or Francocentric) view than
Esperantists did.  Ido's affix system, however, attempted to be more like
Loglan/Lojban.  They took the view that predicates did not have intrinsic parts
of speech; thus any conversion of meaning through the use of affixes should be
`reversible'.  Thus, if `marteli' is `to hammer', then `martelo' *must* mean an
act of hammering, not (as in Esperanto) `a hammer'; or, if `martelo' means `a
hammer', then `marteli' must mean `to be a hammer'.  One result of this is a
somewhat larger assortment of affixes than Esperanto possesses, (for example, a
suffix that would transform a noun root `martelo' to a root meaning `to
hammer') with rather subtle shades of distinction in some cases.  The result is
a language that is only slightly more logical than Esperanto, but
proportionally harder to learn, and no less Eurocentric.

Linguistic tinkerers like the Idists underestimated the organic quality of
Esperanto, or of any living language.  Indeed, one of the valuable aspects of
Lojban or Loglan, if either ever develops a substantial population of fluent
speakers, will be the chance to observe the extent to which the common usages
of the
language diverge from the prescriptive definitions.  Such effects will, I
think, be easier to isolate and analyze in a language that was created `from
whole cloth' than in an a-posteriori language like Esperanto.

        Mike

Feel free to add your comments.  I'm sure there is much more worth saying.


Much better.  There appears to be a garbled word 'neologism' in the
middle of Urban's message.  John, please correct this.

lojbab