From marob!hombre!uunet!cbmvax!snark!lojbab Mon Dec 10 07:47:25 1990
Received: by magpie.MASA.COM (smail2.5) id AA17303; 10 Dec 90 07:47:23 EST (Mon)
Received: by marob.uucp (/\=-/\ Smail3.1.18.1 #18.1) id ; Mon, 10 Dec 90 07:36 EST
Received: by hombre.MASA.COM (smail2.5) id AA06222; 10 Dec 90 07:17:14 EST (Mon)
Received: from cbmvax.UUCP by uunet.UU.NET (5.61/1.14) with UUCP id AA24448; Sat, 1 Dec 90 22:43:35 -0500
Received: by cbmvax.cbm.commodore.com (5.57/UUCP-Project/Commodore Jan 13 1990) id AA27690; Sat, 1 Dec 90 22:35:18 EST
Received: by snark.thyrsus.com (/\=-/\ Smail3.1.18.1 #18.14) id ; Sat, 1 Dec 90 22:06 EST
Received: by snark.thyrsus.com (/\=-/\ Smail3.1.18.1 #18.14) id ; Sat, 24 Nov 90 03:15 EST
Message-Id:
Date: Sat, 24 Nov 90 03:15 EST
From: marob!uunet!cbmvax!snark.thyrsus.com!lojbab (Bob LeChevalier)
To: cowan@marob.masa.com, eubanks@uhunix.uhcc.hawaii.edu, jeannec@well.sf.ca.us
Subject: 2nd try at Prothero discussion re xmit
Status: RO

Enclosed are the three messages containing the essence of the discussion of the last month. Jeanne apparently got none of these, and Brian perhaps only some of them. John Cowan, who will normally handle requests for back messages, could not get these back since it had been too long, so I am sending him a copy in case the topic comes up, and to include in a file on Lojban and Esperanto we will be making available by ftp file server.

Date: Fri, 12 Oct 90 12:09:27 -0700
From: Jeff Prothero
To: lojban-list@snark.thyrsus.com
Subject: Book review (long)

I've been poking through the Linguistics section of the campus library, and found a book which might interest other Loglanists:

    Trends in Linguistics
    Studies and Monographs 42:
    Interlinguistics
    Aspects of the Science of Planned Languages
    Klaus Schubert (Ed.)
    Mouton de Gruyter 1989
    ISBN 3-11-011910-2

The book is 350 pages, in print, and costs $45 in Seattle.

"This book ...
is an invitation to all those interested in languages and linguistics to make themselves acquainted with some recent streams of scientific discussion in the field of planned languages."

The book is a collection of fifteen recent papers in interlinguistics. For folks who (like me) haven't been following the field, the bibliographies provide an up-to-date set of pointers into the literature, plus some overviews of it. I think the table of contents gives an adequate idea of the scope and focus of the book:

--------------------------------------------------------------------
Table of contents:

Part I: Introductions
    Andre Martinet: The proof of the pudding
    Klaus Schubert: Interlinguistics - its aims, its achievements, and its place in language science.

Part II: Planned Languages in Linguistics
    Aleksandr D Dulicenko: Ethnic language and planned language.
    Detlev Blanke: Planned languages - a survey of some of the main problems.
    Sergej N Kuznecov: Interlinguistics: a branch of applied linguistics?

Part III: Language Design and Language Change
    Dan Maxwell: Principles for constructing Planned Languages
    Francois Lo Jacomo: Optimization in language planning
    Claude Piron: A few notes on the evolution of Esperanto

Part IV: Sociolinguistics and Psycholinguistics
    Jonathan Pool - Bernard Grofman: Linguistic artificiality and cognitive competence
    Claude Piron: Who are the speakers of Esperanto
    Tazio Carlevaro: Planned auxiliary language and communicative competence.

Part V: The Language of Literature
    Manuel Halvelik: Planning nonstandard language
    Pierre Janton: If Shakespeare had written in Esperanto

Part VI: Grammar
    Probal Dasgupta: Degree words in Esperanto and categories in Universal Grammar
    Klaus Schubert: An unplanned development in planned languages.

Part VII: Terminology and Computational Lexicography
    Wera Blanke: Terminological standardization - its roots and fruits in planned languages
    Rudiger Eichholz: Terminics in the interethnic language
    Victor Sadler: Knowledge-driven terminography for machine translation

Index
--------------------------------------------------------------------

I'm not a linguist, and won't attempt to review the book from a linguistics point of view, but I will highlight some points of particular interest to Loglanists:

First, there is some mention of Loglan (and the thousand-odd other artificial language projects to date), but the bulk of the focus is on Esperanto, for the simple reason that 99.9% of fluent planned-language users speak Esperanto, and a similar percentage of the written-text corpus from the planned language community is in Esperanto. (Any Loglanists who cannot tolerate mention of That Language are invited to stop reading at this point. :-)

Second, I (and perhaps most Loglanists) was unaware of the Distributed Language Translation project, which seems to be of considerable potential interest to Loglanists. Quoting copyrighted material without permission:

"Distributed Language Translation is the name of a long-term research and development project carried out by the BSO software house in Utrecht with funding from the Netherlands Ministry of Economic Affairs. For the present seven year period (1985-1991) it has a budget of 17 million guilders... Although much larger in size than earlier attempts, DLT started off as just another project of the second stage, using Esperanto as its intermediate language. Esperanto had been judged suitable for this purpose because of its highly regular syntax and morphology and because its agglutinative nature promised an especially efficient possibility of morpheme-based coding of messages for network transmission.
During the course of the first years of the large-scale practical development, however, the role of Esperanto in the DLT system increased substantially. The intermediate language took over more and more processes originally designed to be carried out either in the source or in the target languages of the multilingual system. When I consider the DLT system to be one step more highly developed than the earlier implementations involving Esperanto, it is because the increase in the role of Esperanto was due to intrinsic qualities of Esperanto as a planned language. In other words, Esperanto is in DLT no longer treated as any other language (which incidentally has a somewhat more computer-friendly grammar than other languages), but it is now used in DLT for a large part of the overall translation process _because of its special features as a planned language_. Some facets of this complex application are discussed by Sadler (in this volume.)

"The functions fulfilled in DLT by means of Esperanto are numerous. Generally speaking one can say that since the insight about the usefulness of a planned language's particular features for natural-language processing, the whole DLT system design has tended to move into the Esperanto part of the system all functions that are not specific for particular source or target languages. These are all semantic and pragmatic processes of meaning disambiguation, word choice, detection of semantic deixis and reference relations, etc. So-called knowledge of the world has been stored in a lexical knowledge bank and is consulted by a word expert system. All these applications of Artificial Intelligence are in DLT carried out entirely in Esperanto.
Let it be said explicitly: Esperanto does not serve as a programming language (DLT is implemented in Prolog and C), but as a human language which renders the full content of the source text being translated with all its nuances, disambiguates it and conveys it to the second translation step to a target language."

Obviously, the existence of significant amounts of fully disambiguated, machine-processable Esperanto text in such a translation system opens up the possibility of wholesale mechanical translation into Loglan. This would be, obviously, particularly easy if the (currently poorly-defined) semantics of the Loglan affix system were brought into line with the existing semantics of the Esperanto affix system. In this case, bidirectional mechanical translation between the two languages might become quite easy, possibly producing sort of an "instant literature" for the Loglanist.

Building a simple correspondence between Esperanto and Loglan affixes is not as far-fetched an idea as it might first seem. Esperanto, like Loglan, uses a single root-stock of affixes which may be arbitrarily concatenated to form compound words. Where Loglan assigns *two* forms to (most) concepts, a pred and an affix, Esperanto uniformly assigns only a single affix (cutting the learning load in half!), but this poses no particular intertranslation problem. Loglan affixes are designed to be uniquely resolvable, and Esperanto affixes are not, but this problem has evidently already been solved, hence again poses no particular problem to bidirectional translation. Again, Loglan has a (putatively) unambiguous grammar which Esperanto lacks, but this problem has apparently already been satisfactorily resolved at the Esperanto end.

----------------------------------------------------------------

Elsewhere on the affix front, Loglan has a set of affixes, but has barely begun the enormous task of building the compound-word vocabulary. Loglan could learn from Esperanto on (at least) two levels.
Most obviously, bringing the Loglan affix system into semantic correspondence with the Esperanto affix system would open the door to wholesale borrowing of Esperanto compound metaphors, capitalising on the planned language community's multimegamanyear investment. Unless there are sound engineering concerns to the contrary (I see none), there seems no reason to idly re-invent a wheel of this magnitude. This ain't a DOD project, folks :-) There will be language bigots on both sides opposed in principle to any cooperation, of course...

Less obviously, Loglan may be able to benefit from the design knowledge gained from a century's experience with, and linguistic study of, the Esperanto affix system. Klaus Schubert's paper "An unplanned development in planned languages: A study of word grammar" is suggestive. Zamenhof, like Jim Brown, paid no particular attention to word formation in his original design, simply providing a uniform stock of primitives which could be concatenated at will to create new words. Despite this lack of conscious planning, linguistic study of word formation in Esperanto (started by Rene de Saussure -- not to be confused with Ferdinand de Saussure -- and continued by Sergej Kuznecov and others) shows that this simple *syntactic* combination rule has supported the development of a systematic set of *semantic* combination rules. These (unwritten and unconscious but nevertheless universal) semantic combination rules allow the Esperantist, when faced with an unfamiliar compound word, to not only decompose it into (usually) familiar primitives, but also to somewhat systematically deduce the meaning of the word. Recent decades have apparently seen increasingly free use of these facilities.

I won't attempt a summary of these semantic rules here, but will try to give the flavor.
Even though the primitive stock *syntactically* forms a single neutral pool, it appears that prims are *semantically* treated in word combination by Esperantists as being divided into noun, verb and modifier (combined adverb/adjective) classes, which combine with distinctively different rules. This distinction provides one dimension for sorting prims. A second, orthogonal dimension sorts prims into the categories independent morpheme, declension morpheme, ending (these first three correspond roughly to Loglan's "little words"), affixoid, affix and root (these final three correspond to the Loglan affix set). These affix types combine according to a word-compounding grammar which allows the listener to distinguish (among other things) those compounds whose meaning is directly deducible from the meaning of the component prims, from those compounds whose meaning is metaphorical and must be learned.

If Loglan were to borrow the Esperanto compound vocabulary wholesale, it would of course, willy nilly, inherit these semantic regularities as well. Otherwise, it might be well to study these regularities and consciously incorporate them in the Loglan vocabulary.

-- Jeff
jsp@milton.u.washington.edu

The first response, from lojbab:

A response to issues raised in Jeff Prothero's book review of a book on Interlinguistics, dated 12 Oct 1990. (Contact uunet!milton.u.washington.edu!jsp if you didn't get and want this review.)

1. Of the authors, Detlev Blanke is on our mailing list, but probably too recently to have based anything he wrote on our material.

2. Jeff's description of the Netherlands translation project is good; we were certainly aware of it. Unfortunately, all descriptions of it were too short and copyrighted, so I have nothing I've been able to include in JL with any authoritative information. I'll try to put something together for next issue.

3. The Netherlands project is based on Esperanto - but with a caveat.
It uses a formalized 'written' Esperanto form that may be slightly different from spoken forms, but most importantly has disambiguating information encoded in the way the language is written. For example grouping of modifiers (our 'pretty little girls school' problem) is solved by using extra SPACES to disambiguate which terms modify which.

4. Esperanto's affix system is similarly ambiguous, though not as bad as 1975 Loglan was. I've been given a few examples. Some handy ones are 'romano' which is either a novel (root + no affix) or Roman (root Romo = Rome plus affix -an-) and 'banano' which is either 'banana' or 'bather' (from 'bano' = bath + -an- again). I've been told there are others. This type of ambiguity presents no problem to a machine translator, which can store hyphens to separate affixes etc.

5. I have not investigated Esperanto's affix system thoroughly, but it is not compatible with Lojban's. (We did ensure at one point that we had gismu, and therefore rafsi corresponding to each of the affixes, though.) Simply put, Lojban has rafsi for EACH of its gismu. Esperanto has only a couple of dozen affixes, and a MUCH larger root set. Some Esperanto affixes have several Lojban equivalents. For example, we now have "na'e", "no'e" and "to'e" for scalar negation of various sorts to correspond to Esperanto's "mal-". Note that Jeff did not mention the large root set in his comments. Most of these roots are combined by concatenation, like German. But apparently as often as not a new root is coined rather than built by concatenation, since Esperanto has no stigma attached to borrowing. But it is not true that Lojban has two forms while Esperanto only has one.

6. The Esperanto affix/semantic system is probably even more poorly defined than Lojban's. As Jeff said, it is largely intuitive; this means independent of a rule system. However, there are rules; this was mentioned a few times in the recent JL debates between Don Harlow, Athelstan and myself.
A guy named Kalocsay apparently wrote up the rules early in this century; they are some 40-50 pages long and most Esperantists never read them, much less learn them. They also are apparently rather freely violated in actual usage; they were descriptive of the known language, not prescriptive.

By the way, I suspect that Lojban's compounding semantics is actually better-defined than it seems. I just don't know enough about semantic theory to attempt to write it up. Jim Carter wrote a paper several years ago, which we can probably offer for distribution (or he can), on the semantics of compound place structures. We haven't adopted what he has said whole-hog, but it certainly has been influential.

7. We will probably make extensive use of Esperanto dictionaries when we start our buildup of the Lojban lujvo vocabulary. We thus will not reinvent the wheel in totality. BUT, we cannot do this freely for a large number of reasons.

a) our root set is different than theirs. Some of their compounds will thus not work. The same is true of old Loglan words. We've been held up on translating Jim Carter's Akira story (the one he uses in all his guaspi examples) from old Loglan to Lojban by this need to retranslate all the compounds (which he used extensively and in ways inconsistent with our current, better defined semantics).

b) as mentioned above, our affixes are not in 1-to-1 correspondence.

c) their compounds undoubtedly have a strong European bias. I doubt if it is as bad as Jim Brown's (who built the compound for 'to man a ship' from the metaphor 'man-do'; i.e. 'to do as a man to'. He also did 'kill' as 'dead-make' where 'make' is the concept 'to make ... from materials ...' Sounds more like Frankenstein to me, folks.) But I suspect Esperanto has a few zingers in there. Indeed, I understand the Ido people criticized Esperanto most significantly for its illogical word building, though I don't have details. Perhaps Bruce Gilson (new to the list) could explain with examples???
And the Esperantists among us will almost certainly have counters (Oboy Oboy!!! A lively discussion! Let's not get violent though.) I also intend to draw heavily from Chinese, which has a more Lojbanic tanru 'metaphor' system since it doesn't distinguish between nouns, verbs, and adjectives. Esperanto tries to get around this by allowing relatively free conversion between these categories, but the root concepts are taken from European languages that more rigidly categorize words, and their compounds probably reflect European semantics.

d) Most importantly, Esperanto words are not gismu. They do not have place structures. Lojban words do, and the affix semantics and compound semantics must be consistent with those place structures. We've covered this in previous discussions in the guise of warning against 'figurative' metaphors that are inconsistent with the place structures.

e) Nope. Most important is another reason. Lojban is its own language. It should not be an encoded Esperanto any more than it should be an encoded English. I suspect that just like English words, Esperanto words sometimes have diverse multiple context-dependent meanings (though again perhaps less severely than English). We want to minimize this occurrence in Lojban if not prevent it (we may not succeed, but we can try - the rule that every word created must have a place structure is a good start.)

The bottom line is that each Esperanto word must be checked for validity, just like any other lujvo proposal, but must also be translated into its closest equivalent Lojban tanru as well, and have a place structure written, etc. The bulk of dictionary writing is this other work. I can and have made new tanru/lujvo (without working out the place structures) at the rate of several per MINUTE for related concepts. Coranth D'Gryphon posted a couple hundred proposals to this list last December (that no one commented on), which he made based on English definitions.
We have perhaps 200 PAGES of word proposals to go through. Nearly all of these have no place structures defined or are defined haphazardly.

Lojban also has a multi-man-year investment behind it, though not 'mega'. No, Jeff, we aren't a DOD project, but in terms of people working on it and time spent, we've far exceeded many such projects. And word-building, whether for better or worse, has received the greatest portion of that effort, since that is all most people have felt competent to work on.

(Incidentally, the Netherlands project IS a government-sponsored project, if not defense-related.) If we had several million dollars, I think we'd be well along the way to a translator ourselves. Sheldon Linker has claimed that he could do a Lojban conversing program with heuristic 'understanding' a la HAL 9000 in 5 man-years. This is, in my mind, of comparable difficulty to a heuristic translation program. Any comments out there from those who know more than I do on this subject?

--lojbab

And following is a response from Mike Urban, a noted Esperantist as well as Lojbanist:

Date: Mon, 29 Oct 90 11:19:12 -0800 (PST)
From: Michael Urban
Subject: Re: response to J. Prothero book review and comments of 12 Oct 90

While I am a dyed-in-the-wool Esperantist, I agree that attempting to modify or extend Lojban in imitation of various features of Esperanto would be a mistake (I also lose patience with reformers who want to Lojbanify aspects of Esperanto).

Esperanto's `affix system is ambiguous' to the extent that the language itself is indeed lexically ambiguous. Not only `affixes' but roots themselves are combinable, and so it is possible to come up with endless puns like the `ban-ano' ones you mentioned (`literaturo' might be a tower of letters, i.e., a `litera turo'). Without the careful, but somewhat restrictive, phonological rules that Loglan or Lojban provides, this kind of collision is inevitable.
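The collisions described above can be reproduced mechanically. What follows is only a minimal Python sketch, not anything taken from the messages themselves: the tiny lexicon, the transition table and the function name `segmentations` are all assumptions made for illustration, and real Esperanto word formation is far richer. It enumerates every way of splitting a word into root, suffix and ending morphemes, and finds two readings each for `banano', `romano' and `literaturo'.

```python
from typing import Dict, List

# Toy lexicon: an assumption for illustration only, built from the
# examples quoted in this discussion (not a real Esperanto dictionary).
LEXICON: Dict[str, Dict[str, str]] = {
    "root": {
        "ban": "bath",
        "banan": "banana",
        "rom": "Rome",
        "roman": "novel",
        "liter": "letter",
        "literatur": "literature",
        "tur": "tower",
    },
    "suffix": {"an": "member/inhabitant"},
    "ending": {"o": "noun", "a": "adjective"},
}

# Hypothetical, deliberately simplified word grammar: which kind of
# morpheme may follow which. A word must start with a root and stop
# on an ending.
ALLOWED_NEXT = {
    "start": {"root"},
    "root": {"root", "suffix", "ending"},
    "suffix": {"suffix", "ending"},
    "ending": {"root"},
}


def segmentations(word: str, prev: str = "start") -> List[List[str]]:
    """Return every split of `word` into morphemes allowed by the toy grammar."""
    if not word:
        # The whole word is consumed: valid only if we stopped on an ending.
        return [[]] if prev == "ending" else []
    results: List[List[str]] = []
    for kind, table in LEXICON.items():
        if kind not in ALLOWED_NEXT[prev]:
            continue
        for morpheme in table:
            if word.startswith(morpheme):
                for rest in segmentations(word[len(morpheme):], kind):
                    results.append([morpheme] + rest)
    return results


if __name__ == "__main__":
    for w in ("banano", "romano", "literaturo"):
        # Print each reading in hyphenated form, e.g. ban-an-o.
        print(w, sorted("-".join(s) for s in segmentations(w)))
```

Storing the hyphenated form (for instance `ban-an-o` rather than `banano`) removes the ambiguity, which is the point lojbab makes above about a machine translator keeping hyphens between affixes.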
The borrowing of words in Esperanto (`neologisms') instead of using a compound form is a controversial topic. Claude Piron, in his recent book, `La Bona Lingvo', argues (quite convincingly, I think) that the tendency of *some* Esperantists to use neologisms, usually from French, English, or Greek, is partly based on pedanticism, partly based on Eurocentrism (``you mean, *everyone* doesn't know what `monotona' means?''), partly a Francophone desire to have a separate word for everything, and largely a failure to really Think IN Esperanto, rather than translating. In any case, the distinction in Esperanto between affixes and root words has always been a thin one (Zamenhof mentioned that you can do anything with an affix that you can do with a root), and has been getting even thinner in recent years. Combining by concatenation is every bit as intrinsic to the language as the use of suffixes.

You asked about Ido and Esperanto. While I have not looked at Ido in a number of years, I recall that the main gripe of the Idists was not that Esperanto was too European--indeed, one of their reforms was to discard Esperanto's rather a-priori `correlative' system of relative pronouns (which works rather as if we used `whus' instead of `how' for parallelism with `what/that, where/there') in favor of a more latinate -- but unsystematic -- assortment of words. If anything, Idists tended towards a more Eurocentric (or Francocentric) view than Esperantists did.

Ido's affix system, however, attempted to be more like Loglan/Lojban. They took the view that predicates did not have intrinsic parts of speech; thus any conversion of meaning through the use of affixes should be `reversible'. Thus, if `marteli' is `to hammer', then `martelo' *must* mean an act of hammering, not (as in Esperanto) `a hammer'; or, if `martelo' means `a hammer', then `marteli' must mean `to be a hammer'.
One result of this is a somewhat larger assortment of affixes than Esperanto possesses (for example, a suffix that would transform a noun root `martelo' to a root meaning `to hammer') with rather subtle shades of distinction in some cases. The result is a language that is only slightly more logical than Esperanto, but proportionally harder to learn, and no less Eurocentric.

Linguistic tinkerers like the Idists underestimated the organic quality of Esperanto, or of any living language. Indeed, one of the valuable aspects of Lojban or Loglan, if either ever develops a substantial population of fluent speakers, will be to observe the extent to which the common usages of the language diverge from the prescriptive definitions. Such effects will, I think, be easier to isolate and analyze in a language that was created `from whole cloth' than in an a-posteriori language like Esperanto.

Mike

Feel free to add your comments. I'm sure there is much more worth saying.

Much better. There appears to be a garbled word 'neologism' in the middle of Urban's message. John, please correct this.

lojbab