Date: Fri, 12 Mar 1993 16:44:05 +1000
Reply-To: Nick Nicholas
Sender: Lojban list
From: Nick Nicholas
Subject: TECH: AI Project Proposal
X-To: Lojban Mailing List
To: Erik Rauch

I am *not* asking any of you to do my homework for me! :) (In fact, I ask that you not.) BUT: I've decided to do a semester project on a Lojban-to-Prolog translator, and would ask for your comments on the following proposal. And if possible, make your comments soon: I want to have this proposal on my supervisor's desk first thing Monday morning (Sunday afternoon, US time; Sunday evening, UK time). I'd particularly welcome suggestions on what parts of Lojban grammar I should admit: I suspect the program presented here is too ambitious for the 60 hours of work I had in mind.

[cut here]

Project Proposal for 433-603: A Lojban-to-Prolog translator.

Lojban is an artificial language intended for human use, of the type exemplified by Esperanto and Interlingua. It is an offshoot of an earlier such language, Loglan, and shares with it the ostensible _raison d'etre_ of helping test the Sapir-Whorf hypothesis: the language deviates from natural human language in a known, predictable manner, and one observes whether this deviation has a noticeable effect on its speakers' thinking. The hypothesis and the merits of testing it in this manner are not held in universal esteem in the Lojban community; I, for one, am quite skeptical that such a project can or need be realised.

What is of interest here is that, in order to implement this 'deviation', the languages have been explicitly based on predicate logic. Predicates serve the role of verbs, predicates with determiners preposed serve the role of nouns, and predications serve as sentences. The immediate question that arises on considering such a language is: to what extent can a logic (not necessarily first-order) adequately express what human language can express? Again, this is a matter of contention within the community. I believe that, if Lojban acquires a true (second-language) speech community, its speakers will end up speaking a human language no matter what, and that in the conflict between logic and human language instinct, the latter will win. (This is complicated by the considerable risk that the language will end up as a code for English, with little autonomy from that language; this tendency has been resisted so far.)

Nonetheless, the language already provides a model for natural language, with considerable expressive power and with an affinity to "logical forms" through its predicate logic origins, which makes a Lojban-to-Prolog translator an appealing task. The translator, given a text in Lojban, would construct a Prolog database storing the information denoted in the text. The task is simplified in that an unambiguous context-free grammar exists for the language (implemented in YACC, with some imaginative use of error recovery, but retaining its LALR(1) nature).

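To give a rough idea of what "constructing a Prolog database" could mean in the simplest case, here is a minimal Python sketch that turns a flat sentence such as "mi prami da" into one Prolog fact. Everything in it is an assumption made for illustration: the tiny lexicon, the function name translate_bridi, and the shortcut of splitting on whitespace instead of using the YACC grammar. It also assumes one particular design choice, namely that an existential variable like "da" is stored as a fresh Skolem constant (sk1, sk2, ...), because a variable inside a stored Prolog fact would behave as universally quantified.

```python
import itertools

# Hypothetical mini-lexicon, for illustration only.
PREDICATES = {"prami": "loves", "nelci": "likes", "catra": "kills"}
CONSTANTS = {"mi": "i", "do": "you"}
EXISTENTIALS = {"da", "de", "di"}


def translate_bridi(text, fresh=None):
    """Turn a flat 'argument predicate argument ...' sentence into one Prolog fact."""
    fresh = fresh if fresh is not None else itertools.count(1)
    functor = None
    args = []
    skolems = {}  # one Skolem constant per existential variable in this sentence
    for word in text.split():
        if word in PREDICATES:
            if functor is not None:
                raise ValueError("only one predicate word is supported")
            functor = PREDICATES[word]
        elif word in CONSTANTS:
            args.append(CONSTANTS[word])
        elif word in EXISTENTIALS:
            if word not in skolems:
                skolems[word] = f"sk{next(fresh)}"
            args.append(skolems[word])
        else:
            raise ValueError(f"word not in the toy lexicon: {word}")
    if functor is None or not args:
        raise ValueError("need one predicate word and at least one argument")
    return f"{functor}({', '.join(args)})."


print(translate_bridi("mi prami da"))  # loves(i, sk1).
```

Passing a shared counter as the second argument keeps Skolem constants distinct across sentences; a real translator would of course work from the parse tree rather than from a word list.
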
But even though syntactical considerations have already been dealt with, most of the semantic issues complicating logic-programming-based knowledge representation of natural language remain in Lojban: higher-order predicates; metalinguistic comments and attitudinals; the ambiguous semantic relationship of head and modifier in word compounding; the representation of numbers, prepositional phrases, non-logical connectives, negation, tense and modality; the distinction between "the" and "a", partly echoed in the language's veridical and non-veridical determiners; the distinction between individual and collective plurals; subject-raising; relative clauses; and so forth. In effect, a Lojban-to-Prolog translator would be addressing many of the current issues in NLP knowledge representation, though it would be biased towards predicate logic in the way it does so. With the way Lojban grammar is structured, results should become available a short time into the project, without the work being distracted by parsing issues or syntactic ambiguity.

In order to keep the project manageable, a subset of the language will have to be considered; this is in line with the Lojban Canonicaliser proposed by John Cowan. The subset of Lojban I intend to work on is described as follows, in stages:

1. Simple predications with a known predicate, and with arguments without internal structure (proper names, logical variables). No quantification other than existential.
   eg. mi prami da --- There exists an X such that LOVES(i,X).

2. Non-veridical arguments (cf. English "the") based on predicates, with internal arguments.
   eg. mi catra le prami be le pulji --- KILLS(i,x) & LOVES(x,y) & POLICE(y): I kill the lover of the policeman.

3. Veridical arguments (cf. English "a") based on predicates, with internal arguments.
   eg. mi catra lo prami be lo pulji --- There exist X and Y such that: KILLS(i,X) & LOVES(X,Y) & POLICE(Y): I kill a lover of a policeman.

4. Resolution of logical connectives.
   eg. mi nelci do .e ko'a ---> mi nelci do .ije mi nelci ko'a --- LIKES(i,you) & LIKES(i,x1): I like you and him.

5. Restrictive and non-restrictive relative clauses.
   eg. mi nelci le prenu poi do xebni ke'a --- (There exists x such that HATES(you,x)) & LIKES(i,x) & PERSON(x): I like the person you hate.

6. Higher-order predicates.
   eg. lenu mi cadzu cu nandu --- DIFFICULT(event:WALKS(i)): My walking is difficult.

7. Prepositional phrases (other than tense and location).
   eg. mi naumau do nelci ko'a ---> mi zmadu do leni da nelci ko'a --- EXCEEDS(i,you,quantity:LIKES(X,x1)): I like him more than you.

8. Attitudinals.
   eg. mi .ui sidju do ---> mi sidju do .ije mi gleki mi va'o lenu mi sidju do --- HELP(i,you) & HAPPY(i,i) & CONTEXT(state:HAPPY(i,i),event:HELP(i,you)): I *smile* will help you; I am happy to help you.

9. Tense (including location), and prepositions of tense (including location). Also includes modality and event contours.
   eg. mi ba'o tavla ---> lenu mi tavla cu ba'o zei balvi zo'e --- AFTERMATH(event:TALK(i,_,_,_),_): I have spoken.

10. Non-logical connectors.
   eg. la gilbrt. joi la salivn. cu finti la mikadon. --- INVENT(X,mikado) & JOINT_MASS(X,gilbert,sullivan): G & S (as a joint unit) wrote The Mikado.

11. Masses and sets as arguments.
   eg. loi remna cu sipna: the mass of humans sleeps (even though it is not true at any given moment that For all X: HUMAN(X) => SLEEPS(X)).

12. Quantification (including numerical).
   eg. mu le ze mensi cu cucycau: five of the seven sisters are barefoot.

13. Negation: contradictory and scalar. Use of prenexes.
   eg. mi naku ro prenu cu prami: NOT(For all X:PERSON(X), LOVES(i,X))
   eg. mi ro prenu na prami: For all X:PERSON(X), NOT(LOVES(i,X))

Sections of Lojban grammar not anticipated to be included in the model:

1. The mathematical subgrammar of Lojban.
2. Any analysis of word compounds.
3. Metalinguistic comments.

The detail of coverage of some sections, particularly tense, will probably have to be curtailed due to time constraints.
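
The two negation glosses under stage 13 differ only in the scope of NOT, and the difference is easy to lose on the page. The following Python sketch evaluates both readings against a toy model; the model (three persons, two of whom "i" loves) and the function names are assumptions invented for this illustration, and the readings follow the glosses given above rather than any independent analysis of the Lojban.

```python
# Toy model; every name here is an illustrative assumption.
PERSONS = {"alis", "bob", "karl"}
LOVES = {("i", "alis"), ("i", "bob")}


def loves(lover, loved):
    """True if the pair is recorded in the toy LOVES relation."""
    return (lover, loved) in LOVES


def not_all_loved(lover):
    """mi naku ro prenu cu prami: NOT(For all X:PERSON(X), LOVES(i,X))"""
    return not all(loves(lover, p) for p in PERSONS)


def all_not_loved(lover):
    """mi ro prenu na prami: For all X:PERSON(X), NOT(LOVES(i,X))"""
    return all(not loves(lover, p) for p in PERSONS)


print(not_all_loved("i"))  # True: karl is a person "i" does not love
print(all_not_loved("i"))  # False: "i" does love alis and bob
```

On the Prolog side, one possible rendering (in a system that provides forall/2, such as SWI-Prolog) would be \+ forall(person(X), loves(i,X)) for the first reading and forall(person(X), \+ loves(i,X)) for the second, with the usual caveat that \+ is negation as failure rather than classical negation.
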
The project is anticipated to take at most 80 hours of work.

Enclosures: [Bits of JL, John Cowan's sketch of a Lojban Canonicaliser]

Momenton senpretende paseman mi retenis kaj  # [Victor Sadler, _Memkritiko_ 90]
kultis kvazaux                               & (NICK NICHOLAS. Melbourne.
senhorlogxan elizeon                         # Australia. IRC: nicxjo.
(Dume:                                       & nsn@munagin.ee.mu.oz.au .)