From cbmvax!uunet!CUVMA.BITNET!LOJBAN Mon Mar 2 11:20:41 1992
Return-Path:
Received: by snark.thyrsus.com (/\==/\ Smail3.1.21.1 #21.19)
        id ; Mon, 2 Mar 92 11:20 EST
Received: by cbmvax.cbm.commodore.com (5.57/UUCP-Project/Commodore 2/8/91)
        id AA26626; Mon, 2 Mar 92 02:45:27 EST
Received: from cunixf.cc.columbia.edu by relay1.UU.NET with SMTP
        (5.61/UUNET-internet-primary) id AA01761; Mon, 2 Mar 92 01:29:25 -0500
Received: from cuvmb.cc.columbia.edu by cunixf.cc.columbia.edu (5.59/FCB)
        id AA13311; Mon, 2 Mar 92 01:29:25 EST
Message-Id: <9203020629.AA13311@cunixf.cc.columbia.edu>
Received: from CUVMB.COLUMBIA.EDU by CUVMB.COLUMBIA.EDU (IBM VM SMTP R1.2.1)
        with BSMTP id 0791; Mon, 02 Mar 92 01:27:47 EST
Received: by CUVMB (Mailer R2.07) id 2652; Mon, 02 Mar 92 01:27:24 EST
Date: Sun, 1 Mar 1992 16:19:18 PST
Reply-To: cbmvax!uunet!pyramid.com!cuvmb.cc.columbia.edu!fschulz
Sender: Lojban list
From: cbmvax!uunet!PYRAMID.COM!cuvmb.cc.columbia.edu!fschulz
Subject: lojbab morphology reply
X-To: lojban list
To: John Cowan , Eric Raymond , Eric Tiedemann
Status: RO

In a previous post I described a toy Lojban-like morphology using a
regular expression grammar. I now suspect that not even an LR(1) grammar
is powerful enough to describe the Lojban morphology, so that approach
is not going to work.

lojbab answered some of the questions that arose during my efforts to
understand the Lojban morphology. I still do not follow all the details.
I found his e-mail very informative, and others may benefit from a
posting.

lojbab's email follows:

>From uunet!grebyn!lojbab Sun Mar 1 02:43:43 1992
>From: lojbab@grebyn.com (Logical Language Group)
>To: c.j.fine@bradford.ac.uk, fschulz@pyramid.com
>Subject: morphology

I'm going to let Colin answer most of your questions, since I'm falling
behind in my mail due to my added commitments supporting Athelstan's
recovery, as well as my other Lojban work. These are the type of
questions that should be asked on the general list. And answered there.
If one person doesn't understand the morphology, then probably many
don't.

In talking about other morphology proposals, I was mentioning them
because they had been looked at as CONTRASTS to Lojban, not as serious
designs (except sometimes by the proposer). I don't believe any offer
insights into the Lojban morphology - they are just other ways to design
an unambiguous morphology. Using JCB's morphology was the only thing we
considered, because we were not trying to invent a new conlang, but to
reinvent Loglan. And the Loglan morphology is a distinctive feature of
JCB's design.

[ in previous mail lojbab mentions other morphologies have been discussed ]

But for your curiosity:

1) end all words with a particular vowel found nowhere else. You can
   then define the innards of the words by any distinctive method
   without ambiguity in word breakup.

2) Rex May proposed what JCB calls "Rexlan", which if I recall refuses
   to make a name/brivla distinction, or a cmavo/brivla distinction.
   All words are some number of C followed by some number of V followed
   by l/m/n/r. The simplest of these are the cmavo.

The Lojban morphology is not that hard. It is just that the document
that you read was trying to define the morphology rigorously, including
all the crevasses that result when rules run into each other, or the
human vocal tract. As taught in the textbook, this will become much
simpler.

You have cmavo down fine. gismu are CVCCV or CCVCV. They may be
grammatically metaphorized into tanru, a very ambiguous process, as
discussed in textbook lesson 4. They may also be compounded into lujvo,
which like gismu have a single meaning/place structure that may be
somewhat different from that of the parallel tanru (which always has the
place structure of the final term of the tanru). The canonical lujvo
structure is to replace the final V of each non-final gismu with 'y' and
string them together. The result, unlike with tanru, is a single word.
CVCCyCVCCV is an example.
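
[ a minimal sketch, not from lojbab's mail, of the canonical lujvo
construction just described: check that each word has the letter shape
CVCCV or CCVCV, then replace the final vowel of every non-final gismu
with 'y' and concatenate. The function names are made up for
illustration, the sample words sutra and bajra are just convenient
inputs, and only the C/V letter shape is checked, not the finer
restrictions on which consonants may cluster. ]

```python
VOWELS = set("aeiou")


def shape(word):
    """Map each letter to 'C' or 'V'; 'y' and the apostrophe pass through."""
    out = []
    for ch in word.lower():
        if ch in VOWELS:
            out.append("V")
        elif ch in "y'":
            out.append(ch)
        else:
            out.append("C")
    return "".join(out)


def is_gismu_shape(word):
    """True if the word has one of the two gismu letter shapes.

    Assumption: this checks the C/V pattern only.
    """
    return shape(word) in ("CVCCV", "CCVCV")


def canonical_lujvo(gismu_list):
    """Build the long-form lujvo: final V of each non-final gismu -> 'y'."""
    if len(gismu_list) < 2:
        raise ValueError("a lujvo needs at least two gismu")
    for g in gismu_list:
        if not is_gismu_shape(g):
            raise ValueError(f"not gismu-shaped: {g!r}")
    parts = [g[:-1] + "y" for g in gismu_list[:-1]]
    parts.append(gismu_list[-1])
    return "".join(parts)


if __name__ == "__main__":
    word = canonical_lujvo(["sutra", "bajra"])
    print(word, shape(word))
```

With sutra and bajra as inputs this gives sutrybajra, whose letter shape
is the CVCCyCVCCV pattern mentioned above.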
But this means that all 2-termers would be 4 syllables long, and there
is an unambiguous way to shorten these. We assign shorter rafsi in place
of the 5-letter forms of the gismu. These combining forms would not work
on their own, but combined with each other, they form words that do not
overlap gismu space or cmavo space. Here though, the human vocal tract
and hearing make things complicated, because we can't recognize such
things as a "cc" cluster. So we introduce very specific rules for when a
'y' is required, and when it can be left out to make the word as short
as possible. This makes the morphology 'harder', but easier to speak.

Even with gismu and lujvo, there are two possible areas of morphological
space that are unused. One, the consonant-final words, is reserved for
names. The pause need not be marked with the period, though such is
standard, but it must be there after the final consonant, and before an
initial vowel. The la/lai/doi exceptions are made not because their
presence makes the result not a name, but because the result might lead
to an ambiguous lexicalization of part of the name before the "la".

The other chunk of morphological space is for words that end in a vowel,
meet all morphological requirements of Lojban words, but CANNOT be
broken down into rafsi (words that can be so broken down being lujvo).
This irregular space, basically defined by what is NOT in it, covers the
le'avla.

And like I said, all the rest is just special cases, examples, and
commentary on the implications of all this.

Your turn, Colin?

lojbab

--
Frank Schulz ( fschulz@pyramid.com )