From cbmvax!uunet!cuvma.bitnet!LOJBAN Mon Mar  2 18:56:58 1992
Return-Path: <cbmvax!uunet!cuvma.bitnet!LOJBAN>
Received: by snark.thyrsus.com (/\==/\ Smail3.1.21.1 #21.19)
	id <m0lLMsL-0001u5C@snark.thyrsus.com>; Mon, 2 Mar 92 18:56 EST
Received: by cbmvax.cbm.commodore.com (5.57/UUCP-Project/Commodore 2/8/91)
	id AA29916; Mon, 2 Mar 92 14:53:44 EST
Received: from rutgers.edu by relay1.UU.NET with SMTP 
	(5.61/UUNET-internet-primary) id AA02141; Mon, 2 Mar 92 14:51:22 -0500
Received: from cbmvax.UUCP by rutgers.edu (5.59/SMI4.0/RU1.4/3.08) with UUCP 
	id AA12103; Mon, 2 Mar 92 13:22:43 EST
Received: by cbmvax.cbm.commodore.com (5.57/UUCP-Project/Commodore 2/8/91)
	id AA12754; Mon, 2 Mar 92 13:06:48 EST
Received: from CUVMB.COLUMBIA.EDU (via uunet.UU.NET) by relay2.UU.NET with SMTP 
	(5.61/UUNET-internet-primary) id AA09793; Mon, 2 Mar 92 12:38:40 -0500
Message-Id: <9203021738.AA09793@relay2.UU.NET>
Received: from CUVMB.COLUMBIA.EDU by CUVMB.COLUMBIA.EDU (IBM VM SMTP R1.2.1) with BSMTP id 1704; Mon, 02 Mar 92 12:37:10 EST
Received: by CUVMB (Mailer R2.07) id 5416; Mon, 02 Mar 92 12:34:11 EST
Date:         Mon, 2 Mar 1992 11:48:14 GMT
Reply-To: CJ FINE <cbmvax!uunet!cuvmb.cc.columbia.edu!C.J.Fine>
Sender: Lojban list <cbmvax!uunet!cuvmb.cc.columbia.edu!LOJBAN>
From: CJ FINE <cbmvax!uunet!cuvmb.cc.columbia.edu!C.J.Fine>
Subject:      Re: morphology
X-To:         fschulz@pyramid.com
X-Cc:         Lojban list <lojban@cuvmb.cc.columbia.edu>
To: John Cowan <cowan@snark.thyrsus.com>
In-Reply-To:  <9202292323.AA07581@pyrps5.eng.pyramid.com>; from
              "fschulz@com.pyramid" at Feb 29, 92 3:23 pm
Status: RO

Frank continues the discussion:
(Lojbab answered much of the following in a mail beginning "I will let
Colin answer most of your questions!")
>
>
> I will reply to lojbab and kolin with one reply since it looks
> like the message I received went to both.
>
> lojbab says my morphology is ambiguous and does not distinguish
> tanru and lujvo.
> I do not understand the distinction between tanru and lujvo.
> What are the differences? I thought lujvo were just compressed tanru.

They are, but the "compression" has as a requirement that lujvo stick
together as a single word. There are at least three reasons why the
distinction between
        fraso prenu     (tanru)
        and frasyprenu/fasprenu/frasypre/faspre (lujvo)
is important:
1) a tanru has (by definition) the place structure of its last term - a
        lujvo may have a different place structure
2) in a more complex tanru, a sequence of brivla may not even be a tanru
        (eg "carmi fraso prenu" parses as "[carmi fraso] prenu" and does
        not contain "fraso prenu" as a constituent at all. "carmi
        faspre" does, obviously, contain "faspre").
3) words like "ba'e" and "zo", which operate on a valsi (a single
        lojban word) pick up one brivla, whether it is a gismu or a
        lujvo. They will not pick up a tanru.
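
Point 2 can be illustrated with a small sketch. It assumes plain
left-grouping of a sequence of brivla, with no grouping cmavo involved;
the function names are my own and purely illustrative.

```python
from functools import reduce


def group_tanru(words):
    """Left-group a list of brivla: [a, b, c] -> ((a, b), c).

    Assumption: default grouping only, no grouping cmavo are handled.
    """
    if not words:
        raise ValueError("need at least one brivla")
    return reduce(lambda left, right: (left, right), words)


def constituents(tree):
    """Yield the tree itself and every sub-tree, down to single words."""
    yield tree
    if isinstance(tree, tuple):
        for child in tree:
            yield from constituents(child)


tanru = group_tanru(["carmi", "fraso", "prenu"])
with_lujvo = group_tanru(["carmi", "faspre"])
```

Here "tanru" comes out as (("carmi", "fraso"), "prenu"), so the pair
("fraso", "prenu") never appears among its constituents, whereas
"faspre" is a constituent of "with_lujvo".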

In Loglan before the GMR, gismu and lujvo were indistinguishable by
their form - this turned out to be unworkable, hence the changes.

> lojbab writes
>   la pier laplas. tadni lo cmaci
> In the written form is
>       la pier. laplas.
> necessary or is the pause assumed by default? Or is there an ambiguity
> here which is resolved by knowing the name? I see this as
>       Pierrelaplace
> Machine translation will need a special lookup here anyway, so this
> ambiguity might be harmless.

"pier" and "lyples" are valid Lojban cmene, and so is "pierlyples". It
is a matter of choice (the se cmene or the te cmene) which is used. Each
name must be followed by a pause in speech - in writing this is optional
(though recommended).
"laplas" is *not* a valid cmene as it contains the syllable "la" (twice). Under
the new not-yet-baselined proposal, however, "pierlaplas" will be valid,
as each "la" is preceded by a consonant.
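
A rough sketch of the two "la" rules just described, for comparison. It
checks only the "la" restriction and nothing else about cmene; the
function names are hypothetical, and the proposed rule is coded as I
have stated it above, not from any baselined text.

```python
CONSONANTS = set("bcdfgjklmnprstvxz")


def _normalise(name):
    # Lower-case and drop the pause dots sometimes written around a cmene.
    return name.lower().strip(".")


def la_positions(name):
    """Indices at which the letter pair "la" occurs in the name."""
    text = _normalise(name)
    return [i for i in range(len(text) - 1) if text[i:i + 2] == "la"]


def la_ok_current(name):
    """Current rule: the name may not contain "la" at all."""
    return not la_positions(name)


def la_ok_proposed(name):
    """Proposed rule (as described above): every "la" must be
    immediately preceded by a consonant."""
    text = _normalise(name)
    return all(i > 0 and text[i - 1] in CONSONANTS
               for i in la_positions(name))
```

With these, "pier" and "lyples" pass both checks, "laplas" fails both,
and "pierlaplas" fails the current check but passes the proposed one.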

>
> lojbab mentioned that other morphology structures have been proposed.
> If these are real simple I would like to see them. I would prefer
> to see a morphology which is simple to understand and has a serious
> flaw than one which is correct and impossible for me to understand.
> My idea is to look at several morphology types with different kinds
> of flaws to sneak up on the lojban morphology. Of course the flaw
> must be explicitly mentioned, so the flaw is not mistaken for an
> oversight.

The whole question of "simplicity" of morphology needs to be teased
apart. There are three separate parts to it:
1) How easy is it for a hearer/reader to parse the speech stream?
2) How robust is it - ie are errors likely to be recognised or to be
misparsed as something else?
3) How easy is it for the word-coiner to apply?

The first is much the most important - and on this score, Lojban
morphology is simple:
        Divide the speech-stream into pause groups
        If a group ends with a consonant, it ends with a cmene.
        Otherwise, if it contains a consonant cluster (possibly buffered
                with "y"), it contains a brivla
        Otherwise it's made up of cmavo.
Then three rules for finding the boundaries of these:
        A cmene starts after the last "la", "lai", "la'i" or "doi", or
                at the start of the pause group;
        A brivla starts at the first cluster if that is a permissible
                initial, or the previous consonant otherwise;
        A brivla ends after the next vowel after the stress.
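
The first three rules (classifying a pause group) can be sketched in a
few lines. This is only an illustration under stated assumptions: the
consonant inventory is the standard Lojban one, "buffered with y" is
read as consonant + "y" + consonant, and the function name is my own.
It does not attempt the boundary-finding rules.

```python
import re

CONSONANTS = "bcdfgjklmnprstvxz"
# Two consonants, optionally separated by a buffering "y" (my reading).
CLUSTER = re.compile(f"[{CONSONANTS}]y?[{CONSONANTS}]")


def classify_pause_group(group):
    """Say what kind of word a pause group must contain."""
    # Within a pause group the words run together, so drop spaces and dots.
    text = re.sub(r"[\s.]", "", group.lower())
    if not text:
        raise ValueError("empty pause group")
    if text[-1] in CONSONANTS:
        return "ends with a cmene"
    if CLUSTER.search(text):
        return "contains a brivla"
    return "cmavo only"
```

For example, "la pier" is classified as ending with a cmene, "fraso
prenu" and "frasyprenu" as containing a brivla, and "le mi do" as cmavo
only.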

The rules you need as coiner are more complicated, it is true. And the
problem of robustness is why I strongly favour the limitation on the
structure of le'avla that was proposed a while ago (and is not yet
official, I think?).

>
> When I finally understand the morphology I would like
> to verify the lojban morphology is not ambiguous using formal
> verification techniques and write
> computer code to test words for morphological correctness.
> This would be the morphology analog of a spelling checker.
> I suspect this would turn up a lot of errors. This assumes
> that this has not yet been done.

I believe it has, but I am not sure.

>
> kolin mentions that my use of the term gismu in my toy morphology
> description is not standard. This is correct. Is there some cmavo or
> gismu that means metaphorical? This should have been prefixed to
> gismu. I did not want to coin new vocabulary so I used the words
> incorrectly. What I intended was something that had the functional
> properties and behavior of a gismu. Does a generalizer cmavo exist?
> This would express my intention better.

"pe'a"/"po'a" marks a metaphorical expression (only a tanru, I think,
but I am not sure of the grammar).
But I don't think that was my point - lojban has the following
classification of valsi:

        cmene
        brivla
        cmavo
At the syntactic level, these are the only categories. *derivationally*
(and, by design, morphologically), we can then subdivide brivla into
        gismu
        lujvo
        le'avla
but syntactically they are identical.


                kolin
                        c.j.fine@bradford.ac.uk