Re: [jbovlaste] berbere, berberi



Jorge Llambías scripsit:

> Since we don't need to detect LALR-n-ambiguity anyway, why would
> this limitation of a PEG make it not good enough to parse the Lojban
> morphology?

Let me use a greatly oversimplified example.  Suppose we are writing a
morphology program to parse a word into a sequence of morphemes.
We define a morpheme as having the form V, CV, or CVn, where V and C
are any vowel and any consonant respectively.  If C does not include n,
this grammar is obviously unambiguous, as there is only one way to parse
any valid word into a sequence of morphemes.  If C does include n, this
grammar is obviously ambiguous: we do not know if "jana" parses as "jan a"
or "ja na".
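
To make the ambiguity concrete, here is a minimal Python sketch (not part
of any Lojban tool) that enumerates every way of splitting a word into
morphemes of the form V, CV, or CVn.  The names VOWELS, morpheme_lengths,
and all_parses are invented for this illustration, and the consonant
inventory is passed in so that both cases can be compared.

```python
VOWELS = set("aeiou")


def morpheme_lengths(word, consonants, i):
    """Yield the length of every morpheme (V, CV, or CVn) starting at index i."""
    if i < len(word) and word[i] in VOWELS:
        yield 1  # V
    if i + 1 < len(word) and word[i] in consonants and word[i + 1] in VOWELS:
        yield 2  # CV
        if i + 2 < len(word) and word[i + 2] == "n":
            yield 3  # CVn


def all_parses(word, consonants, i=0):
    """Return every way to split word[i:] into a sequence of morphemes."""
    if i == len(word):
        return [[]]  # exactly one way to parse the empty remainder
    parses = []
    for n in morpheme_lengths(word, consonants, i):
        for rest in all_parses(word, consonants, i + n):
            parses.append([word[i:i + n]] + rest)
    return parses


# C includes n: two analyses of the same word.
print(all_parses("jana", set("jklmn")))
# C excludes n: only one analysis survives.
print(all_parses("jana", set("jklm")))
```

A word is ambiguous under this toy grammar exactly when all_parses
returns more than one list for it.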

Now if we write a Yacc grammar for the latter case, like this:

C : 'j' | 'k' | 'l' | 'm' | 'n';
V : 'a' | 'e' | 'i' | 'o' | 'u';
morpheme: V | C V | C V 'n';
word : morpheme | word morpheme;

Yacc will report a shift/reduce conflict.  This reflects
the fact that the grammar is ambiguous, and therefore unsuited for a
Lojban-style language.

But if we write a PEG grammar, we will not get a complaint: the result
depends entirely on whether the morpheme rule is written as
C V 'n' / C V / V (which will prefer the parse "jan a") or as
C V / C V 'n' / V (which will prefer the parse "ja na").  It is in
this sense that a PEG grammar is unsuitable
for Lojban: precisely because the PEG grammar settles all ambiguities in
advance, we cannot be sure that the text has only one possible analysis.
The only way to be sure is to put each alternation rule in the PEG into
every possible order, and make sure that all texts parse the same way
with all the variants.
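
The following Python sketch imitates PEG ordered choice for this toy
morpheme rule and then applies the brute-force check just described:
it tries every ordering of the alternatives and collects the distinct
results.  It is a hand-written illustration, not output from a real PEG
tool; the names peg_parse and parses_under_all_orders, and the encoding
of alternatives as strings such as "CVn", are assumptions made for this
example.

```python
from itertools import permutations

VOWELS = "aeiou"
CONSONANTS = "jklmn"  # the ambiguous case: C includes n


def symbol_matches(symbol, ch):
    """'C' and 'V' are character classes; any other symbol is a literal."""
    if symbol == "C":
        return ch in CONSONANTS
    if symbol == "V":
        return ch in VOWELS
    return ch == symbol


def match_seq(pattern, word, i):
    """Return the index just past pattern matched at word[i:], or None."""
    for symbol in pattern:
        if i >= len(word) or not symbol_matches(symbol, word[i]):
            return None
        i += 1
    return i


def peg_parse(word, alternatives):
    """Parse word as morpheme+ where morpheme is an ordered choice.

    The first alternative that matches is committed to; a later failure
    never causes an earlier morpheme to be reconsidered.
    """
    i, morphemes = 0, []
    while i < len(word):
        for pattern in alternatives:
            end = match_seq(pattern, word, i)
            if end is not None:
                morphemes.append(word[i:end])
                i = end
                break
        else:
            return None  # no alternative matched at position i
    return morphemes


def parses_under_all_orders(word, alternatives=("CVn", "CV", "V")):
    """Collect the distinct results over every ordering of the choice."""
    results = set()
    for order in permutations(alternatives):
        parse = peg_parse(word, order)
        results.add(None if parse is None else tuple(parse))
    return results


print(peg_parse("jana", ["CVn", "CV", "V"]))  # prefers "jan a"
print(peg_parse("jana", ["CV", "CVn", "V"]))  # prefers "ja na"
print(parses_under_all_orders("jana"))        # more than one result
```

If parses_under_all_orders returns more than one element for some word,
the orderings disagree on that word, and the single answer given by any
one PEG was concealing an ambiguity.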

-- 
John Cowan            http://www.ccil.org/~cowan     cowan@ccil.org
Uneasy lies the head that wears the Editor's hat! --Eddie Foirbeis Climo