[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [lojban] LALR1 question



On Fri, Aug 31, 2001 at 08:49:15PM -0400, Bob LeChevalier (lojbab) wrote:
> 
> These arguments still remain.  The technology apparently exists to do an 
> LR(4) version of the grammar, but we've baselined, and proving equivalence 
> is tougher than writing software.  We similar cannot formally prove that 
> the E-BNF is identical to the YACC grammar, which is why the latter takes 
> precedence over the former, if ever a conflict is found).
> 

I think I agree that LR(4) will deal with almost all the cases.
However, I still don't see how that would cope with rules like these
(from the EBNF) :

bridi-tail<50> = bridi-tail-1
        [gihek [stag] KE # bridi-tail /KEhE#/ tail-terms]
 
bridi-tail-1<51> = bridi-tail-2 [gihek # bridi-tail-2 tail-terms] ...

bridi-tail-2<52> = bridi-tail-3 [gihek [stag] BO # bridi-tail-2 tail-terms]

The problem comes when you're parsing bridi-tail-3 and encounter a gihek
(in the most general case, something like "na se gi'u nai").  After
scanning the gihek, you have to decide whether to reduce through 1, 2 or
3 rules.  This depends on whether you encounter "bo" or "ke" after an
optional simple tense (stag).  But unfortunately, a simple tense can
contain arbitrarily many tokens.  Hence, no finite value of 'k' can
bring the bo or ke within the lookahead window of an LR(k) parser in
this case.

I guess there's some clever trick being used to workaround this, but
I've not worked out what it is.  Can anyone enlighten me?

-- 
R.P.Curnow,Weston-super-Mare,UK  | C++: n., An octopus made by
http://www.rrbcurnow.freeuk.com/ |  nailing extra legs on a cat.