[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [lojban] LALR1 question
On Fri, Aug 31, 2001 at 08:49:15PM -0400, Bob LeChevalier (lojbab) wrote:
>
> These arguments still remain. The technology apparently exists to do an
> LR(4) version of the grammar, but we've baselined, and proving equivalence
> is tougher than writing software. We similar cannot formally prove that
> the E-BNF is identical to the YACC grammar, which is why the latter takes
> precedence over the former, if ever a conflict is found).
>
I think I agree that LR(4) will deal with almost all the cases.
However, I still don't see how that would cope with rules like these
(from the EBNF) :
bridi-tail<50> = bridi-tail-1
[gihek [stag] KE # bridi-tail /KEhE#/ tail-terms]
bridi-tail-1<51> = bridi-tail-2 [gihek # bridi-tail-2 tail-terms] ...
bridi-tail-2<52> = bridi-tail-3 [gihek [stag] BO # bridi-tail-2 tail-terms]
The problem comes when you're parsing bridi-tail-3 and encounter a gihek
(in the most general case, something like "na se gi'u nai"). After
scanning the gihek, you have to decide whether to reduce through 1, 2 or
3 rules. This depends on whether you encounter "bo" or "ke" after an
optional simple tense (stag). But unfortunately, a simple tense can
contain arbitrarily many tokens. Hence, no finite value of 'k' can
bring the bo or ke within the lookahead window of an LR(k) parser in
this case.
I guess there's some clever trick being used to workaround this, but
I've not worked out what it is. Can anyone enlighten me?
--
R.P.Curnow,Weston-super-Mare,UK | C++: n., An octopus made by
http://www.rrbcurnow.freeuk.com/ | nailing extra legs on a cat.