[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [lojban] LALR1 question



On Mon, 27 Aug 2001, And Rosta wrote:

> Take the notorious example:
>
> le broda joi le brode
> 1  2     3   4
>
> If "1-token lookahead" means that the decision about how to parse
> word n can be made on the basis of knowing what word n+1 is, then
> how come it won't work in this case? You get to word 3 & think
> "selbri linkage or sumti linkage?", so you take a look at word 4,
> see it's a sumti and so parse word 3 as sumti linkage.
>
> What actually happens, apparently, is that when you hit word 3
> you're not allowed to see word 4. Can someone explain this to me,
> please? That is, explain what lookahead is.

When the parser hits word 3, it is in the middle of building a selbri, and
is thinking only about a selbri (in this case). It has two choices, shift
or reduce. Shifting joi onto the stack is what is causing the current
problem, you get (le (broda joi)), then the parser looks at word 4, and
the contents of the stack it has built so far, and goes 'oh shit, i
shifted the wrong thing.', so it explodes.

The action that you want, in this case, is for the parser to reduce
(instead of shift), and turn the current stack contents (le (broda)) into
a sumti. Then it will reexamine joi and decide that it will be connecting
two sumti instead of two selbri. (However, YACC always shifts instead of
reducing if there is a conflict.)

Without back-tracking or more look ahead, an *LR parser can't do that,
though.

Hrm. A parser with back tracking or more look ahead (or basically any
parser which can parse a larger class of languages) could take the same
YACC grammar and be able to parse "le broda joi le brode" correctly... I
think...

(You seem to have the idea of lookahead right, in that you look at tokens
up to n+1 to decide what n is, you're just missing out on the way the
parsing represents the partially-parsed data.)

- Jay Kominek <jay.kominek@colorado.edu>
  Plus ça change, plus c'est la même chose