
[lojban] Re: Official parser problem?



On Wed, Mar 17, 2004 at 10:13:45PM -0500, John Cowan wrote:
> Robin Lee Powell scripsit:
> 
> > John is saying one of two things:
> >
> > 1.  You are wrong in your reading of the grammar.  That sentence
> > definitely fails because [explanation].
> 
> That's the idea.  A parser that gets this wrong is wrong, as the
> language is currently specified.

Granted.

> > 2.  You are wrong in your reading of the grammar.  That sentence
> > definitely fails because [explanation].  This is a good thing.
> 
> I hadn't gotten that far.  I agree that it's damned unintuitive --
> people internalize the grammar of "si" based on words, not tokens. I
> don't know if you were there when I kept trying to quote something
> with zo, and kept saying zoi -- this is a disaster in the current
> language.

<shudder>

> > It may be that John was just describing the current reality, and not
> > assigning a value judgement at all, in which case I hope he will
> > accept my apology for freaking out.
> 
> No problem.
> 
> The current situation is at least consistent even if stupid.  I'd be
> open to other ideas that would be less so.

Oh *believe* me, I'm working on it.

<geeking out>

The grammar that I'm currently working on is *not* a CFG.  It's called a
PEG (for Parsing [1] Expression Grammar).  It is fully formalized (i.e.
formalized enough to support mathematical proofs about it).  It is
unambiguous (i.e. the way the system is defined, there can only ever be
one parse of any string).  It has been proven to be at least as powerful
as LR(k) or LL(k), for any k up to infinity.  It can encode at least
some context-sensitive languages.  It can without question do Lojban (I
know this because I've been doing test runs all day).  It has infinite
look-ahead and look-behind.

In linear time.

Memory usage is a bit bad, but whatever.
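
To make the claims above a bit more concrete, here is a minimal sketch
(in Python, which is an assumption of this example and not the language
of the actual Lojban grammar work) of a memoizing PEG recognizer for the
classic non-context-free language a^n b^n c^n.  The grammar is the
standard textbook one, not anything from the Lojban grammar; the
function names are made up for illustration.  The cache is what buys the
linear time, and it is also where the memory goes.

```python
from functools import lru_cache


def matches_anbncn(text):
    """Recognize a^n b^n c^n (n >= 1) with this PEG:

        S <- &(A 'c') 'a'+ B !.
        A <- 'a' A? 'b'
        B <- 'b' B? 'c'

    Each rule takes a position and returns the position after the match,
    or None on failure.
    """

    def lit(ch, pos):
        # Match one literal character at pos.
        if pos is not None and pos < len(text) and text[pos] == ch:
            return pos + 1
        return None

    @lru_cache(maxsize=None)  # memoize per (rule, position): packrat style
    def rule_a(pos):
        p = lit("a", pos)
        if p is None:
            return None
        inner = rule_a(p)  # A? is greedy and never reconsidered
        if inner is not None:
            p = inner
        return lit("b", p)

    @lru_cache(maxsize=None)
    def rule_b(pos):
        p = lit("b", pos)
        if p is None:
            return None
        inner = rule_b(p)
        if inner is not None:
            p = inner
        return lit("c", p)

    # &(A 'c'): look-ahead only, consumes nothing.
    if lit("c", rule_a(0)) is None:
        return False

    # 'a'+
    p = lit("a", 0)
    if p is None:
        return False
    while lit("a", p) is not None:
        p = lit("a", p)

    # B, then !. (nothing may follow)
    p = rule_b(p)
    return p is not None and p == len(text)
```

The look-ahead predicate checks that the a's and b's balance without
consuming them, and then B checks that the b's and c's balance.  No CFG
can do both at once.  Note that this sketch recurses once per nesting
level, so very long inputs would hit Python's recursion limit.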

My goal is to produce a completely formalized grammar for Lojban
(*including* the morphology) that has no pre-processor at all[2] and
encodes something close enough to the current grammar as to invalidate
no existing usage.

Because no pre-processor is involved, certain things will inherently[3]
end up different, like the interaction of si and lo'u...le'u, but I'm
certain I can produce something that will be *better*.

Then I'll see about taking it to the BPFK, once I've had a few of the
smarter people here look over it.  I'm being very careful to document
everything I do.

-Robin

[1]: bo
[2]: Actually, without an extension to PEGs, 'zoi' cannot be handled
without a pre-processor, and without a re-definition that is at least
marginally sane, 'sa' doesn't even have a working definition to try to
handle.
[3]: OK, well, I *could* replicate current behaviour, but as current
behaviour is based on the technical limitations of YACC and does not,
for example, resemble what is described in the book, why would I do
that?
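
A rough illustration of the 'zoi' problem from [2], as a sketch only:
the closing delimiter of a zoi quote is whatever word followed 'zoi', so
the parser has to remember matched text and compare against it later,
which a fixed set of PEG rules has no way to express for an open-ended
set of delimiter words.  The code below is Python, works on
whitespace-separated tokens, and ignores pauses and morphology entirely;
the function name and the simplified tokenization are assumptions made
up for this example.

```python
def split_zoi_quotes(tokens):
    """Return (delimiter, quoted_tokens) pairs for each 'zoi' quote.

    Simplified model: 'zoi' is followed by an arbitrary delimiter word,
    and the quote runs until that same word appears again.
    """
    quotes = []
    i = 0
    while i < len(tokens):
        if tokens[i] == "zoi" and i + 1 < len(tokens):
            delimiter = tokens[i + 1]  # chosen by the speaker, not the grammar
            try:
                # The "grammar" for the close depends on earlier input.
                end = tokens.index(delimiter, i + 2)
            except ValueError:
                raise ValueError(
                    f"unterminated zoi quote opened with {delimiter!r}"
                )
            quotes.append((delimiter, tokens[i + 2:end]))
            i = end + 1
        else:
            i += 1
    return quotes
```

For example, split_zoi_quotes("mi cusku zoi gy hello world gy".split())
yields one quote with delimiter "gy" and contents ["hello", "world"].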

</geeking out>


-- 
Me: http://www.digitalkingdom.org/~rlpowell/  ***   I'm a *male* Robin.
"Constant neocortex override is the only thing that stops us all
from running out and eating all the cookies."  -- Eliezer Yudkowsky
http://www.lojban.org/             ***              .i cimo'o prali .ui