[lojban] Re: y: what is it good for?



On Thu, May 13, 2004 at 10:45:31PM +0100, Zefram wrote:
> I spoke of "the erasure system" from the point of view of a speaker of
> the language, to whom they do appear unified.  Any way I'd write a
> Lojban parser would also share a lot of structure.

I assume s/would/they would/.

They can't.  Please trust me on this; I understand that it looks from a
distance like they can, but it's simply not the case.

> Not wishing to get your hopes up unreasonably, but I think I can
> improve on all your current parsers, in this and other areas.  

I don't think you have the background to understand how potentially
offensive to me that was, so I will not treat it as such.

The reason this discussion is occurring is that I am currently
devoting a huge amount of my free time to writing a parser that doesn't
suck, using a grammatical formalism that is actually capable of handling
Lojban.  See

http://www.digitalkingdom.org/~rlpowell/hobbies/lojban/grammar/

> I'm quite good with machine grammars and parsers.  I've witnessed the
> problems you've had from trying to bend yacc to the task, and I'm not
> surprised you've had difficulty, it's simply not up to the task.
> Attempts so far also don't seem to have made the preprocessing stages
> sufficiently distinct from the primary parsing.  It's difficult to get
> that sort of thing right in a complicated parser, especially the first
> time round.

I don't think you understand: Lojban is not context free, even with
pre-processing.  At least, everyone who has taken a serious look at it
believes that it's not; we have no proof though.

First of all, there's zoi, which is not context free in exactly the same
way that C variable declarations are not context free.  That's not a big
deal.
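
To illustrate the zoi point, here is a minimal sketch in Python.  It
assumes input already split into words on whitespace and ignores pauses
and real Lojban morphology; the function name read_zoi is made up for
this example.  The word right after zoi is picked by the speaker as the
delimiter, and the quote runs until that same word shows up again, so
the parser has to remember something it only learned from the input:

```python
def read_zoi(words, pos):
    """Consume `zoi <delim> ... <delim>` starting at words[pos].

    Returns (quoted_words, next_pos), or None if no zoi quote starts here
    or the closing delimiter never appears.
    """
    if pos + 1 >= len(words) or words[pos] != "zoi":
        return None
    # The delimiter is whatever word follows zoi; it is not fixed by the
    # grammar, which is what a plain context-free rule cannot name.
    delim = words[pos + 1]
    try:
        end = words.index(delim, pos + 2)
    except ValueError:
        return None  # quote was never closed
    return words[pos + 2:end], end + 1


words = "zoi gy hello world gy mi".split()
quoted, next_pos = read_zoi(words, 0)
```

A hand-written scanner handles this in a few lines, which is why it is
not a big deal in practice.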

Elidable terminators, however, are another matter entirely.  I've tried
to make the elidable terminators work in pure BNF, really I have.  It
*may* be possible, but I wouldn't bet on it, and if it *is* possible, it
will require increasing the size of the grammar by *at* *least* a factor
of 10.  I suspect it would be closer to a factor of 50.

That's not a joke or an exaggeration: to make elidable terminators work
in BNF you need a separate case for every possible interaction of every
elidable terminator, and every production they reference.  Even then,
I'm not sure it's possible at all.  I'd bet against it, even.

If you want to try it, grab a copy of one of the versions of the BNF on
the page I linked above and I'll send you a couple of the NUhI examples
I was working on.  Quite frankly, though, we could use your abilities
better elsewhere.

It turns out, however, that elidable terminators are trivial in a
grammatical formalism with left-to-right precedence and greedy matching
of repetition.  This is how I ended up using Parsing Expression Grammars
(http://www.pdos.lcs.mit.edu/~baford/packrat/) for my grammar re-write.
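
As a rough sketch of why this works, here is a toy PEG-style recognizer
in Python.  The grammar is invented for illustration and is not Lojban
and not taken from my grammar: B opens a group, E is its elidable
terminator, and x is a plain item.

    group <- 'B' item* 'E'?
    item  <- group / 'x'

Ordered choice tries group before 'x', and item* takes as many items as
it can, so an E always attaches to the innermost open group and a
missing E simply matches nothing:

```python
def parse_item(tokens, pos):
    """item <- group / 'x'  (ordered choice: group is tried first)."""
    result = parse_group(tokens, pos)
    if result is not None:
        return result
    if pos < len(tokens) and tokens[pos] == "x":
        return "x", pos + 1
    return None


def parse_group(tokens, pos):
    """group <- 'B' item* 'E'?  (greedy repetition, optional terminator)."""
    if pos >= len(tokens) or tokens[pos] != "B":
        return None
    pos += 1
    items = []
    while True:
        result = parse_item(tokens, pos)  # greedy: keep going while items match
        if result is None:
            break
        node, pos = result
        items.append(node)
    if pos < len(tokens) and tokens[pos] == "E":
        pos += 1  # terminator present; if absent, nothing is consumed
    return items, pos


def parse(text):
    """Parse a whitespace-separated token string into nested lists."""
    tokens = text.split()
    result = parse_group(tokens, 0)
    if result is None or result[1] != len(tokens):
        raise ValueError("no parse")
    return result[0]


with_terminator = parse("B x B x x E x")
both_elided = parse("B x B x x")
```

Note that the grammar needs no extra productions for the elided cases;
the one optional 'E'? covers all of them.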

Bear in mind that when the YACC grammar was written, I don't think ANSI
C had even been completed.  This was so long ago that the LLG got a free
version of a commercial YACC because they kept finding bugs in the YACC
implementation itself, because much of what they were doing had never
been done before.

As a final comment, please be aware that the official grammar
(grammar.300) is the official grammar and that everything in these
discussions of what my grammar should be doing is *completely* outside
Real Lojban (tm), at least for the time being.

-Robin

-- 
http://www.digitalkingdom.org/~rlpowell/  ***  I'm a *male* Robin.
"Many philosophical problems are caused by such things as the simple
inability to shut up." -- David Stove, liberally paraphrased.
http://www.lojban.org/  ***  loi pimlu na srana .i ti rokci morsi