Re: [lojban] Testing Lua/LPeg version of the Lojban PEG

From: Veijo Vilva <veijo.vilva@gmail.com>

Date: Sat, 23 Jun 2012 17:29:05 +0300

On 21 June 2012 19:27, Robin Lee Powell <rlpowell@digitalkingdom.org> wrote:

On Thu, Jun 21, 2012 at 11:08:21AM +0300, Veijo Vilva wrote:
> I've also added rules to bracket sumti tcita and
> zei lujvo. I had to add rules to the morphology PEG in order to
> keep any quoted non-Lojban text intact - now the quoted text is
> sent as a single non-L word to the grammar PEG.
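
As an illustration of the idea of handing quoted non-Lojban text to the grammar as one unit, here is a minimal Python sketch. It is not the morphology PEG described above: it works on whitespace-separated words only, and the names `QUOTE_WORDS`, `strip_pauses`, `group_quotes` and the `"non-lojban"` tag are hypothetical. It assumes a quote is opened by `zoi` or `la'o`, followed by a delimiter word, and closed by the same delimiter word.

```python
# Assumption: these two cmavo open a delimited non-Lojban quote.
QUOTE_WORDS = {"zoi", "la'o"}


def strip_pauses(word):
    """Remove the pause dots that may surround a word."""
    return word.strip(".")


def group_quotes(text):
    """Split text on whitespace, merging each quoted body into one token.

    Ordinary words stay as strings; a quoted body becomes a
    ("non-lojban", text) tuple so later stages never look inside it.
    """
    words = text.split()
    out = []
    i = 0
    while i < len(words):
        word = words[i]
        out.append(word)
        i += 1
        if strip_pauses(word) in QUOTE_WORDS and i < len(words):
            delimiter = strip_pauses(words[i])
            out.append(words[i])            # opening delimiter
            i += 1
            body = []
            while i < len(words) and strip_pauses(words[i]) != delimiter:
                body.append(words[i])
                i += 1
            out.append(("non-lojban", " ".join(body)))
            if i < len(words):
                out.append(words[i])        # closing delimiter
                i += 1
    return out
```

For example, `group_quotes("zoi gy. hello big world .gy.")` keeps `hello big world` together as a single tagged token instead of three separate words.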

For all your changes that you believe do not change the language,
can you comment on them at
http://www.lojban.org/tiki/BPFK+Section%3A+Formal+Grammar ?

Mere bracketing shouldn't change the language, but I will of course triple-check additions like this before submitting them.

> My present, very simple pretty printer is quite flexible. It can
> produce either the full parse tree, which is probably required
> only for checking the parser, or omit the numbered sub-rules
> (sumti-1,...) or omit any user-defined set of intermediate levels
> from the tree. It would be trivial to add glosses for cmavo and
> gismu to the output. I've also given some thought to passing the
> lujvo split from the morphology PEG.

FWIW, the trick that I've used for programmatic tree pruning, that
works very well, is to prune anything that has only one child.
That, IIRC, is the entire difference between camxes and camxes -v.

That trick definitely simplifies the exclusion rules but doesn't necessarily make them completely superfluous, and sometimes an additional inclusion list may also help.
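
To make the combination concrete, here is a minimal Python sketch of single-child pruning with an optional exclusion set and inclusion set. It is neither the camxes code nor the Lua pretty printer discussed here. The tree shape (a `(label, children)` tuple, with plain strings as leaves), the function names and the rule names in the example are assumptions made for illustration.

```python
def _prune_node(tree, exclude, keep):
    """Return the list of trees that replaces `tree` after pruning."""
    if isinstance(tree, str):               # a leaf word is never pruned
        return [tree]
    label, children = tree
    kids = []
    for child in children:
        kids.extend(_prune_node(child, exclude, keep))
    if label in keep:                       # inclusion list wins
        return [(label, kids)]
    if label in exclude or len(kids) == 1:  # splice the children into the parent
        return kids
    return [(label, kids)]


def prune(tree, exclude=frozenset(), keep=frozenset()):
    """Prune single-child nodes and excluded labels, except labels in `keep`."""
    result = _prune_node(tree, exclude, keep)
    if len(result) == 1:
        return result[0]
    # An excluded root with several children is kept as a container.
    return (tree[0], result)


# Illustrative tree; the rule names are placeholders, not parser output.
example = ("sentence", [
    ("sumti", [("sumti-1", [("KOhA", ["mi"])])]),
    ("bridi-tail", [("bridi-tail-1", [("selbri", [("BRIVLA", ["klama"])])])]),
])
```

With the defaults, `prune(example)` collapses every single-child chain down to its word. Passing `keep={"sumti", "selbri"}` retains those two labels even though each has only one child, which is the kind of case where an inclusion list helps.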

> I'll have to do some more testing before releasing the program for
> general consumption.

You might as well throw it up on github for people to play with,
no?

I'll check the PEG -> LPeg translation first, as it is all too easy to make a few mistakes when mechanically replacing hundreds of concatenations and slashes with the corresponding LPeg operators. The rules which I've rewritten for speed reasons must be triple-checked for logic errors - or commented out and replaced with the original versions at this stage.
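
For readers unfamiliar with the operator mapping, here is a minimal Python sketch of such a mechanical translation for a small PEG subset. It is not the procedure actually used for the Lojban grammar, and the name `peg_to_lpeg` is hypothetical. It relies on LPeg's documented operators: `*` for sequence, `+` for ordered choice, `^0`, `^1` and `^-1` for the suffixes `*`, `+` and `?`, `#` for the and-predicate and unary `-` for the not-predicate. Character classes and escapes inside literals are deliberately not handled.

```python
import re

TOKEN = re.compile(
    r"\s*(?:"
    r"(?P<name>[A-Za-z_][A-Za-z0-9_-]*)"   # rule name, hyphens allowed
    r"|(?P<literal>'[^']*')"               # single-quoted literal, no escapes
    r"|(?P<op>[/()&!*+?])"                 # operator or parenthesis
    r")"
)
SUFFIX = {"*": "^0", "+": "^1", "?": "^-1"}
PREFIX = {"&": "#", "!": "-"}


def peg_to_lpeg(expr):
    """Translate the right-hand side of one PEG rule into an LPeg expression."""
    expr = expr.rstrip()
    out = []
    prev_ends_operand = False
    pos = 0
    while pos < len(expr):
        m = TOKEN.match(expr, pos)
        if m is None:
            raise ValueError(f"unsupported PEG syntax at offset {pos}: {expr[pos:]!r}")
        pos = m.end()
        name, literal, op = m.group("name", "literal", "op")
        starts_operand = bool(name or literal) or op in PREFIX or op == "("
        if prev_ends_operand and starts_operand:
            out.append(" * ")               # PEG juxtaposition is LPeg '*'
        if name:
            out.append(f'V"{name}"')        # reference to another rule
        elif literal:
            out.append("P" + literal)       # literal string pattern
        elif op == "/":
            out.append(" + ")               # ordered choice
        elif op in SUFFIX:
            out.append(SUFFIX[op])
        elif op in PREFIX:
            out.append(PREFIX[op])
        else:
            out.append(op)                  # parentheses pass through
        prev_ends_operand = bool(name or literal) or op == ")" or op in SUFFIX
    return "".join(out)
```

For instance, `peg_to_lpeg("a b / c")` yields `V"a" * V"b" + V"c"`. The relative precedence of suffix, prefix, sequence and choice in PEG lines up with Lua's `^`, unary operators, `*` and `+`, so no extra parentheses are needed for this subset; rules rewritten by hand still need a separate check.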

Veijo

web site: http://galactinus.net/vilva/

on Google+: https://plus.google.com/106533767817816079660/posts