From: Veijo Vilva
To: lojban@googlegroups.com
Date: Sat, 23 Jun 2012 17:29:05 +0300
Subject: Re: [lojban] Testing Lua/LPeg version of the Lojban PEG

On 21 June 2012 19:27, Robin Lee Powell <rlpowell@digitalkingdom.org> wrote:
> On Thu, Jun 21, 2012 at 11:08:21AM +0300, Veijo Vilva wrote:
> > I've also added rules to bracket sumti tcita and
> > zei lujvo. I had to add rules to the morphology PEG in order to
> > keep any quoted non-Lojban text intact - now the quoted text is
> > sent as a single non-L word to the grammar PEG.
>
> For all your changes that you believe do not change the language,
> can you comment on them at
> http://www.lojban.org/tiki/BPFK+Section%3A+Formal+Grammar ?

Mere bracketing shouldn't change the language, but I'll of course triple-check additions like this before submitting them.

> > My present, very simple pretty printer is quite flexible. It can
> > produce either the full parse tree, which is probably required
> > only for checking the parser, or omit the numbered sub-rules
> > (sumti-1,...) or omit any user-defined set of intermediate levels
> > from the tree. It would be trivial to add glosses for cmavo and
> > gismu to the output. I've also given some thought to passing the
> > lujvo split from the morphology PEG.
>
> FWIW, the trick that I've used for programmatic tree pruning, that
> works very well, is to prune anything that has only one child.
> That, IIRC, is the entire difference between camxes and camxes -v.

That trick definitely simplifies the exclusion rules but doesn't necessarily make them completely superfluous, and sometimes an additional inclusion list may also help.

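For illustration only, here is a minimal sketch of how single-child pruning could combine with an exclusion list and an inclusion list. It is written in Python rather than Lua and is not the pretty printer discussed in this thread. The tree shape (a leaf is a string, an inner node is a `(rule_name, children)` tuple), the function names, and the sample rule names are all assumptions made for the example.

```python
def _prune(node, exclude, include):
    """Return the list of nodes that replaces `node` in its parent."""
    if isinstance(node, str):          # leaf word: always kept
        return [node]
    name, children = node
    kept = []
    for child in children:
        kept.extend(_prune(child, exclude, include))
    if name in include:                # inclusion list wins: keep this level
        return [(name, kept)]
    if name in exclude or len(kept) == 1:
        return kept                    # drop this level, splice children upward
    return [(name, kept)]


def prune(tree, exclude=frozenset(), include=frozenset()):
    """Prune a parse tree; the root node itself is always kept."""
    name, children = tree
    kept = []
    for child in children:
        kept.extend(_prune(child, exclude, include))
    return (name, kept)


# Hypothetical miniature parse tree, not real parser output.
tree = ("text", [("sentence", [
    ("sumti", [("sumti-1", [("sumti-2", ["mi"])])]),
    ("selbri", [("selbri-1", ["klama"])]),
])])

print(prune(tree))
print(prune(tree, include={"sumti", "selbri"}))
print(prune(tree, exclude={"sentence"}))
```

With no lists, the single-child chains under `sumti` and `selbri` collapse to bare words. The inclusion list keeps `sumti` and `selbri` visible even though each has a single child, and the exclusion list removes the `sentence` level even though it has two.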
> > I'll have to do some more testing before releasing the program for
> > general consumption.
>
> You might as well throw it up on github for people to play with,
> no?

I'll check the PEG -> LPeg translation first, as it is all too easy to make a few mistakes when mechanically replacing hundreds of concatenations and slashes with the corresponding LPeg operators. The rules which I've rewritten for speed must be triple-checked for logic errors, or commented out and replaced with the original versions at this stage.
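
To make the kind of replacement concrete: in LPeg a sequence is written `p1 * p2` and an ordered choice `p1 + p2`, whereas PEG notation uses juxtaposition and `/`. The sketch below is a hypothetical Python helper, not the translation actually used here. It handles only rule names, quoted literals, `/` and parentheses, and it assumes the generated Lua has `local P, V = lpeg.P, lpeg.V` in scope.

```python
import re

# One token: a quoted literal, a rule name, or one of "/", "(", ")".
TOKEN = re.compile(r"""
    \s*(?:
        (?P<lit>"[^"]*"|'[^']*')            # quoted literal, copied verbatim
      | (?P<name>[A-Za-z_][A-Za-z0-9_-]*)   # rule reference such as sumti-1
      | (?P<op>[/()])                       # ordered choice or grouping
    )""", re.VERBOSE)


def peg_to_lpeg(expr):
    """Translate PEG sequence and choice into LPeg's `*` and `+`."""
    out = []
    prev_ends_operand = False   # True after a literal, a name, or ")"
    expr = expr.rstrip()
    pos = 0
    while pos < len(expr):
        m = TOKEN.match(expr, pos)
        if m is None:
            raise ValueError("unsupported PEG syntax: %r" % expr[pos:])
        pos = m.end()
        if m.group("lit"):
            tok, starts, ends = "P" + m.group("lit"), True, True
        elif m.group("name"):
            tok, starts, ends = 'V"%s"' % m.group("name"), True, True
        elif m.group("op") == "(":
            tok, starts, ends = "(", True, False
        elif m.group("op") == ")":
            tok, starts, ends = ")", False, True
        else:                   # "/" becomes LPeg's ordered choice
            tok, starts, ends = "+", False, False
        if starts and prev_ends_operand:
            out.append("*")     # juxtaposition becomes explicit sequence
        out.append(tok)
        prev_ends_operand = ends
    return " ".join(out)


print(peg_to_lpeg('sumti-1 / "mi" selbri'))
print(peg_to_lpeg('(a / b) c'))
```

The grouping carries over unchanged because Lua's `*` binds tighter than `+`, just as sequence binds tighter than choice in PEG. The other PEG operators (repetition, optional, predicates) are deliberately left out of this sketch and raise an error.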

    Veijo

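--
web site: http://galactinus.net/vilva/
on Google+: https://plus.google.com/106533767817816079660/posts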