Date: Wed, 24 Nov 2010 00:17:33 -0800
From: Robin Lee Powell
To: lojban@googlegroups.com
Subject: Re: [lojban] NORATS, SPACE, and PUBLIC in PEG grammar

On Tue, Nov 23, 2010 at 12:25:23PM -0700, .alyn.post.
wrote:
> The bootstrap compiler is compiling the morphology and morphology
> header file, but I'm still working on the peg grammar itself.

Damn.  That's a lot of work; good luck!

> Given that Lojban is used as an example of a complex PEG grammar:
>
> http://en.wikipedia.org/wiki/Parsing_expression_grammar#External_links

Lojban is almost certainly the most complex fully regular (except
ZOI) grammar in actual use in the world.  The only time you might
get something worse is regularized versions of natlang grammars.
Lojban's grammar is something like 10x the size of most programming
languages.

> I'm not sure it's a bad idea to have a peg parser generator
> written specifically to parse Lojban.

It's certainly a great test-to-destruction choice.  :)  Throw the
entirety of {la .alis.} at it in one pass, for example.  :)

> I do wish there had been something available already, but I'm not
> aware of Scheme code that parses PEG files--they all seem to want
> to write the grammar definition in Scheme itself.

Well, you could always write a pre-processor to output Scheme from a
common PEG format.

Honestly, whatever we end up with in terms of the PEG grammar we
declare as the formalized This ... Is ... Lojban!!! (assuming we do
so), it's going to be "wrong" in the sense that you'll have to
process it to get a working input file for whatever parser generator
you're *actually* using.  I don't really see any way to avoid that,
although the NORATS and so on were intended to encode some
meta-parser sorts of information about certain productions.

-Robin

-- 
http://singinst.org/ :  Our last, best hope for a fantastic future.
Lojban (http://www.lojban.org/): The language in which "this parrot
is dead" is "ti poi spitaki cu morsi", but "this sentence is false"
is "na nei".   My personal page: http://www.digitalkingdom.org/rlp/