Date: Tue, 23 Nov 2010 11:57:35 -0700
From: ".alyn.post."
To: lojban@googlegroups.com
Subject: Re: [lojban] NORATS, SPACE, and PUBLIC in PEG grammar

On Tue, Nov 23, 2010 at 10:46:02AM -0800, Robin Lee Powell wrote:
> On Tue, Nov 23, 2010 at 11:32:10AM -0700, .alyn.post. wrote:
> >
> > I had a brief conversation on the PEG parser mailing list about
> > associating code with rules in a PEG grammar.  It seems that
> > embedding code inside '{}' brackets has become the standard way of
> > putting code inside a peg file, but there is no consensus on
> > whether that code should execute every time a production is parsed
> > (even after a backtrack), only the first time but not if the rule
> > is rematched after memoization, or only at the end of a
> > successful parse.
> >
> > Some parsers give you a flag or hook to say when code is executed.
>
> Not having 40 years of history *does* matter sometimes.  :)
>
> > The most compelling case I found was where the 'code' inside '{}'
> > brackets was actually more like a tag, and the source code file
> > that handled the parse tree was stored separately from the
> > grammar.  So tags inside '{}' were effectively function calls, but
> > could in theory be language independent.
>
> That would be a nice way to do it, yeah.
>
> > Do you know off-hand if the lojban grammar has something like this:
> >
> >   expr    <- mulexpr [+] mulexpr
> >   mulexpr <- digits [*] digits
> >   digits  <- [0-9]+
> >
> > Where a particular rule (in this case expr and mulexpr) has the
> > same non-terminal more than once (mulexpr non-terminal for rule
> > expr and digits non-terminal for rule mulexpr)?
>
> I would be *shocked* if it didn't.  Lojban is, as far as anyone
> knows, the largest and most complicated regular language grammar
> that exists, except possibly the artificially-regular products of
> natural language research.
>
> Ah, here's one:
>
> terms-1 <- terms-2 (pehe-sa* PEhE-clause free* joik-jek terms-2)*
>

Great, thank you very much.
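To make the action-timing question from the quoted text concrete, here
is a minimal sketch in Python.  It is not taken from any real PEG tool;
the grammar (S <- A "x" / A "y", A <- "a" {saw_A}), the action name
saw_A, and both functions are made up for illustration.  It contrasts
running an action every time a rule matches with collecting actions and
keeping only those from the successful branch.

```python
# Hypothetical grammar:  S <- A "x" / A "y"      A <- "a" {saw_A}
eager_log = []  # records actions that run the moment a rule matches


def parse_A(text, pos, deferred):
    """A <- 'a' {saw_A}; return the new position, or None on failure."""
    if text.startswith("a", pos):
        eager_log.append("saw_A")  # eager policy: fires even if we backtrack later
        deferred.append("saw_A")   # deferred policy: only recorded for later
        return pos + 1
    return None


def parse_S(text):
    """S <- A 'x' / A 'y'; return the deferred actions of the winning branch."""
    for suffix in ("x", "y"):      # ordered choice: try alternatives in order
        deferred = []              # a fresh list per alternative drops actions
                                   # recorded by a branch that later fails
        pos = parse_A(text, 0, deferred)
        if pos is not None and text.startswith(suffix, pos) and pos + 1 == len(text):
            return deferred
    return None


deferred_log = parse_S("ay")
print(eager_log)     # ['saw_A', 'saw_A']  (once per attempt, including the failed one)
print(deferred_log)  # ['saw_A']           (only the successful parse)
```

A memoizing (packrat) parser would sit between the two: A at position 0
would be looked up rather than re-parsed on the second alternative, so
whether saw_A runs again depends on whether the tool replays actions on
a memo hit.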
I've been really bothered about how to associate expressions in a rule
with '{}' code.  In theory one could just assign each expression a
variable name based on its non-terminal, but if you encounter a
non-terminal twice (like terms-2 here) you need some way of renaming
one of them.

Other than that, you can extend the PEG grammar to allow the user to
specify the name of each expression or, as I said earlier, require the
user to use your API to play with the parse tree.

I settled on extending the grammar definition by tagging expressions,
which pollutes the grammar but makes the code using the grammar easier
to write.  It doesn't accomplish your goal of not having a bunch of
non-grammar garbage in the peg file.  :-(

> > Also, what does snarf_morph.sh, from the cook file, do?  I would
> > assume it grabs xorxes' morphology file from lojban.org?  I didn't
> > see snarf_morph.sh in the rats/ folder.
>
> It's one level up.
>
> Here: http://teddyb.org/~rlpowell/hobbies/lojban/grammar/grammar.tgz
>
> That should simplify things for you.
>

Immensely, yes.  Thank you!

-Alan
-- 
.i ko djuno fi le do sevzi
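The naming problem discussed above can be sketched in a few lines of
Python.  This is only an illustration under assumptions: the function
capture_names, the (nonterminal, tag) pair representation, and the
numeric-suffix scheme are all hypothetical, and the message does not
say how the tags are actually spelled in the grammar file.

```python
from collections import Counter


def capture_names(items):
    """Assign a variable name to each expression of one rule.

    items is a list of (nonterminal, tag_or_None) pairs, in rule order.
    An explicit tag wins.  An untagged expression is named after its
    non-terminal, with a numeric suffix only when that non-terminal
    appears more than once (untagged) in the rule.
    """
    totals = Counter(nt for nt, tag in items if tag is None)
    seen = Counter()
    names = []
    for nt, tag in items:
        if tag is not None:
            names.append(tag)                 # user-supplied name
        elif totals[nt] == 1:
            names.append(nt)                  # unambiguous: use the rule name
        else:
            seen[nt] += 1
            names.append(f"{nt}_{seen[nt]}")  # repeated: rename by position
    return names


# expr <- mulexpr [+] mulexpr, without tags and with hypothetical tags
print(capture_names([("mulexpr", None), ("mulexpr", None)]))
print(capture_names([("mulexpr", "lhs"), ("mulexpr", "rhs")]))
```

Automatic suffixes keep the grammar file clean but give the action code
positional names such as mulexpr_1; tags put extra text in the grammar
but let the action code say lhs and rhs.  The same collision arises for
terms-2 in terms-1, where the hyphen would also need replacing before
the name could be used as an identifier.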