Re: [lojban] NORATS, SPACE, and PUBLIC in PEG grammar
On Tue, Nov 23, 2010 at 10:46:02AM -0800, Robin Lee Powell wrote:
> On Tue, Nov 23, 2010 at 11:32:10AM -0700, .alyn.post. wrote:
> >
> > I had a brief conversation on the PEG parser mailing list about
> > associating code with rules in a PEG grammar. It seems that
> > embedding code inside '{}' brackets has become the standard way of
> > putting code inside a peg file, but there is no consensus on
> > whether that code should execute every time a production is parsed
> > (even after a backtrack), only the first time (and not when the
> > rule is rematched from the memo table), or only at the end of a
> > successful parse.
> >
> > Some parsers give you a flag or hook to say when code is executed.
>
> Not having 40 years of history *does* matter sometimes. :)
>
> > The most compelling case I found was where the 'code' inside '{}'
> > brackets was actually more like a tag, and the source code file
> > that handled the parse tree was stored separately from the
> > grammar. So tags inside '{}' were effectively function calls, but
> > could in theory be language independent.
>
> That would be a nice way to do it, yeah.
>
> > Do you know off-hand if the lojban grammar has something like this:
> >
> > expr <- mulexpr [+] mulexpr
> > mulexpr <- digits [*] digits
> > digits <- [0-9]+
> >
> > Where a particular rule (in this case expr and mulexpr) has the
> > same non-terminal more than once (mulexpr non-terminal for rule
> > expr and digits non-terminal for rule mulexpr)?
>
> I would be *shocked* if it didn't. Lojban is, as far as anyone
> knows, the largest and most complicated regular language grammar
> that exists, except possibly the artificially-regular products of
> natural language research.
>
> Ah, here's one:
>
> terms-1 <- terms-2 (pehe-sa* PEhE-clause free* joik-jek terms-2)*
>
Great, thank you very much. I've been really bothered about how to
associate expressions in a rule with '{}' code. In theory one
could just assign each expression a variable name based on its
non-terminal, but if a non-terminal appears twice (like terms-2
here) you need some way of renaming one of the occurrences.

Alternatively, you can extend the PEG grammar to let the user
specify the name of each expression or, as I said earlier, require
the user to use your API to walk the parse tree.

I settled on extending the grammar definition by tagging expressions,
which pollutes the grammar but makes the code using the grammar
easier to write. It doesn't accomplish your goal of not having a
bunch of non-grammar garbage in the peg file. :-(
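
To make the naming problem concrete, here is a minimal Python sketch of
both schemes: default names derived from the non-terminal, with a
numeric suffix when the same non-terminal occurs more than once, and
explicit tags that override the default. The "tag:nonterminal" notation
and the bind_names function are hypothetical, made up for illustration;
they are not the syntax of any particular PEG tool.

```python
from collections import Counter

def bind_names(rhs):
    """Assign a variable name to each non-terminal on a rule's right-hand side.

    Each item of `rhs` is either "nonterminal" or "tag:nonterminal".
    The "tag:" prefix is a hypothetical notation used only in this sketch.
    """
    parsed = []
    for item in rhs:
        tag, sep, name = item.partition(":")
        if not sep:
            # No explicit tag: the whole item is the non-terminal.
            tag, name = None, item
        # Hyphens are common in grammar names but not in identifiers.
        parsed.append((tag, name.replace("-", "_")))

    # Count how often each untagged non-terminal occurs in this rule.
    totals = Counter(name for tag, name in parsed if tag is None)
    seen = Counter()
    names = []
    for tag, name in parsed:
        if tag is not None:
            names.append(tag)                      # explicit tag wins
        elif totals[name] == 1:
            names.append(name)                     # unique: use as-is
        else:
            seen[name] += 1
            names.append(f"{name}_{seen[name]}")   # repeated: number them
    return names

# Untagged: the repeated terms-2 is disambiguated by position.
print(bind_names(["terms-2", "joik-jek", "terms-2"]))
# Tagged: the grammar author picks the names.
print(bind_names(["left:mulexpr", "right:mulexpr"]))
```

The sketch does not check for collisions between an explicit tag and an
automatically generated name; a real implementation would need to.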
> > Also, what does snarf_morph.sh, from the cook file, do? I would
> > assume it grabs xorxes' morphology file from lojban.org? I didn't
> > see snarf_morph.sh in the rats/ folder.
>
> It's one level up.
>
> Here: http://teddyb.org/~rlpowell/hobbies/lojban/grammar/grammar.tgz
>
> That should simplify things for you.
>
Immensely, yes. Thank you!
-Alan
--
.i ko djuno fi le do sevzi