From LOJBAN@CUVMB.BITNET Sat Mar 6 22:45:53 2010 Reply-To: "Carl D. Burke" Sender: Lojban list Date: Fri Dec 1 17:18:40 1995 From: "Carl D. Burke" Subject: Re: LR(k) Lojban Grammar X-To: LOJBAN%CUVMB.BITNET@cunyvm.cuny.edu To: John Cowan Status: OR X-From-Space-Date: Fri Dec 1 17:18:40 1995 X-From-Space-Address: LOJBAN%CUVMB.BITNET@UBVM.CC.BUFFALO.EDU Message-ID: >lojbab: >>But even LR(4) would not eliminate a couple of the lexer/preparser >>constructs, since some (like numbers) are not LR(k) for any k, though >>it would have allowed the grammar to be significantly simpler. > >Do you have a proof of this? In which way do numbers destroy the LR(k) >property? Hmmm... Unfortunately I don't remember an exact reference, but I do remember being taught in some parser-related course (compiler theory or formal languages) that a large number of computer languages are not LR (actually, not context-free) when you parse them from the character level. When you break processing into lexical analysis (usually a regular grammar) followed by a context-free grammar, it works fine. Numbers may or may not cause this problem for lojban. >lojbab: >>It is now probably too late because it would take too much work to >>verify that any given LR(non-1) grammar generated the same Lojban >>grammar, or even anything close. > >Chris: >>Probably true as a practical matter, but eventually knock wood there'll >>be academic interest in the language and we'll want to be able to >>define it in a more theoretically understandable way. So it ought to >>be at least a long-term goal. This assumes that lojban has an LR(k) grammar... I can't find the reference right now, but the last I remember was that the grammar was not actually completely context-free -- there were some shift-reduce conflicts that YACC resolved using precedence of rules/productions. If S-R or R-R conflicts exist, the language is not LR(1), regardless of whether or not YACC can parse it; syntactic ambiguity exists, it's just hidden by the mechanics of the parser. Maybe the grammar has been fixed so that it has no conflicts when YACC is at it's strictest, and we're OK. If the grammar has not been fixed, then a language policy decision has to be made: will lojban have a syntactically unambiguous grammar or not? -- Carl cburke@mitre.org