Subject: Lojban and LALR(1)ity
To: lojban@cuvmb.cc.columbia.edu
From: lojbab
Date: Fri, 11 Mar 1994 15:13:49 -0500 (EST)
Cc: lojbab (Logical Language Group)

Jorge and Matthew have both brought up the question "Why should Lojban
be limited to LALR(1) parsability?"

The answer is very simple and practical. We have a technique, namely the
use of Yacc, for testing if a language meets this specification. Yacc and
its relatives (Byacc and Bison) are available on just about every platform.
The same cannot be said of any more powerful parser-generator program.

History shows that Loglan did not, in fact, become unambiguously machine
parsable until the use of Yacc was introduced into the Project. Before
that, we had various grammars which were claimed to be unambiguous but in
fact failed the test.

We already allow small extensions to LALR(1) parsability, through the
preprocessor section of the grammar. In particular, the logical connectives
themselves (the jeks, joiks, geks, etc.) are not LALR(1). But all of these
are essentially bounded; any unboundedness is of a simple iterative, not
recursive, kind. (Example: the preparser allows unboundedly long numbers
to be processed, but numbers have no internal structure.)

Overusing preprocessing is dangerous: the second baseline grammar, in fact,
fails to parse "re pamai boi gerku" = "two (firstly) dogs" correctly. This
error was completely missed by Yacc, and was caught only by an actual
example. Change 6 grew out of this: the new grammar calls for
"reboi pamai gerku", which can be parsed correctly.

-- 
John Cowan
sharing account for now
e'osai ko sarji la lojban.
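
The "simple iterative, not recursive" unboundedness mentioned above for numbers can be illustrated with a minimal sketch. This is not the actual Lojban preparser: the function name `preparse`, the token labels `NUMBER` and `WORD`, and the restriction to the ten basic digit words are all assumptions made for illustration. The idea is only that a run of digit words of any length is collapsed by a plain loop into one flat token before the Yacc-style grammar ever sees it, so the grammar itself needs no lookahead over the run.

```python
# Hypothetical digit set: only the ten basic Lojban digit words.
# The real preparser handles more number words; this is an assumption.
DIGITS = {"no", "pa", "re", "ci", "vo", "mu", "xa", "ze", "bi", "so"}


def preparse(words):
    """Collapse each maximal run of digit words into one NUMBER token.

    The loop is purely iterative: a number of any length becomes a flat
    list of digits with no nested structure, so no recursion is needed.
    """
    tokens = []
    run = []  # digit words collected for the number currently being read
    for word in words:
        if word in DIGITS:
            run.append(word)
            continue
        if run:
            # A non-digit word ends the current number.
            tokens.append(("NUMBER", run))
            run = []
        tokens.append(("WORD", word))
    if run:
        # Flush a number that ends the input.
        tokens.append(("NUMBER", run))
    return tokens


if __name__ == "__main__":
    print(preparse("re pa no gerku".split()))
```

Under these assumptions the parser proper receives a single NUMBER token however many digits were spoken, which is the sense in which this kind of preprocessing stays bounded from the grammar's point of view.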