From nobody@digitalkingdom.org Tue Oct 28 14:51:46 2008 Received: with ECARTIS (v1.0.0; list lojban-list); Tue, 28 Oct 2008 14:51:46 -0700 (PDT) Received: from nobody by chain.digitalkingdom.org with local (Exim 4.69) (envelope-from ) id 1KuwTZ-0002cJ-VH for lojban-list-real@lojban.org; Tue, 28 Oct 2008 14:51:45 -0700 Received: from rlpowell by chain.digitalkingdom.org with local (Exim 4.69) (envelope-from ) id 1KuwTO-0002bS-GY; Tue, 28 Oct 2008 14:51:34 -0700 Date: Tue, 28 Oct 2008 14:51:34 -0700 From: Robin Lee Powell To: lojban-list@lojban.org, lojban-beginners@lojban.org, llg-members@lojban.org Subject: [lojban] A challenge for computer science/programming geeks: The LLG wants to give you $500! Message-ID: <20081028215134.GK31434@digitalkingdom.org> Mail-Followup-To: lojban-list@lojban.org, lojban-beginners@lojban.org, llg-members@lojban.org MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.18 (2008-05-17) X-archive-position: 14865 X-ecartis-version: Ecartis v1.0.0 Sender: lojban-list-bounce@lojban.org Errors-to: lojban-list-bounce@lojban.org X-original-sender: rlpowell@digitalkingdom.org Precedence: bulk Reply-to: lojban-list@lojban.org X-list: lojban-list Your mission, should you choose to accept it, is to create a CFG for Lojban. There are three options, with different monetary values attached, in order of what we in the LLG board would prefer to get. For $500: Produce a working CFG for Lojban, in any format that some parser generator somewhere can accept, *or* in BNF or a well defined extension thereof. It must support every aspect of the language except for ZOI (which is fully context sensitive). Elidable terminators are the hard part. To stop brute-force solutions that would make the resulting grammar effectively useless, the total number of productions in the CFG language must be less than 2,000. Each alternation counts as a production for this purpose. For comparison, a quick check of the current BNF shows less than 200. *OR* For $300: Produce a CFG that correctly encodes 5 Lojban elidable terminators and all the structures needed to produce sentences using whatever features they terminate, with instructions for how to add more. Two of those must be VAU and KU, which means your sub-language must be able to produce real bridi. The instructions for how to add more must be complete; that is, I should be able to add an elidable terminator and its continget structurs inside of half an hour. You must also demonstrate that the number of productions does not grow much more than linearly; i.e. you must demonstrate that the total number of productions in the resulting CFG will be less than 2,000. Each alternation counts as a production for this purpose. For comparison, a quick check of the current BNF shows less than 200. *OR* For $100: Formally prove that encoding Lojban's elidable terminators is not possible in a CFG. Details: If you produce a grammar, I don't care what parser generator it needs, or even if such a parser generator exists. I care only that the language is actually a CFG, and that a parser generator could, in principle, be built for whatever you came up with. Having said that, I don't think it's formally possible to produce a CFG that (for example) Perl's Parse::RecDescent can't recognize, and the more obscure your CFG is the longer it'll take us to give you the money. No matter what you do, John Cowan and myself will be reviewing it. We are the final arbiters of whether you have met the community's needs, and our vote must be unanimous (this is not a problem: I don't think we've ever disagreed on things of this level of technical detail after we've had a chance to discuss them). You'll probably want to start with the "extended" BNF at http://www.lojban.org/tiki/tiki-index.php?page=Lojban+Formal+Grammars&bl=y It is a nonstandard BNF, in that it has markers for "magic elidable terminators go here". You may want to use http://www.digitalkingdom.org/~rlpowell/hobbies/lojban/grammar/lojban.abnf.txt instead. In case of confusion, the files above take precedence, and the CLL takes precedence over those, but if you have such confusion, you *MUST* explicitely mark it in your report on the work you've done. Your grammar is permitted to not implement the following things: ZOI (impossible in a CFG) SI/SA/SU BU ZEI The reason is that those are Hard; see http://www.lojban.org/tiki/tiki-index.php?page=Magic+Words&bl=y It must, however, be *possible* to implement all of them except ZOI. Example, simplified implementations of each is therefore encouraged. *********** * WARNING * *********** I and several other people have looked at this problem and strongly believe that it is impossible to encode Lojban elidable terminators in a CFG, at least without an exponential in the number of terminators number of productions. -Robin -- They say: "The first AIs will be built by the military as weapons." And I'm thinking: "Does it even occur to you to try for something other than the default outcome?" -- http://shorl.com/tydruhedufogre http://www.digitalkingdom.org/~rlpowell/ *** http://www.lojban.org/ To unsubscribe from this list, send mail to lojban-list-request@lojban.org with the subject unsubscribe, or go to http://www.lojban.org/lsg2/, or if you're really stuck, send mail to secretary@lojban.org for help.