From: Robin Lee Powell
Date: Thu, 8 Apr 2004 17:28:58 -0700
Subject: [lojban] Beta Release of PEG-based Lojban parser.

My PEG-based parser now works on almost everything I've thrown at it.

Known limitations (from the web page):

- Does not handle zoi or la'o, and likely will not handle them in the near future.
- Currently its morphology knowledge is very poor. In particular, it does not accept fu'ivla starting with a vowel at this time, nor capital letters in brivla.

The parser, information on how it was made, the PEG it was built from, and many other things are at
http://www.digitalkingdom.org/~rlpowell/hobbies/lojban/grammar/index.html

The Future:

I am considering an extension to allow 'si' or 'sa' at the beginning of text (presumably to erase stuff from the preceding utterance).

The morphology needs massive amounts of work, and ideally I'd like to get Nora and Pierre's full algorithm encoded.

I may also hack an extremely minimal pre-processor to do zoi.

At some point the parser needs to be taught to output something more useful than just the text it succeeded at parsing, but I'm really hoping someone with actual Java experience will look at that.

-Robin

-- 
http://www.digitalkingdom.org/~rlpowell/ *** I'm a *male* Robin.
"Many philosophical problems are caused by such things as the simple inability to shut up." -- David Stove, liberally paraphrased.
http://www.lojban.org/ *** loi pimlu na srana .i ti rocki morsi
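
On the zoi pre-processor idea mentioned above, here is a rough sketch of what a minimal version might look like. It is not part of the parser and is written in Python purely for illustration. It assumes whitespace-separated input, treats the word right after zoi or la'o as the delimiter, and swaps the quoted text for a placeholder token. The names extract_zoi and ZOIQUOTE are hypothetical, and a real version would need a placeholder that the grammar actually accepts.

```python
QUOTE_WORDS = {"zoi", "la'o"}


def _norm(token):
    """Strip pause marks (periods) and case so '.gy.' matches 'gy.'."""
    return token.strip(".").lower()


def extract_zoi(text, placeholder="ZOIQUOTE"):
    """Replace each delimited quote with a numbered placeholder.

    Returns (rewritten_text, quotes). A quote with no closing
    delimiter is left untouched. The placeholder name is an
    assumption made for this sketch.
    """
    tokens = text.split()
    out, quotes = [], []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if _norm(tok) in QUOTE_WORDS and i + 1 < len(tokens):
            delim = _norm(tokens[i + 1])
            # Find the next token that repeats the delimiter word.
            end = next(
                (j for j in range(i + 2, len(tokens))
                 if _norm(tokens[j]) == delim),
                None,
            )
            if delim and end is not None:
                quotes.append(" ".join(tokens[i + 2:end]))
                out.extend([tok, tokens[i + 1],
                            f"{placeholder}{len(quotes) - 1}",
                            tokens[end]])
                i = end + 1
                continue
        out.append(tok)
        i += 1
    return " ".join(out), quotes
```

The quoted text is returned separately, so it could be put back after parsing. Splitting on whitespace collapses the original spacing inside the quote, which a real pre-processor would probably want to preserve.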