Date: Thu, 6 Aug 2009 20:08:55 +0000
From: Minimiscience
To: lojban-list@lojban.org
Subject: [lojban] Re: Parsing NIhO sections of text

de'i li 06 pi'e 08 pi'e 2009 la'o fy. sunrise2000@comcast.net .fy. cusku zoi skamyxatra.
> I'm trying to parse out sections of Lojban text delimited by sequences
> of NIhO cmavo into their respective paragraphs, sections, chapters,
> etc.
...
> Does anyone here know how I could use context-free grammar rules to
> parse the different sections separated by NIhO sequences?
>
> Any ideas (expressed in EBNF, Prolog, YACC, or whatever you speak)
> would be much appreciated!
.skamyxatra

If the length of a NIhO sequence exceeds the depth of the parse
tree (or list, or whatever structure you are building), can't you just
enclose the list in another list until the depths match? E.g., when you
have the list X=[[broda, broda], [broda]] and you encounter four NIhOs
in a row, let X=[[X]] (two levels of lists, because four minus the depth
of X is two), and then append to X whatever comes after that.

I don't think this can be handled by a CFG without encoding the lengths
of the NIhO strings into the productions, which would lead to an
infinitely large grammar. Note that the official Yacc and BNF grammars
treat a sequence of NIhOs as a single NIhO and leave the structuring of
the text to whatever semantic engine comes after.

mu'omi'e .kamymecraijun.

-- 
li'a .e'i ca vondei .i mi na'e pu'i kufra loi vondei
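
The list-wrapping idea in the reply can be sketched roughly as follows, in
Python. This is only one possible reading of it, not a reference
implementation. It assumes that the text has already been split into word
tokens, that the leaves of the structure are paragraphs (the words between two
NIhO runs, joined into one string), that a run of n NIhOs starts a new
element of the depth-n list, and that a NIhO run before the first paragraph
is ignored. The names NIHO, wrap and nest_by_niho are made up for this
example.

```python
from itertools import groupby

# Assumption: the members of selma'o NIhO, given as plain word tokens.
NIHO = {"ni'o", "no'i"}


def wrap(item, times):
    """Enclose item in `times` levels of single-element lists."""
    for _ in range(times):
        item = [item]
    return item


def nest_by_niho(tokens):
    """Group word tokens into nested lists according to NIhO run lengths."""
    tree, depth = [], 1   # outermost list and its nesting depth
    run = 1               # length of the NIhO run before the next paragraph
    for is_niho, group in groupby(tokens, key=lambda t: t in NIHO):
        group = list(group)
        if is_niho:
            run = len(group)
            continue
        leaf = " ".join(group)          # one paragraph of text
        if not tree:
            run = 1                     # nothing to separate yet: ignore a leading run
        while depth < run:              # the "let X = [[X]]" step
            tree, depth = [tree], depth + 1
        target = tree
        for _ in range(depth - run):    # walk down the most recent branch
            target = target[-1]
        target.append(wrap(leaf, run - 1))
    return tree


# The example from the message: X = [[broda, broda], [broda]], then four NIhOs.
text = "broda ni'o broda ni'o ni'o broda ni'o ni'o ni'o ni'o brode"
print(nest_by_niho(text.split()))
# [[[['broda', 'broda'], ['broda']]], [[['brode']]]]
```

The depth bookkeeping lives in ordinary variables rather than in grammar
productions, so the input could first be parsed with a grammar that treats
each NIhO run as one token and then passed through a function like this.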