From: seidensticker@msn.com
Date: Sat, 10 Mar 2001 06:19:54 -0000
To: lojban@yahoogroups.com
Subject: Re: How do you parse lujvo into the component rafsi?

Assuming the grammar below is correct, I've worked out the following algorithm, which I believe is equivalent to it.

  if the remainder of the string begins CVVr or CVVn or CVV or CVCy then
    chop off that token and recurse
  else if the remainder begins CCV then
    if what follows the CCV begins Cy then
      chop off the CCVCy and recurse
    else if what follows the CCV begins CV then
      chop off the terminal CCVCV and end
    else
      chop off the CCV and recurse
  else if the remainder begins CVC then
    if what follows the CVC begins Cy then
      chop off the CVCCy and recurse
    else if what follows the CVC begins CV then
      chop off the terminal CVCCV and end
    else
      chop off the CVC and recurse

Does this sound like the correct way to parse a lujvo into rafsi tokens?

--- In lojban@y..., "seidensticker" wrote:
> I'm working on an algorithm for breaking a lujvo into its component parts. (My goal: given an unknown lujvo, break it up into parts and display the definitions of each of those parts.)
> Chapter 4, section 11 of the grammar book ("The lujvo-making algorithm") talks about creating lujvo, but my question is about the reverse. Is there a place where this is simply described?
>
> If there's not, let me try this: I've tried to compose a grammar that defines a lujvo. Could someone critique it?
>
> lujvo = InitialRafsi TerminalRafsi
> InitialRafsi = Rafsi InitialRafsi | (empty)
> Rafsi = 4Rafsi | 3Rafsi
>
> TerminalRafsi = CCV | CVV | CVCCV | CCVCV
> 4Rafsi = CVCCy | CCVCy
> 3Rafsi = CVV | CCV | CVVr | CVVn | CVC | CVCy
>
> Must the parsing of the unknown lujvo begin from the right, or from the left? Or is it unambiguous regardless? Given a 4Rafsi of the form CVCCy or CCVCy, I'm assuming that there's only one gismu with those first 4 letters -- right?
>
> Any other suggestions for how to do the parsing?
>
> Thanks.
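
For anyone who wants to try the left-to-right algorithm from the top of this message, here is a rough Python sketch of it. It is only an illustration under a few assumptions that the post does not state: the names `split_lujvo` and `PATTERNS` are made up for this example; the 5-letter forms (CCVCV, CVCCV) are accepted only when they use up the rest of the word, which is how "terminal" is read here; an r or n after a CVV is treated as a hyphen only when a consonant follows it; and apostrophes (CV'V rafsi) are not handled at all. It is greedy and never backtracks, so it may well reject or mis-split words that a fuller treatment would handle.

```python
import re

C = "[bcdfgjklmnprstvxz]"  # consonants (y is handled separately as the hyphen)
V = "[aeiou]"              # vowels

# Tried in the same order as the branches of the posted algorithm.
PATTERNS = [re.compile(p) for p in (
    f"{C}{V}{V}[rn](?={C})",  # CVVr / CVVn (assumption: hyphen needs a consonant after it)
    f"{C}{V}{V}",             # CVV
    f"{C}{V}{C}y",            # CVCy
    f"{C}{C}{V}{C}y",         # CCVCy
    f"{C}{C}{V}{C}{V}$",      # terminal CCVCV (assumption: must end the word)
    f"{C}{C}{V}",             # CCV
    f"{C}{V}{C}{C}y",         # CVCCy
    f"{C}{V}{C}{C}{V}$",      # terminal CVCCV (assumption: must end the word)
    f"{C}{V}{C}",             # CVC
)]


def split_lujvo(word):
    """Greedily chop rafsi tokens off the front of word, left to right."""
    tokens = []
    pos = 0
    while pos < len(word):
        for pattern in PATTERNS:
            m = pattern.match(word, pos)  # match anchored at the current position
            if m:
                tokens.append(m.group())
                pos = m.end()
                break
        else:
            # No branch of the algorithm applies to what is left.
            raise ValueError(f"cannot parse {word[pos:]!r} in {word!r}")
    return tokens


print(split_lujvo("bavlamdei"))  # ['bav', 'lam', 'dei']
print(split_lujvo("gerzdani"))   # ['ger', 'zdani']
```

Hyphens stay attached to the token they follow (for example `gerky`), as in the posted algorithm, so a lookup of definitions would need to strip a trailing y, r, or n first.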