From: Luke Bergen
To: lojban@googlegroups.com
Date: Fri, 29 Oct 2010 12:08:09 -0400
Subject: [lojban] lujvo deconstruction

When I first started learning Lojban I wrote up a quick-and-dirty script to make looking up words faster and easier. gismu and cmavo were easy, but I could never figure out lujvo, so I'm taking another stab at it. I currently have something that works for the general cases of {bajdri}, {ba'udri}, and {bagypau}, but I'm not sure how to deal with 4-letter rafsi and non-"y" buffer letters.

To deal with the non-"y" buffer letters I thought I could just say:

strip all "y" from the word
get the first three non-"'" chars
if the first letter is "r", "l", "m", or "n" and the second letter is a consonant, then chop off the first letter and grab another letter from the right

(So if I were parsing "bacru zei bevri" = "ba'urbei", I would, after handling ba'u in the first iteration, end up with "rbe"; due to the step above, I'd strip off the "r" and grab the next letter, ending with "bei", which is the right result.)

But this produces strange results because there ARE cases where buffer letters are followed by consonants (morsi for instance).

Is there a way to unambiguously and algorithmically break a lujvo down into its component gismu?
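For concreteness, here is a rough Python sketch of the heuristic described in this message. It is only a guess at what such a script might look like, not the actual script: the name `naive_split` and the helper `is_consonant` are made up for illustration, and it deliberately ignores 4-letter rafsi, which the message says are still unhandled.

```python
VOWELS = set("aeiou")             # "y" is stripped before any checks
BUFFER_CANDIDATES = set("rlmn")   # letters the heuristic treats as possible buffers


def is_consonant(ch):
    """True for any letter that is not one of the five vowels."""
    return ch.isalpha() and ch not in VOWELS


def naive_split(lujvo):
    """Split a lujvo into 3-letter chunks using the heuristic from the message.

    Hypothetical helper: it assumes every rafsi has exactly three letters
    (apostrophes are not counted), so 4-letter rafsi are not supported.
    """
    word = lujvo.replace("y", "")  # step 1: strip all "y"
    pieces = []
    i = 0
    while i < len(word):
        # step 3: if the chunk would start with r/l/m/n followed by a
        # consonant, treat that first letter as a buffer and drop it
        if (word[i] in BUFFER_CANDIDATES
                and i + 1 < len(word)
                and is_consonant(word[i + 1])):
            i += 1
        # step 2: take characters until three non-apostrophe letters are read
        start = i
        letters = 0
        while i < len(word) and letters < 3:
            if word[i] != "'":
                letters += 1
            i += 1
        pieces.append(word[start:i])
    return pieces


print(naive_split("bajdri"))    # ['baj', 'dri']
print(naive_split("ba'urbei"))  # ["ba'u", 'bei']
print(naive_split("bagypau"))   # ['bag', 'pau']
print(naive_split("mrobi'o"))   # ['rob', "i'o"]  (wrong: "mro" is a rafsi of morsi)
```

The last call shows the problem: in a word like "mrobi'o" the leading "m" belongs to the rafsi "mro", but the check in step 3 cannot tell it apart from a buffer letter, so the chunks come out shifted by one.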
