From egroups@solipsys.co.uk Sat Mar 10 03:43:27 2001
Return-Path: <the_wrights@solipsys.co.uk>
X-Sender: the_wrights@solipsys.co.uk
X-Apparently-To: lojban@yahoogroups.com
Received: (EGP: mail-7_0_4); 10 Mar 2001 11:43:27 -0000
Received: (qmail 2707 invoked from network); 10 Mar 2001 11:43:26 -0000
Received: from unknown (10.1.10.27) by l9.egroups.com with QMQP; 10 Mar 2001 11:43:26 -0000
Received: from unknown (HELO sulphur.cix.co.uk) (212.35.225.149) by mta2 with SMTP; 10 Mar 2001 11:43:26 -0000
Received: from s30.pool.pm3-tele-6.cix.co.uk (s30.pool.pm3-tele-6.cix.co.uk [194.153.24.150]) by sulphur.cix.co.uk (8.11.3/CIX/8.11.3) with SMTP id f2ABhM828049 for <lojban@yahoogroups.com>; Sat, 10 Mar 2001 11:43:23 GMT
X-Envelope-From: the_wrights@solipsys.co.uk
Message-Id: <200103101143.f2ABhM828049@sulphur.cix.co.uk>
Comments: Authenticated sender is <solipsys@mail.compulink.co.uk>
To: lojban@yahoogroups.com
Date: Sat, 10 Mar 2001 11:47:22 +0000
MIME-Version: 1.0
Content-type: text/plain; charset=US-ASCII
Content-transfer-encoding: 7BIT
Subject: Re: [lojban] Parsing lujvo
Reply-to: c.d.wright@solipsys.co.uk
Priority: normal
X-mailer: Pegasus Mail for Windows (v2.31)
X-eGroups-From: the_wrights@solipsys.co.uk
From: egroups@solipsys.co.uk

> Date: Fri, 9 Mar 2001 21:12:33 -0800
> From: "seidensticker" <seidensticker@msn.com>
> Subject: How do you parse lujvo into the component rafsi?
> 
> I'm working on an algorithm for breaking a lujvo into its
> component parts. (My goal: given an unknown lujvo, break
> it up into parts and display the definitions of each of
> those parts.)

I have a set of programs which, given a (grammatically correct)
lojban utterance, generates a gloss of it. The gloss shows the
bracketing of the original, so you can see the grammatical
structure, and gives word-for-word "translations" of the words,
including breaking up lujvo and finding the definitions of each
component. It was heavily criticised by everyone who tried it,
apparently because it's text based, doesn't have pretty colours,
and runs under DOS. However, I now have versions that run under
Linux, NetBSD, RiscOS and Solaris, although it's still entirely
text based. I use it all the time. The main problem is that
it doesn't cope at all gracefully with grammatically incorrect
material, and much of what's written on this list is.

Anyway, the code is yours if you want it. It's mostly C, but
with some script or batch files to glue together the separate 
components.
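
To give a flavour of the lujvo-splitting step, here is a rough
sketch in C. It is an illustration only, not the code from the
programs mentioned above. It splits purely by letter shape
(CVC, CCV, CVV, CV'V, four-letter rafsi plus "y", and a full
gismu at the end), skips a "y" hyphen, and skips an "r" or "n"
hyphen after an initial CVV-type rafsi. It assumes lower-case
input, and it does not consult a rafsi list or check consonant
clusters, so it may well accept strings that are not real lujvo.
The names split_lujvo, SHAPES, MAX_PARTS and MAX_RAFSI are made
up for this example.

```c
#include <assert.h>
#include <string.h>

#define MAX_PARTS 16   /* arbitrary limit chosen for this sketch */
#define MAX_RAFSI 5    /* longest piece kept: a full gismu */

static int is_vowel(char c) { return c != '\0' && strchr("aeiou", c) != NULL; }
static int is_cons(char c)  { return c != '\0' && strchr("bcdfgjklmnprstvxz", c) != NULL; }

/* Does s begin with pat?  In pat, 'C' means any consonant, 'V' any
   vowel, and any other character must match literally. */
static int matches(const char *s, const char *pat)
{
    for (; *pat != '\0'; pat++, s++) {
        if (*pat == 'C') { if (!is_cons(*s)) return 0; }
        else if (*pat == 'V') { if (!is_vowel(*s)) return 0; }
        else if (*s != *pat) return 0;
    }
    return 1;
}

struct shape {
    const char *pat;  /* letters consumed, including any "y" hyphen */
    int keep;         /* how many of those letters form the rafsi */
    int final_ok;     /* may be the last piece of the word */
    int nonfinal_ok;  /* may be followed by further pieces */
    int vv;           /* CVV or CV'V: may take an r/n hyphen if first */
};

static const struct shape SHAPES[] = {
    { "CVCCV", 5, 1, 0, 0 },  /* full gismu, final position only */
    { "CCVCV", 5, 1, 0, 0 },
    { "CVCCy", 4, 0, 1, 0 },  /* four-letter rafsi plus y hyphen */
    { "CCVCy", 4, 0, 1, 0 },
    { "CVCy",  3, 0, 1, 0 },  /* CVC rafsi plus y hyphen */
    { "CVC",   3, 0, 1, 0 },  /* a lujvo cannot end in a consonant */
    { "CCV",   3, 1, 1, 0 },
    { "CV'V",  4, 1, 1, 1 },
    { "CVV",   3, 1, 1, 1 },
};

/* Backtracking search: returns the number of pieces, or 0 on failure. */
static int split_from(const char *s, char parts[][MAX_RAFSI + 1], int n)
{
    size_t i;
    if (*s == '\0') return n >= 2 ? n : 0;  /* a lujvo has 2+ pieces */
    if (n == MAX_PARTS) return 0;
    for (i = 0; i < sizeof SHAPES / sizeof SHAPES[0]; i++) {
        const struct shape *sh = &SHAPES[i];
        const char *rest;
        int found;
        if (!matches(s, sh->pat)) continue;
        rest = s + strlen(sh->pat);
        if (*rest == '\0' ? !sh->final_ok : !sh->nonfinal_ok) continue;
        memcpy(parts[n], s, (size_t)sh->keep);
        parts[n][sh->keep] = '\0';
        /* Try an r/n hyphen first after an initial CVV-type rafsi. */
        if (n == 0 && sh->vv && (*rest == 'r' || *rest == 'n')
                && is_cons(rest[1])) {
            found = split_from(rest + 1, parts, n + 1);
            if (found) return found;
        }
        found = split_from(rest, parts, n + 1);
        if (found) return found;
    }
    return 0;
}

/* Split word into pieces by shape; returns the piece count, or 0. */
int split_lujvo(const char *word, char parts[][MAX_RAFSI + 1])
{
    assert(word != NULL);
    return split_from(word, parts, 0);
}
```

With a caller-supplied array "char parts[MAX_PARTS][MAX_RAFSI + 1]",
split_lujvo("mamtypatfu", parts) returns 2 and leaves "mamt" and
"patfu" in parts. Each piece can then be looked up in whatever
rafsi or gismu table you have to hand.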


cdw
-- 
\\// ze'uku ko jmive gi'e snada

