Date: Fri, 21 Aug 1998 02:29:34 +0000
From: George Foot
Reply-To: george.foot@merton.oxford.ac.uk
Sender: Lojban list
To: Multiple recipients of list LOJBAN
X-To: C.D.Wright@SOLIPSYS.COMPULINK.CO.UK
X-cc: lojban@cuvmb.cc.columbia.edu
Subject: Re: lujvo ...
Message-ID: <903663136.205587.0@listserv.cuny.edu>

On 20 Aug 98 at 22:39, C.D. Wright wrote:

> Question ...
>
> I vaguely know how to construct lujvo, and I vaguely
> know about CCV, CVV, etc., rules. What I was wondering
> was - is there a simple algorithm that can break apart
> a lujvo into its constituent pieces?

Yes, it's not too difficult. I made a program to do this by recursively matching the first characters of right-substrings of the lujvo against rafsi masks, and performing a few checks on the results. It's not a perfect filter, i.e. it lets through some non-lujvo, but if you type a valid lujvo it will split it into its component rafsi. Examples:

>jvokatna bralo'i
bra-lo'i

>jvokatna soirsai
soi-r-sai

As you can see, it understands hyphens. But it would also have read "soisai" as "soi-sai", when in fact that cannot be a lujvo because it doesn't have a consonant cluster in the first five letters.
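
The mask-matching idea described above can be sketched roughly as follows. This is only an illustration and not the jvokatna.c source: the mask table, the hyphen handling and the names `jvo_split`, `split_rest` and `match_mask` are all assumptions made for the example. Like the program described, it is deliberately lenient and does not check that consonant or vowel clusters are valid.

```c
#include <stddef.h>
#include <string.h>

static int is_vowel(char c) { return c != '\0' && strchr("aeiou", c) != NULL; }
static int is_cons(char c)  { return c != '\0' && strchr("bcdfgjklmnprstvxz", c) != NULL; }

/* Match s against a mask of 'C', 'V' and literal characters.
   Returns the mask length on success, 0 on failure. */
static size_t match_mask(const char *s, const char *mask)
{
    size_t i;
    for (i = 0; mask[i] != '\0'; i++) {
        char c = s[i];  /* a NUL fails every test below, so we never read past it */
        int ok = mask[i] == 'C' ? is_cons(c)
               : mask[i] == 'V' ? is_vowel(c)
               : c == mask[i];
        if (!ok) return 0;
    }
    return i;
}

/* Hypothetical mask table (an assumption for this sketch). */
struct mask { const char *pat; int needs_y; int final_only; int cvv; };

static const struct mask MASKS[] = {
    { "CVCCV", 0, 1, 0 }, { "CCVCV", 0, 1, 0 },  /* full gismu, last piece only */
    { "CVCC",  1, 0, 0 }, { "CCVC",  1, 0, 0 },  /* 4-letter rafsi, must take y */
    { "CV'V",  0, 0, 1 }, { "CVV",   0, 0, 1 },
    { "CCV",   0, 0, 0 }, { "CVC",   0, 0, 0 },
};
#define NMASKS (sizeof MASKS / sizeof MASKS[0])

/* Recursively split the right-substring s, appending pieces to out.
   Returns 1 if the whole of s could be consumed, 0 otherwise. */
static int split_rest(const char *s, char *out, size_t cap)
{
    size_t base = strlen(out);
    size_t k;

    for (k = 0; k < NMASKS; k++) {
        const struct mask *m = &MASKS[k];
        size_t n = match_mask(s, m->pat);
        const char *rest = s + n;
        size_t mark;

        out[base] = '\0';                  /* undo any earlier attempt */
        if (n == 0) continue;
        if (base + n + 4 > cap) return 0;  /* room for "-", piece, "-y", NUL */

        if (base > 0) strcat(out, "-");
        strncat(out, s, n);
        mark = strlen(out);

        if (*rest == '\0') {
            /* Last piece: must end in a vowel and must not need a hyphen. */
            if (!m->needs_y && is_vowel(s[n - 1])) return 1;
            continue;
        }
        if (m->final_only) continue;

        /* Option 1: a hyphen letter follows (y anywhere; r or n only
           after a CVV-form rafsi at the start of the word). */
        if (*rest == 'y' ||
            (base == 0 && m->cvv && (*rest == 'r' || *rest == 'n'))) {
            out[mark] = '-';
            out[mark + 1] = *rest;
            out[mark + 2] = '\0';
            if (split_rest(rest + 1, out, cap)) return 1;
            out[mark] = '\0';
        }
        /* Option 2: the next rafsi follows directly. */
        if (!m->needs_y && split_rest(rest, out, cap)) return 1;
    }
    out[base] = '\0';
    return 0;
}

/* Split word into hyphen-separated pieces in out (capacity cap). */
int jvo_split(const char *word, char *out, size_t cap)
{
    if (cap == 0) return 0;
    out[0] = '\0';
    return split_rest(word, out, cap);
}
```

With a buffer such as `char buf[64];`, calling `jvo_split("soirsai", buf, sizeof buf)` leaves `soi-r-sai` in `buf`, and `bralo'i` comes out as `bra-lo'i`. The sketch also accepts "soisai" as `soi-sai`, which shows the same limitation: a splitter built this way is not a verifier.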

My program does a few checks, testing valid consonant clusters and vowel clusters for example, but it wasn't really designed to verify lujvo, only to help in machine-translating them. In fact I think it would be very awkward to write a lujvo verifier, since (if I understand correctly) the only valid lujvo are those produced by the algorithm in the reference grammar -- at least, in many places it is stated that you must do exactly what the algorithm tells you.

In case you're interested in the above program, I uploaded it. You can download the C source (written using djgpp, but should be fairly portable), which is 5k, or a DOS executable (the C source compiled with djgpp), which is 20k.

http://users.ox.ac.uk/~mert0407/jvokatna.c
http://users.ox.ac.uk/~mert0407/jvokatna.exe

-- 
george.foot@merton.oxford.ac.uk