From: Pierre Abbat <phma@webjockey.net>
To: lojban@yahoogroups.com
Date: Fri, 24 Jan 2003 21:21:36 -0500
Subject: [lojban] Re: valfendi algorithm
Message-Id: <200301242121.36960.phma@webjockey.net>
In-Reply-To: <5.2.0.9.0.20030124202537.03d9ab60@pop.east.cox.net>

On Friday 24 January 2003 20:31,
Robert LeChevalier wrote:
> My understanding of:
> >A slinku'i, as far as word breaking is concerned, is anything that matches
> >the following regex:
> >^C[raf3]*([gim]?$|[raf4]?y)
> >where
> >C matches any consonant
> >[raf3] matches any 3-letter rafsi
> >[raf4] matches any 4-letter rafsi
> >[gim] matches any gismu.
>
> A correct algorithm would use the structures CVC/CVV/CCV for raf3,
> CVCC/CCVC for raf4 and CVCCV/CCVCV for gim. It doesn't matter whether the
> values are in fact actually used. Post-freeze it seems logical that it
> would and should be easier to add and subtract from the gismu/rafsi lists
> than to change the entire morphology, so the morphology is defined at a
> higher level than the specific list of words.

The program matches the structures, not a list of words, and I meant the
algorithm to do so also. If the algorithm is unclear, check the program. If
they disagree, tell me. I will use a list of words when I write the part that
analyzes a lujvo into rafsi and looks them up; if a rafsi is not in the list
it will say "?", e.g. {zbekyxoxmau} will be analyzed as {zbek? ? zmadu}.

> (In addition "ala'um" is not an "option"; there should be no options in an
> official algorithm. It is either valid or invalid according to the rules.)

The Book is gricingly unclear about this detail:

  Names are not permitted to have the sequences ``la'', ``lai'', or ``doi''
  embedded in them, unless the sequence is immediately preceded by a
  consonant.

Since anything that contains the sequence "lai" contains the sequence "la",
and following "la" or "lai" with a vowel makes it unbreakable just as
preceding it with a consonant does, I griced it to mean "...preceded by a
consonant or followed by a vowel". But if that were the case, why isn't "la'i"
mentioned?

A few lines later it says "No cmene may have the syllables ``la'', ``lai'', or
``doi'' in them, unless immediately preceded by a consonant."
In {laus}, "la" is a sequence, but not a syllable.
In {la,us}, it is both a sequence and a syllable. But the presence or absence
of commas in a word makes no difference to the identity or validity of the
word. So is that valid or not?

{laus} cannot be broken into {la ,us}, nor {ala'um} into {a la 'um}, because a
word cannot begin with an apostrophe or with a pauseless vowel.

phma
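
For anyone who wants to experiment with the slinku'i regex quoted above, here
is a minimal Python sketch of its structural form, with raf3 as CVC/CVV/CCV,
raf4 as CVCC/CCVC and gim as CVCCV/CCVCV. It is not the valfendi program. The
names (looks_like_slinkuhi and the constants) are made up for illustration,
and two points are assumptions of this sketch: the CVV shape is taken to
include CV'V, and no check is made that consonant clusters are permissible.

```python
import re

# Letter classes. "y" and the apostrophe are deliberately in neither class.
C = "[bcdfgjklmnprstvxz]"
V = "[aeiou]"

# Structural shapes only; no word list is consulted.
# Assumption: CVV also covers CV'V, hence the optional apostrophe.
RAF3 = f"(?:{C}{V}{C}|{C}{V}'?{V}|{C}{C}{V})"
RAF4 = f"(?:{C}{V}{C}{C}|{C}{C}{V}{C})"
GIM = f"(?:{C}{V}{C}{C}{V}|{C}{C}{V}{C}{V})"

# ^C[raf3]*([gim]?$|[raf4]?y) with each bracketed name expanded to its shapes.
SLINKUHI = re.compile(f"^{C}{RAF3}*(?:{GIM}?$|{RAF4}?y)")


def looks_like_slinkuhi(word: str) -> bool:
    """Return True if the word matches the structural slinku'i pattern."""
    return SLINKUHI.match(word) is not None


if __name__ == "__main__":
    for word in ["slinku'i", "klama"]:
        print(word, looks_like_slinkuhi(word))
```

Under these assumptions {slinku'i} itself matches, as s + lin + ku'i followed
by the end of the word, while {klama} does not. Swapping the shape patterns
for lists of actual rafsi and gismu would only change the three constants.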