Organization: dis
To: lojban@yahoogroups.com
User-Agent: KMail/1.5
References: <5.2.0.9.0.20030124074752.0360aec0@pop.east.cox.net> <5.2.0.9.0.20030124202537.03d9ab60@pop.east.cox.net>
In-Reply-To: <5.2.0.9.0.20030124202537.03d9ab60@pop.east.cox.net>
Message-Id: <200301242121.36960.phma@webjockey.net>
From: Pierre Abbat <phma@webjockey.net>
MIME-Version: 1.0
Mailing-List: list lojban@yahoogroups.com; contact lojban-owner@yahoogroups.com
Delivered-To: mailing list lojban@yahoogroups.com
Precedence: bulk
Date: Fri, 24 Jan 2003 21:21:36 -0500
Subject: [lojban] Re: valfendi algorithm
Content-Type: text/plain; charset=US-ASCII
X-archive-position: 3903
X-ecartis-version: Ecartis v1.0.0
Sender: lojban-list-bounce@lojban.org
Errors-to: lojban-list-bounce@lojban.org
X-original-sender: phma@webjockey.net
Reply-to: lojban-list@lojban.org
X-list: lojban-list

On Friday 24 January 2003 20:31, Robert LeChevalier wrote:
> My understanding of:
> >A slinku'i, as far as word breaking is concerned, is anything that matches
> >the following regex:
> >^C[raf3]*([gim]?$|[raf4]?y)
> >where
> >C matches any consonant
> >[raf3] matches any 3-letter rafsi
> >[raf4] matches any 4-letter rafsi
> >[gim] matches any gismu.
>
> A correct algorithm would use the structures CVC/CVV/CCV for raf3,
> CVCC/CCVC for raf4 and CVCCV/CCVCV for gim.  It doesn't matter whether the
> values are in fact actually used.  Post-freeze it seems logical that it
> would and should be easier to add and subtract from the gismu/rafsi lists
> than to change the entire morphology, so the morphology is defined at a
> higher level than the specific list of words.

The program matches the structures, not a list of words, and I meant the 
algorithm to do so also. If the algorithm is unclear, check the program. If 
they disagree, tell me. I will use a list of words when I write the part that 
analyzes a lujvo into rafsi and looks them up; if a rafsi is not in the list 
it will say "?", e.g. {zbekyxoxmau} will be analyzed as {zbek? ? zmadu}.
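For anyone who wants to experiment with the structural version of the regex quoted above, here is a minimal Python sketch. It is not the program discussed in this thread. The names (`RAF3`, `RAF4`, `GIM`, `looks_like_slinkuhi`) are made up for illustration, the optional apostrophe inside the CVV shape is an assumption added so that forms like {ku'i} count as 3-letter rafsi, and no check is made that consonant pairs are permissible.

```python
import re

# Letter classes. "y" is deliberately left out of V because the regex
# treats it separately as the hyphen letter.
C = "[bcdfgjklmnprstvxz]"
V = "[aeiou]"

# Structural shapes only; no word list is consulted.
# Assumption: CVV may carry an apostrophe between its vowels (CV'V).
RAF3 = f"(?:{C}{V}{C}|{C}{V}'?{V}|{C}{C}{V})"
RAF4 = f"(?:{C}{V}{C}{C}|{C}{C}{V}{C})"
GIM = f"(?:{C}{V}{C}{C}{V}|{C}{C}{V}{C}{V})"

# ^C[raf3]*([gim]?$|[raf4]?y) with the bracketed names expanded.
SLINKUHI = re.compile(f"^{C}(?:{RAF3})*(?:(?:{GIM})?$|(?:{RAF4})?y)")


def looks_like_slinkuhi(text):
    """Return True if text matches the slinku'i shape from its first letter."""
    return SLINKUHI.match(text) is not None


if __name__ == "__main__":
    for candidate in ("slinku'i", "slinky", "sbroda", "lojban"):
        print(candidate, looks_like_slinkuhi(candidate))
```

Because the shapes are built from letter classes, adding or removing a gismu or rafsi from the official lists would not change what this pattern accepts.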

> (In addition "ala'um" is not an "option"; there should be no options in an
> official algorithm.  It is either valid or invalid according to the rules.)

The Book is gricingly unclear about this detail:

 Names are not permitted to have the sequences ``la'', ``lai'', or ``doi'' 
embedded in them, unless the sequence is immediately preceded by a consonant.

Since anything that contains the sequence "lai" contains the sequence "la", 
and following "la" or "lai" with a vowel makes it unbreakable just as 
preceding it with a consonant does, I griced it to mean "...preceded by a 
consonant or followed by a vowel". But if that were the case, why isn't 
"la'i" mentioned? A few lines later it says "No cmene may have the syllables 
``la'', ``lai'', or ``doi'' in them, unless immediately preceded by a 
consonant." In {laus}, "la" is a sequence, but not a syllable. In {la,us}, it 
is both a sequence and a syllable. But the presence or absence of commas in a 
word makes no difference to the identity or validity of the word. So is 
{laus} valid or not? {laus} cannot be broken into {la ,us}, nor {ala'um} into {a la 
'um}, because a word cannot begin with an apostrophe or with a pauseless 
vowel.
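
To make the two readings concrete, here is a rough Python sketch that flags "la", "lai" and "doi" inside a name under each of them. It is only an illustration, not an official rule and not the program discussed in this thread. The function name `offending_markers` and the `lenient` flag are invented for the example. The strict mode follows the Book's wording literally (only a preceding consonant shields the sequence). The lenient mode follows the "preceded by a consonant or followed by a vowel" reading, and additionally treats a following apostrophe as shielding, which is an assumption made so that {ala'um} comes out the way the last paragraph describes.

```python
CONSONANTS = set("bcdfgjklmnprstvxz")
VOWELS = set("aeiouy")
MARKERS = ("la", "lai", "doi")


def offending_markers(name, lenient=False):
    """List (marker, position) pairs that the chosen reading would reject.

    Positions refer to the name with commas removed, since commas do not
    change the identity of a word.
    """
    name = name.replace(",", "")
    hits = []
    for marker in MARKERS:
        start = name.find(marker)
        while start != -1:
            end = start + len(marker)
            before = name[start - 1] if start > 0 else ""
            after = name[end:end + 1]
            # Strict reading: only a preceding consonant shields the marker.
            shielded = before in CONSONANTS
            # Lenient reading (assumption): a following vowel or apostrophe
            # also shields it, because the remainder could not start a word.
            if lenient and (after in VOWELS or after == "'"):
                shielded = True
            if not shielded:
                hits.append((marker, start))
            start = name.find(marker, start + 1)
    return hits


if __name__ == "__main__":
    for name in ("laus", "la,us", "ala'um", "klark"):
        print(name, offending_markers(name), offending_markers(name, lenient=True))
```

Under the strict reading {laus} and {ala'um} are both flagged; under the lenient one neither is, which is exactly the disagreement the Book's wording leaves open.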

phma
