Re: [lojban] morphology paper announced
- To: lojban-list@lojban.org
- Subject: Re: [lojban] morphology paper announced
- From: "Jorge Llambías" <jjllambias@gmail.com>
- Date: Tue, 20 Feb 2007 18:53:28 -0300
- In-reply-to: <3ccac5f10702200925r5672d9f7j23346557ff50888d@mail.gmail.com>
- References: <3ccac5f10702200925r5672d9f7j23346557ff50888d@mail.gmail.com>
On 2/20/07, Cyril Slobin <slobin@ice.ru> wrote:
http://www.lojban.org/tiki/tiki-index.php?page=Morphology+analysis+programs+comparasion&bl
Comments are solicited.
coi kir
Have you looked at
<http://www.lojban.org/tiki/tiki-index.php?page=BPFK%20Section%3A%20PEG%20Morphology%20Algorithm>
?
<<
2.2.2 Leading cmavo
There is no common agreement about breaking a potential brivla into
a leading cmavo and the rest. The published word-breaking algorithm gives
a set of patterns for breaking words, but it is unclear whether the
rest of the word after cutting off a leading cmavo must be a valid
word by itself.
>>
Yes, the rest must itself be one or more valid words; otherwise you
cannot split off a cmavo.
<<
The brkwords program treats the validity of the remaining part as
irrelevant: for example, the word "iglu" breaks into the cmavo "i" plus
the remainder "glu" and therefore is not a valid fu'ivla (the fact that
"glu" is not a valid word by itself is irrelevant). On the other hand,
vlatai insists that "iglu" is a valid word, *because* "glu" is not a
valid word and therefore "iglu" is not breakable. The Vim syntax plugin
follows the first approach (brkwords compatible) by default, but can be
coerced into vlatai-compatible mode by setting a flag variable.
>>
{.iglu} is no different from {ciblu}. If you break it into {.i} + {glu}
then you would also break {ciblu} into {ci} + {blu}.
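
A minimal Python sketch of that rule, not the actual PEG: a leading
cmavo is split off only when the remainder is itself one or more valid
words. The names split_leading_cmavo and is_word_sequence are
hypothetical, the cmavo shape is simplified to an optional consonant
plus one vowel, and the validity check is supplied by the caller.

```python
import re

# ASSUMPTION: a leading cmavo candidate is simplified to an optional
# consonant followed by a single vowel (no diphthongs, no apostrophes).
LEADING_CMAVO = re.compile(r"^([bcdfgjklmnprstvxz]?[aeiou])(.+)$")


def split_leading_cmavo(text, is_word_sequence):
    """Split off a leading cmavo only if the rest is one or more valid words.

    is_word_sequence is a hypothetical caller-supplied predicate standing
    in for a real morphology check of the remainder.
    """
    match = LEADING_CMAVO.match(text)
    if match and is_word_sequence(match.group(2)):
        return [match.group(1), match.group(2)]
    # The remainder is not a word sequence, so the text stays in one piece.
    return [text]


# Toy stand-in for a real validity check: only "klama" counts as a word.
def toy_is_word_sequence(rest):
    return rest in {"klama"}


print(split_leading_cmavo("iglu", toy_is_word_sequence))
print(split_leading_cmavo("ciblu", toy_is_word_sequence))
print(split_leading_cmavo("leklama", toy_is_word_sequence))
```

With this toy check, "iglu" and "ciblu" are treated alike: neither "glu"
nor "blu" is a word, so neither is broken, while "leklama" splits into
"le" + "klama".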
<<
2.2.3 Obscure case
Vlatai does not recognize as brivla some words for which I could find
no reason why they should not be valid brivla. The shortest possible
example is "adjdga". If someone knows why this is not a brivla, please
mail me! For the Vim syntax plugin this word is a valid brivla.
>>
The PEG morphology rejects it because "jdg" is not a valid initial cluster.
It only accepts non-initial clusters that consist of one consonant plus
a valid initial cluster.
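
The cluster rule can be sketched roughly as follows in Python. This is
only an illustration, not the PEG itself. The function names are
hypothetical, and it assumes that a cluster counts as "initial" when it
is a single consonant or when every adjacent pair in it is one of the 48
permissible initial pairs; the PEG has its own definition of an initial,
and other restrictions on consonant pairs are not checked here.

```python
import re

# The 48 permissible initial consonant pairs of Lojban.
INITIAL_PAIRS = frozenset("""
    bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr
    jb jd jg jm jv kl kr ml mr pl pr sf sk sl sm sn sp sr st
    tc tr ts vl vr xl xr zb zd zg zm zv
""".split())

CONSONANTS = "bcdfgjklmnprstvxz"


def is_initial_cluster(cluster):
    """ASSUMPTION: a single consonant, or a run of consonants in which
    every adjacent pair is a permissible initial pair."""
    if not cluster or any(c not in CONSONANTS for c in cluster):
        return False
    return all(cluster[i:i + 2] in INITIAL_PAIRS
               for i in range(len(cluster) - 1))


def medial_cluster_ok(cluster):
    """Accept a cluster that is itself initial, or that is one consonant
    followed by a valid initial cluster."""
    if is_initial_cluster(cluster):
        return True
    return len(cluster) >= 2 and is_initial_cluster(cluster[1:])


def clusters(word):
    """Return every run of two or more consonants in the word."""
    return re.findall("[" + CONSONANTS + "]{2,}", word)


for word in ("adjdga", "ciblu", "bandu"):
    for cluster in clusters(word):
        print(word, cluster, medial_cluster_ok(cluster))
```

For "adjdga" the cluster is "djdg": it is not initial as a whole, and
dropping the first consonant leaves "jdg", which fails because "dg" is
not an initial pair.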
mu'o mi'e xorxes