From: Jorge Llambías
To: lojban-list@lojban.org
Date: Tue, 20 Feb 2007 18:53:28 -0300
Subject: [lojban] Re: morphology paper announced

On 2/20/07, Cyril Slobin wrote:
>
> http://www.lojban.org/tiki/tiki-index.php?page=Morphology+analysis+programs+comparasion&bl
>
> Comments are solicited.

coi kir

Have you looked at ?

<<
2.2.2 Leading cmavo

There is no common agreement about breaking a potential brivla into leading cmavo and the rest. Published word breaking algorithm gives a set of patterns for breaking words, but it is unclear whether the rest of the word after cutting off a leading cmavo must be a valid word by itself.
>>

Yes, the rest must be one or more words, otherwise you cannot separate a cmavo.

<<
Brkwords program treats this as the fact of being a valid word for the resting part is irrelevant: for example, the word "iglu" breaks into cmavo "i" plus resting "glu" and therefore is not a valid fu'ivla (the fact that "glu" is not a valid word by itself is irrelevant). On the other hand, vlatai insists that "iglu" is valid word, *because* "glu" is not a valid word and therefore "iglu" is not breakable. The Vim syntax plugin follows the first approach (brkwords compatible) by default, but can be coerced into vlatai-compatible mode by setting a flag variable.
>>

{.iglu} is no different from {ciblu}. If you break it into {.i} + {glu} then you would also break {ciblu} into {ci} + {blu}.

<<
2.2.3 Obscure case

Vlatai does not recognize as brivla some words that I failed to find any reason not to be a valid brivla. The shortest possible example is "adjdga". If someone knows why this is not a brivla, mail me please! For the Vim syntax plugin this word is a valid brivla.
>>

The PEG morphology rejects it because "jdg" is not a valid initial cluster. It only accepts non-initial clusters that consist of one consonant plus a valid initial cluster, so the cluster "djdg" in "adjdga" would have to be "d" plus the initial cluster "jdg".

mu'o mi'e xorxes
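
P.S. Here is a minimal Python sketch of the cluster rule described above, in case it helps to see it spelled out. It is not the PEG grammar itself. It assumes the table of permissible initial consonant pairs from the reference grammar, and it assumes that a three-consonant initial cluster is valid when both of its adjacent pairs are in that table. The function names are made up for illustration. It checks only this one rule and ignores everything else (forbidden consonant pairs, syllabic consonants, stress, word-final clusters), so it should only be used on candidates that end in a vowel.

```python
import re

# Permissible initial consonant pairs (assumed from the reference grammar).
INITIAL_PAIRS = set("""
bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr
jb jd jg jm jv kl kr ml mr pl pr sf sk sl sm sn sp sr st
tc tr ts vl vr xl xr zb zd zg zm zv
""".split())

# A maximal run of Lojban consonants.
CONSONANT_RUN = re.compile(r"[bcdfgjklmnprstvxz]+")


def is_valid_initial(cluster):
    """True if `cluster` could start a word under the assumptions above."""
    if not cluster or len(cluster) > 3:
        return False
    if len(cluster) == 1:
        return True
    # Every adjacent pair must be a permissible initial pair.
    return all(cluster[i:i + 2] in INITIAL_PAIRS
               for i in range(len(cluster) - 1))


def medial_cluster_ok(cluster):
    """The rule from the message: one consonant plus a valid initial cluster."""
    return len(cluster) >= 2 and is_valid_initial(cluster[1:])


def rejected_clusters(word):
    """Return the non-initial consonant clusters of `word` that fail the rule."""
    word = word.strip(".").lower()
    return [m.group() for m in CONSONANT_RUN.finditer(word)
            if m.start() > 0              # skip a word-initial cluster
            and len(m.group()) >= 2       # single consonants are not clusters
            and not medial_cluster_ok(m.group())]


print(rejected_clusters("adjdga"))  # ['djdg'], since "jdg" is not a valid initial
print(rejected_clusters("ciblu"))   # []
print(rejected_clusters(".iglu"))   # []
```

With these assumptions, {.iglu} and {ciblu} pass in the same way ("g" + "l", "b" + "l"), while "adjdga" fails on "djdg".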