From: Jorge Llambías
To: lojban-list@lojban.org
Date: Thu, 22 Feb 2007 13:04:50 -0300
Subject: [lojban] Re: morphology paper announced

On 2/22/07, Cyril Slobin wrote:
> I think the idea behind it is that the recognition algorithm must be as
> easy as possible and must not require a complex analysis.

That would be wonderful. Unfortunately, the tosmabru and slinku'i tests make this almost impossible. You can't recognize tosmabru and slinku'i failures without some relatively complex analysis.

> You argue about {iglu}
> because you SEE that {glu} is not a brivla.

If you think of {.} as an ordinary consonant, {.iglu} is just like a gismu.

> Nobody mistakes a one-syllable
> piece of text for a word. What about {adjgadja}? Do you still SEE that
> {djgadja} is not a brivla?

Yes, immediately, because for me {djg} is not a valid initial cluster. But let's consider {.asprapra} instead. This is harder to figure out, because {spr} is a valid initial, but then {sprapra} fails slinku'i.

If you think of {.} as an ordinary consonant, {.asprapra} is just like a lujvo {pasprapra}, and {.adjgadja} is just like a lujvo {padjgadja}. Detecting that the {.a} in {.asprapra} doesn't break off is no harder than detecting that the {pa} in {pasprapra} doesn't break off.

> Your suggestion is to cut off an initial cmavo
> and then recursively apply the word recognition algorithm to the rest.

That's unavoidable. How else do you detect tosmabru failure?
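
To make the point about treating {.} as an ordinary consonant concrete, here is a minimal Python sketch. It only compares consonant/vowel shapes. It is not brkwords and not a real morphology algorithm: the names cv_shape and has_gismu_shape are made up for this illustration, it does not check which consonant clusters are permissible, and it ignores {y} and {'} entirely.

```python
# Illustration only: compare C/V shapes with "." counted as a consonant.
# Assumption: the helper names below are hypothetical, not from brkwords.

VOWELS = frozenset("aeiou")
# The 17 Lojban consonants, plus "." treated as one more consonant.
CONSONANTS = frozenset("bcdfgjklmnprstvxz.")
# The two consonant/vowel shapes a gismu can have.
GISMU_SHAPES = frozenset({"CVCCV", "CCVCV"})


def cv_shape(text: str) -> str:
    """Return the C/V shape of text, counting '.' as a consonant."""
    shape = []
    for ch in text.lower():
        if ch in VOWELS:
            shape.append("V")
        elif ch in CONSONANTS:
            shape.append("C")
        else:
            # y, ' and anything else are outside this sketch.
            raise ValueError(f"character not handled: {ch!r}")
    return "".join(shape)


def has_gismu_shape(text: str) -> bool:
    """True if text has a gismu's C/V shape (clusters are NOT validated)."""
    return cv_shape(text) in GISMU_SHAPES


if __name__ == "__main__":
    print(cv_shape(".iglu"))                             # CVCCV
    print(has_gismu_shape(".iglu"))                      # True
    print(cv_shape(".asprapra") == cv_shape("pasprapra"))  # True
    print(cv_shape(".adjgadja") == cv_shape("padjgadja"))  # True
```

With {.} counted this way, {.iglu} has the same shape as a gismu, and {.asprapra} and {.adjgadja} have the same shapes as {pasprapra} and {padjgadja}, so whatever analysis decides that {pa} doesn't break off can be applied unchanged to {.a}.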
> On the other hand, my approach is
> to apply a short sequence of simple patterns (specified in brkwords).

I haven't studied brkwords very carefully, but as I pointed out, brkwords says the string {.iglu} contains a brivla, doesn't it?

mu'o mi'e xorxes