From: "Cyril Slobin"
Date: Thu, 22 Feb 2007 18:39:56 +0300
To: lojban-list@lojban.org
Subject: [lojban] Re: morphology paper announced

coi xorxes

> the idea of fu'ivla is that any string that follows the phonotactics
> of brivla and cannot be confused with cmavo and/or lujvo is a valid
> fu'ivla form. If the brkwords algorithm makes {.iglu} invalid, then
> there is a problem with the algorithm, not with {.iglu}, which cannot
> break into cmavo and/or lujvo.

I think the idea behind it is that the recognition algorithm must be as simple as possible and must not require complex analysis. You argue for {iglu} because you SEE that {glu} is not a brivla. Nobody mistakes a one-syllable piece of text for a word. What about {adjgadja}? Do you still SEE that {djgadja} is not a brivla?

Your suggestion is to cut off an initial cmavo and then recursively apply the word recognition algorithm to the rest. (Well, I'm lying. Not recursively: some simplified version of the algorithm is enough. In particular, the procedure in question, cutting off a cmavo and testing the rest, doesn't apply to the remainder, so there is no recursion. Nevertheless, the part that does apply is still complex enough and includes such non-obvious beasts as the slinku'i test.) On the other hand, my approach is to apply a short sequence of simple patterns (specified in brkwords).
Just for an illustration: in my program my approach is implemented in two lines of code, while yours requires six (and four more auxiliary ones, but those reflect poor design rather than a requirement of the task, so I don't count them).

> > There are only two potentially infinite initial cluster forms:
> > (d[jz])+ and (t[cs])+, but who has decided to forbid them?
>
> I did, on the grounds that it makes no sense to allow initial
> "tctctctc" while disallowing "sb". I think disallowing the affricates
> from participating in further clusters makes sense.

Oh, I'm just whining... I've spent hours writing down an explicit syntax for initial clusters, and you suggest I throw it out. Cruel, cruel you!

> Since the onset cannot be empty,

Oops! I've missed this. Thank you!

-- 
Cyril Slobin
`When I use a word,' Humpty Dumpty said,
`it means just what I choose it to mean'
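
The cluster question discussed above can be made concrete with a small sketch. It assumes a permissive rule under which an initial cluster is acceptable whenever every adjacent pair of consonants is one of the 48 permissible initial pairs, and it assumes one possible reading of the proposed restriction, namely that an affricate (dj, dz, tc, ts) may only stand as a two-consonant cluster on its own. The names INITIAL_PAIRS, AFFRICATES, chain_ok and restricted_ok are hypothetical; none of this is taken from brkwords or from either program mentioned in the message.

```python
import re

# The 48 permissible initial consonant pairs of Lojban.
INITIAL_PAIRS = set("""
    bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr
    jb jd jg jm jv kl kr ml mr pl pr sf sk sl sm sn sp sr st
    tc tr ts vl vr xl xr zb zd zg zm zv
""".split())

# Pairs treated as affricates in the discussion.
AFFRICATES = {"dj", "dz", "tc", "ts"}

# The two "potentially infinite" forms quoted in the message.
INFINITE_FORMS = re.compile(r"(d[jz])+|(t[cs])+")


def pairs(cluster):
    """Return all adjacent two-letter slices of a consonant string."""
    return [cluster[i:i + 2] for i in range(len(cluster) - 1)]


def chain_ok(cluster):
    """Assumed permissive rule: every adjacent pair is a permissible initial pair."""
    return len(cluster) >= 2 and all(p in INITIAL_PAIRS for p in pairs(cluster))


def restricted_ok(cluster):
    """Assumed reading of the restriction: an affricate cannot be part of a longer cluster."""
    if not chain_ok(cluster):
        return False
    return len(cluster) == 2 or not any(p in AFFRICATES for p in pairs(cluster))


if __name__ == "__main__":
    for c in ["tctctctc", "sb", "djg", "dj", "str"]:
        print(c, chain_ok(c), restricted_ok(c))
```

Under these assumptions, "tctctctc" passes the permissive check because the pairs tc and ct are both permissible, while "sb" fails it. The restricted check rejects "tctctctc" and also "djg", the cluster that begins {djgadja}, but still accepts "dj" and "str".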