Date: Tue, 11 Jan 2011 09:49:36 -0700
From: ".alyn.post."
To: Lojban List
Subject: [lojban] compound cmavo classification in cmavo.txt

The following file:

  http://www.lojban.org/publications/wordlists/cmavo.txt

is a list of cmavo. I believe it is the canonical list; please correct me if that is a misunderstanding.

This file includes compound cmavo, like "le go'i", but lists only a single selma'o for each, even when the compound consists of cmavo from more than one selma'o.

I've loaded all of the entries in cmavo.txt into the parser and categorized the results I get when comparing what the parser says to what cmavo.txt says. Most entries make sense, but some don't.

Here are the patterns from the parser -> cmavo.txt. cmavo.txt has only a single entry, whereas the parser classifies each cmavo individually. '?' is a free variable, and is equal on both sides of the production. That means that for a single cmavo the production '? -> ?' should (and does) hold: the parser is consistent with the cmavo.txt file.

  parser      -> cmavo.txt
  ? BU        -> BY    ; letteral conversion, an artifact of my parser.
  FEhE ?      -> ?     ; with FEhE, cmavo.txt uses second selma'o.
  FEhE PA ?   -> ?
  I ?         -> ?     ; I prefix is ignored.
  I NA ?      -> ?     ; and so is negation.
  I ? NAI     -> ?
  JAI VA      -> SE    ; Why?
  JAI BAI     -> SE    ; Why?
  JAI PU      -> SE    ; Why?
  LAhE ?      -> ?     ; cmavo.txt uses the second selma'o here.
  LE GOhA     -> KOhA  ; Why?
  LE SE GOhA  -> KOhA  ; Why?
  MOhI ?      -> ?     ; cmavo.txt uses the second selma'o here.
  NA ?        -> ?     ; ignore negation.
  NAhE ?      -> ?     ; ignore negation.
  PA+ ?       -> ?     ; ignore quantifier.
  PU ZAhO     -> ZAhO  ; "PU ZAhO" is ZAhO, "PU !ZAhO ?" is PU.
  SE ?        -> ?     ; ignore conversion prefix
  SE ? KOhA   -> ?     ; ignore conversion prefix
  SE ? NAI    -> ?     ; ignore conversion prefix, negation.
  ? _ _       -> ?     ; everything else matches the first selma'o.
  ? _         -> ?     ; everything else matches the first selma'o.
  ?           -> ?     ; if there is only one cmavo, we're consistent.

I particularly question these productions:

  JAI VA      -> SE
  JAI BAI     -> SE
  JAI PU      -> SE
  LE GOhA     -> KOhA
  LE SE GOhA  -> KOhA

because I don't think there is any grammatical way in which a compound cmavo becomes a different, single cmavo, save for BU converting it and its prefix into BY.
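
As a reading aid, the pattern table can be treated as an ordered list of rewrite rules over the per-cmavo selma'o sequence the parser reports. What follows is only a minimal Python sketch of that reading, not the code in cmavo.scm. The names RULES, predict and _match are hypothetical, and so is the assumption that rules are tried top to bottom with the first match winning (which is what lets "PU ZAhO" take priority over the catch-all "? _").

```python
# Ordered rules transcribed from the table: (parser pattern, cmavo.txt class).
# '?' binds one selma'o, '_' matches any one selma'o, 'PA+' is one or more PA.
RULES = [
    ("? BU", "BY"), ("FEhE ?", "?"), ("FEhE PA ?", "?"),
    ("I ?", "?"), ("I NA ?", "?"), ("I ? NAI", "?"),
    ("JAI VA", "SE"), ("JAI BAI", "SE"), ("JAI PU", "SE"),
    ("LAhE ?", "?"), ("LE GOhA", "KOhA"), ("LE SE GOhA", "KOhA"),
    ("MOhI ?", "?"), ("NA ?", "?"), ("NAhE ?", "?"), ("PA+ ?", "?"),
    ("PU ZAhO", "ZAhO"), ("SE ?", "?"), ("SE ? KOhA", "?"),
    ("SE ? NAI", "?"), ("? _ _", "?"), ("? _", "?"), ("?", "?"),
]

_NO_MATCH = object()  # sentinel: the pattern does not cover the tokens exactly


def _match(pat, toks, bound=None):
    """Return the selma'o bound to '?' (None if the pattern has no '?'),
    or _NO_MATCH if pat does not match the whole token list."""
    if not pat:
        return bound if not toks else _NO_MATCH
    if not toks:
        return _NO_MATCH
    head, rest = pat[0], pat[1:]
    if head == "?":
        return _match(rest, toks[1:], toks[0])
    if head == "_":
        return _match(rest, toks[1:], bound)
    if head.endswith("+"):
        # One or more repetitions; try the shortest run first, then backtrack.
        name, i = head[:-1], 0
        while i < len(toks) and toks[i] == name:
            i += 1
            result = _match(rest, toks[i:], bound)
            if result is not _NO_MATCH:
                return result
        return _NO_MATCH
    return _match(rest, toks[1:], bound) if head == toks[0] else _NO_MATCH


def predict(selmaho_sequence):
    """Predict the single class cmavo.txt lists for a parser-side sequence."""
    toks = selmaho_sequence.split()
    for pattern, result in RULES:
        bound = _match(pattern.split(), toks)
        if bound is not _NO_MATCH:
            return bound if result == "?" else result
    raise ValueError("no rule covers: " + selmaho_sequence)


for seq in ("JAI VA", "LE SE GOhA", "PU ZAhO", "PU PU", "PA PA KOhA"):
    print(seq, "->", predict(seq))
```

Sequences the table does not cover (for example, four selma'o with no PA run) raise ValueError rather than guessing a class.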

I believe the conversions I question above are grammatically equivalent (I haven't confirmed them all), but that doesn't change their selma'o class, does it? Is cmavo.txt in error here?

I also wonder about the consequence of this pattern:

  PU ZAhO     -> ZAhO

because it is the only PU-prefixed class that behaves this way, the other compound cmavo being in selma'o PU.

I can provide the actual lines in cmavo.txt for these patterns; please ask. I think looking at the overall classification is a better demonstration of the question, though some of these categories have only a single entry in cmavo.txt, and any actual errors need to be confirmed case-by-case.

-Alan

PS: The source code on which this e-mail is based can be found here:

  http://bugs.call-cc.org/browser/release/4/jbogenturfahi/trunk/tests/cmavo.scm

-- 
.i ko djuno fi le do sevzi