Date: Sat, 24 Apr 2010 22:43:17 +0000
From: Minimiscience
To: lojban@googlegroups.com
Subject: Re: [lojban] scoreGismu
Message-ID: <20100424224315.GA24822@sdf.lonestar.org>
References: <201004241526.17049.phma@phma.optus.nu>

On 24 April 2010, tijlan wrote:
> It doesn't always score with the same result as for an official gismu. For
> instance, the gismu with the highest score of 0.4767... in regard to the
> above source words is "mlino" according to this program, not "pilno", even
> with the older weighting:
>
> iun 0.36 emploi 0.21 upiog 0.16 us 0.11 primin 0.09 amal 0.07

That's because the script doesn't implement step 2b of the {gismu} creation
algorithm correctly. The current test uses /$lcs[0].?$lcs[1]/ on each word
separately, so it also awards the score when the two letters are adjacent in
one word but separated by a letter in the other. This can be fixed by
changing lines 114-118 of the script from:

> if($gismu =~ /$lcs[0].?$lcs[1]/) {
>     if($word =~ /$lcs[0].?$lcs[1]/) {
>         $score += $weight * 2 / @wordSequence;
>     }
> }

to:

> # Score only when the two letters are arranged the same way in both
> # words: adjacent in both, or separated by exactly one letter in both.
> $score += $weight * 2 / @wordSequence
>     if $gismu =~ /$lcs[0]$lcs[1]/ && $word =~ /$lcs[0]$lcs[1]/
>     || $gismu =~ /$lcs[0].$lcs[1]/ && $word =~ /$lcs[0].$lcs[1]/;

You'll also need to put an empty file in place of gismu.txt so that extant
{gismu} (and anything they conflict with) are considered for scoring.

mu'omi'e .kamymecraijun.

-- 
mi na se finti fi tu'a lo vi munje