Date: Fri, 18 Sep 2009 18:35:09 -0300
Subject: Re: Lojban Certification Program
From: Jorge Llambías
To: lojban-lbck@googlegroups.com

On Fri, Sep 18, 2009 at 2:30 PM, Matt Arnold wrote:
>
> I think the question is whether to use the most common 500 words, or
> weight it in favor of cmavo.

In favor or against cmavo? I think it is cmavo that are
overrepresented in the initial segment. In the first 100 words there
are only 18 gismu. It's pretty hard to construct sentences which use
82 cmavo but are constrained to only 18 gismu.

> I still think 500 is too many. How many of you agree?

These are the top 50 cmavo from
http://teddyb.org/~rlpowell/hobbies/lojban/flashcards/cmavo_freq

  le 11208    .i 7438     mi 3324     cu 3253     nu 3034
  do 2470     la 2319     se 2057     lo 2034     lu 1944
  li'u 1933   coi 1669    na 1398     be 1199     gi'e 1161
  sei 1154    ca 1098     ro 967      ma 751      go'i 749
  noi 725     ku'i 644    nai 640     fi 633      lei 631
  kei 624     da 614      .a 613      du'u 568    xu 567
  pu 561      ko 542      bu 528      .e 525      ka 522
  ba 516      je 506      loi 487     zo 463      doi 449
  poi 447     je'e 380    te 374      di'u 367    no 365
  pa 361      bo 345      pe 340      vi 337      co'a 336

But we probably need to do some fiddling. For example, "no" and "pa"
are the only numbers that made it to the top 50, but I think all
numbers should be tested in the first level. The only FA that made it
is "fi".
It's reasonable that "fi" is the most frequent, but fa/fe/fi/fo/fu are
learned together and should be tested together, so if "fi" is included
they should all be (they might be left for the second level).
Similarly for se/te/ve/xe.

Some of them I think we can safely exclude, like "sei", which is there
because of the frequent "sei X cusku", especially in the Alice
translation. Also lu-li'u maybe need not be included. (But I would
include "zo", especially if we include "cmene". We can't use "cmene"
without "zo".)

Mark's proposed list also has about 50 cmavo by my count, and it has
much overlap with the above list, as expected, but also some
differences:

<<
lo, la, cu, mi, do, ti, ta, tu, and some other KOhA, nu, ka, ni, all
of SE, ca, pu, ba, NOI, GOI, .i, A,... ku, kei and when they're
needed, and cu as mentioned above.
A small selection of UI/CAI and COI (and DOI)
Numerals no-so and base-10 construction, perhaps also ro.
>>

I think some 50 cmavo is about right for the first level. Then there
should be some cmevla, not too many, but in any case cmevla are easy
as they don't need to be memorized, just recognized, and they are one
of the first things people learn anyway, so I don't think we need to
worry about how many of them we include. And then some reasonable
number of gismu that allow us to write meaningful sentences.

These are the top 50 gismu from
http://teddyb.org/~rlpowell/hobbies/lojban/flashcards/gismu_freq

  cusku 1295   mutce 388    klama 305    zvati 287
  cmalu 277    tavla 250    viska 241    drata 236
  djuno 219    pensi 219    catlu 217    nelci 202
  barda 200    djica 197    gunka 193    cliva 190
  pilno 171    cmene 168    jimpe 166    prenu 164
  troci 151    xamgu 146    kumfa 143    citka 136
  valsi 136    tirna 129    sutra 127    zdani 126
  facki 125    ciska 124    stedu 124    pluta 123
  nenri 122    cizra 120    ractu 119    simlu 118
  xruti 118    drani 116    jitfa 111    voksa 111
  dukse 109    krixa 109    tsali 109    jundi 108

Again we will probably need to make adjustments, but we won't know
which ones until we start producing the questions.
We could start with that list and then add/subtract words as needed.

I would not include fu'ivla in the first level. A few lujvo perhaps
yes, but unfortunately I can't open the lujvo frequency list to get
some idea what the most frequent are. Probably things with sel-, nun-,
-gau, and such.

mu'o mi'e xorxes
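
As a rough illustration of the fiddling described above, the Python sketch below starts from the top-50 cmavo list quoted in this message, drops "sei", and pulls in the remaining members of any group that is already partly present (fa/fe/fi/fo/fu, se/te/ve/xe, and the digits no through so). The names `FAMILIES` and `adjust` are made up for this sketch, and which groups and exclusions to use is exactly what is still under discussion, so treat it as one possible starting point rather than a proposal.

```python
# Top-50 cmavo with counts, copied from the frequency list quoted above.
RAW = """
le 11208 .i 7438 mi 3324 cu 3253 nu 3034 do 2470 la 2319 se 2057
lo 2034 lu 1944 li'u 1933 coi 1669 na 1398 be 1199 gi'e 1161 sei 1154
ca 1098 ro 967 ma 751 go'i 749 noi 725 ku'i 644 nai 640 fi 633 lei 631
kei 624 da 614 .a 613 du'u 568 xu 567 pu 561 ko 542 bu 528 .e 525
ka 522 ba 516 je 506 loi 487 zo 463 doi 449 poi 447 je'e 380 te 374
di'u 367 no 365 pa 361 bo 345 pe 340 vi 337 co'a 336
"""

tokens = RAW.split()
words = tokens[0::2]  # every other token is a word; the rest are counts

# Hypothetical grouping (an assumption of this sketch): words that are
# learned together and so should be tested together.
FAMILIES = {
    "FA": ["fa", "fe", "fi", "fo", "fu"],
    "SE": ["se", "te", "ve", "xe"],
    "digits": ["no", "pa", "re", "ci", "vo", "mu", "xa", "ze", "bi", "so"],
}


def adjust(words, families, exclude):
    """Drop excluded words, then complete any family that is partly present."""
    result = [w for w in words if w not in exclude]
    seen = set(result)
    for members in families.values():
        if seen.intersection(members):
            for member in members:
                if member not in seen:
                    result.append(member)
                    seen.add(member)
    return result


adjusted = adjust(words, FAMILIES, exclude={"sei"})
print(len(words), len(adjusted))
print(adjusted)
```

Passing `exclude={"sei", "lu", "li'u"}` would also drop the quotation pair, if we decide it need not be included.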