
Re: Lojban Certification Program



Oops, I keep doing that. Sorry. I meant weighted against cmavo.

I agree about including those related sets in the same level. I would
like to see:

no/pa/re/ci/vo/mu/xa/ze/bi/so in level 1
fi/fe/fa/fo/fu in level 2
se/te/ve/xe in level 2
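
A rough sketch of how that rule could be applied mechanically to a
frequency list: walk the top N words, and whenever one belongs to a
related set, place the whole set in its assigned level. The names
assign_levels, FREQ, and GROUPS are made up for illustration, and the
counts are just a handful copied from the cmavo list quoted below.

```python
# Illustrative sketch only: assign_levels, FREQ and GROUPS are hypothetical
# names, not part of any existing Lojban tool.

# A few (word, count) pairs copied from the quoted cmavo frequency list.
FREQ = [
    ("le", 11208), (".i", 7438), ("se", 2057), ("fi", 633),
    ("te", 374), ("no", 365), ("pa", 361),
]

# Related sets and the level proposed for each in this message.
GROUPS = {
    "digits": (["no", "pa", "re", "ci", "vo", "mu", "xa", "ze", "bi", "so"], 1),
    "FA": (["fa", "fe", "fi", "fo", "fu"], 2),
    "SE": (["se", "te", "ve", "xe"], 2),
}


def assign_levels(freq, groups, top_n):
    """Return {level: [words]} for the top_n most frequent words.

    Ungrouped words go to level 1. The first time any member of a related
    set appears, the whole set is placed in that set's assigned level.
    """
    member_of = {w: name for name, (words, _) in groups.items() for w in words}
    ranked = sorted(freq, key=lambda pair: -pair[1])[:top_n]
    levels = {}
    placed = set()
    for word, _count in ranked:
        name = member_of.get(word)
        if name is None:
            levels.setdefault(1, []).append(word)
        elif name not in placed:
            words, level = groups[name]
            levels.setdefault(level, []).extend(words)
            placed.add(name)
    return levels


if __name__ == "__main__":
    for level, words in sorted(assign_levels(FREQ, GROUPS, top_n=7).items()):
        print(level, " ".join(words))
```

Hand adjustments (dropping "sei", keeping "zo", and so on) would still
have to happen after a pass like this.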

-Matt


2009/9/18 Jorge Llambías <jjllambias@gmail.com>:
>
> On Fri, Sep 18, 2009 at 2:30 PM, Matt Arnold <matt.mattarn@gmail.com> wrote:
>>
>> I think the question is whether to use the most common 500 words, or
>> weight it in favor of cmavo.
>
> In favor or against cmavo? I think it is cmavo that are
> overrepresented in the initial segment. In the first 100 words there
> are only 18 gismu. It's pretty hard to construct sentences which use
> 82 cmavo but are constrained to only 18 gismu.
>
>> I still think 500 is too many. How many of you agree?
>
> These are the top 50 cmavo from
> http://teddyb.org/~rlpowell/hobbies/lojban/flashcards/cmavo_freq
>
> le    11208
> .i     7438
> mi     3324
> cu     3253
> nu     3034
> do     2470
> la     2319
> se     2057
> lo     2034
> lu     1944
> li'u   1933
> coi    1669
> na     1398
> be     1199
> gi'e   1161
> sei    1154
> ca     1098
> ro      967
> ma      751
> go'i    749
> noi     725
> ku'i    644
> nai     640
> fi      633
> lei     631
> kei     624
> da      614
> .a      613
> du'u    568
> xu      567
> pu      561
> ko      542
> bu      528
> .e      525
> ka      522
> ba      516
> je      506
> loi     487
> zo      463
> doi     449
> poi     447
> je'e    380
> te      374
> di'u    367
> no      365
> pa      361
> bo      345
> pe      340
> vi      337
> co'a    336
>
> But we probably need to do some fiddling. For example, "no" and "pa"
> are the only numbers that made it to the top 50, but I think all
> numbers should be tested in the first level. The only FA that made it
> is "fi". It's reasonable that "fi" is the most frequent, but
> fa/fe/fi/fo/fu are learned together and should be tested together, so
> if "fi" is included they should all be (they might be left for the
> second level). Similarly for se/te/ve/xe. Some of them I think we can
> safely exclude, like "sei", which is there because of the frequent
> "sei X cusku" especially in the Alice translation. Also lu-li'u maybe
> need not be included. (But I would include "zo", especially if we
> include "cmene". We can't use "cmene" without "zo".)
>
> Mark's proposed list also has about 50 cmavo by my count, and it has
> much overlap with the above list, as expected, but also some
> differences:
> <<
> lo, la, cu, mi, do, ti, ta, tu, and some other KOhA, nu, ka, ni, all
> of SE, ca, pu, ba, NOI, GOI, .i, A,...
> ku, kei and when they're needed, and cu as mentioned above.
> A small selection of UI/CAI and COI (and DOI)
> Numerals no-so and base-10 construction, perhaps also ro.
> >>
>
> I think some 50 cmavo is about right for the first level. Then there
> should be some cmevla, not too many but in any case cmevla are easy as
> they don't need to be memorized, just recognized, and they are one of
> the first things people learn anyway, so I don't think we need to
> worry about how many of them we include. And then some reasonable
> number of gismu that allow us to write meaningful sentences.
>
> These are the top 50 gismu from
> http://teddyb.org/~rlpowell/hobbies/lojban/flashcards/gismu_freq
>
> cusku  1295
> mutce  388
> klama  305
> zvati  287
> cmalu  277
> tavla  250
> viska  241
> drata  236
> djuno  219
> pensi  219
> catlu  217
> nelci  202
> barda  200
> djica  197
> gunka  193
> cliva  190
> pilno  171
> cmene  168
> jimpe  166
> prenu  164
> troci  151
> xamgu  146
> kumfa  143
> citka  136
> valsi  136
> tirna  129
> sutra  127
> zdani  126
> facki  125
> ciska  124
> stedu  124
> pluta  123
> nenri  122
> cizra  120
> ractu  119
> simlu  118
> xruti  118
> drani  116
> jitfa  111
> voksa  111
> dukse  109
> krixa  109
> tsali  109
> jundi  108
>
> Again, we will probably need to make adjustments, but we won't know
> which ones until we start producing the questions. We could start with
> that list and then add/subtract words as needed.
>
> I would not include fu'ivla in the first level. A few lujvo perhaps
> yes, but unfortunately I can't open the lujvo frequency list to get
> some idea what the most frequent are. Probably things with sel-, nun-,
> -gau, and such.
>
> mu'o mi'e xorxes
>