
Re: [lojban] Historical "finprims" gismu algorithm weights and scores



On 3/6/2014 8:58 PM, Robert LeChevalier wrote:
Update: I have two "final" versions of the program, in source and
executable, but cannot recall what the difference is.  The first was
almost certainly used for all the 1987 prim runs, while we may have used
the second one for the words added later.

The two programs differ in only a couple of lines. Most Chinese source words were 2-3 letters long, while Russian, at the other extreme, often had much longer words, possibly as many as 10 characters. So we tried normalizing all inputs as if they were 5 characters long: a Chinese 2/2 character match would get the much lower weight of 2/5, and a 10-character Russian word with 5 matching characters would get 5/5 rather than 5/10. I don't recall whether we actually used these altered weightings or just ran trials to see the difference. If we did use them, the effect would show up in the words made after 1987, but I might not have noted it in Finprims.
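
To make the arithmetic concrete, here is a rough Python sketch of the two weightings as described above. It is not the original program's code: the function names, the constant name, and the cap at 1.0 for words with more than 5 matching characters are all assumptions made for illustration.

```python
# Hypothetical illustration only; names and the 1.0 cap are assumptions,
# not taken from the historical gismu-making program.

NOMINAL_LENGTH = 5  # inputs treated "as if they were 5 characters long"


def raw_weight(matched: int, source_length: int) -> float:
    """Unnormalized weighting: matched characters over the actual word length."""
    return matched / source_length


def normalized_weight(matched: int, nominal_length: int = NOMINAL_LENGTH) -> float:
    """Normalized weighting: matched characters over a fixed nominal length.

    Assumption: the result is capped at 1.0 when more than
    `nominal_length` characters match.
    """
    return min(matched, nominal_length) / nominal_length


# Chinese example from the message: 2 of 2 characters match.
chinese_raw = raw_weight(2, 2)          # 2/2
chinese_norm = normalized_weight(2)     # 2/5

# Russian example from the message: 5 of 10 characters match.
russian_raw = raw_weight(5, 10)         # 5/10
russian_norm = normalized_weight(5)     # 5/5

print(chinese_raw, chinese_norm, russian_raw, russian_norm)
```

Under this reading, the normalization lowers the score of short, fully matched words and raises the score of long, partially matched ones, which is the trade-off the trial runs were meant to examine.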

We tried other experiments to improve the results, but I haven't found them.

lojbab
