Behold the corpus app (was Re: [lojban] HISTORIAN: What's up with this file?)
On Tue, Aug 30, 2011 at 09:20:15AM -0400, Robert LeChevalier wrote:
> Jorge Llambías wrote:
> >On Tue, Aug 30, 2011 at 4:18 AM, Robin Lee Powell
> ><rlpowell@digitalkingdom.org> wrote:
> >
> >>Attached you'll find two gismu lists that appear to differ only
> >>in the weird columns in the middle; for example, in one (the
> >>normal gismu list that we've been providing all these years)
> >>bacru has "1h 386", attached as gu2. The other, which I found
> >>in a weird spot on the web server as gismu_updated.txt and attached
> >>as gu1, has bacru with "1h 405".
> >>
> >>Any idea what's going on there? gismu_updated.txt seems to have
> >>been last touched in 2002, and might easily be something I
> >>hacked together or something? I dunno.
> >
> >
> >I seem to remember those numbers were frequencies, so they
> >probably correspond to an updated corpus.
>
> That is correct. I don't recall what the reason for doing it was,
> but the gu1 list seems to be a later set of counts.
>
> It would seem likely that the usage since 2002 probably dwarfs
> what went before, so someone might want to generate new counts
> based on the corpora online, but I suggest in future that they be
> a separate file from the official gismu list (in a different
> format, so this doesn't come up again)
http://www.lojban.org/cgi-bin/corpus/
That is one of the best things that anyone has done around here at my
request. Take a bow, purpleposeidon. :)
From there, scripting a frequency counter (as I have) is pretty
trivial.
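
For anyone who wants to try, here is a minimal sketch of such a
counter. It is not the script I actually use. It assumes the corpus
text has already been saved locally as plain text (nothing here talks
to the corpus app itself), and the names count_gismu and known_gismu
are made up for illustration. Without a word list it only checks the
CVCCV / CCVCV shape, so it will also count any non-gismu that happens
to have that shape.

```python
import re
from collections import Counter

# Lojban vowels and consonants that can appear in a gismu.
V = "aeiou"
C = "bcdfgjklmnprstvxz"

# Shape check only: CVCCV or CCVCV. This does NOT validate which
# consonant clusters are permitted, so it is an approximation.
GISMU_SHAPE = re.compile(
    rf"^(?:[{C}][{V}][{C}][{C}][{V}]|[{C}][{C}][{V}][{C}][{V}])$"
)


def count_gismu(text, known_gismu=None):
    """Count gismu occurrences in text.

    If known_gismu (a set of words, e.g. read from the gismu list) is
    given, only those words are counted; otherwise anything matching
    the gismu shape is counted.
    """
    # Split on anything that is not a letter or apostrophe, so "." and
    # "," separators are dropped.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for word in words:
        if known_gismu is not None:
            if word in known_gismu:
                counts[word] += 1
        elif GISMU_SHAPE.match(word):
            counts[word] += 1
    return counts


if __name__ == "__main__":
    sample = "mi bacru .i do bacru .i ti morsi"
    for word, n in count_gismu(sample).most_common():
        print(word, n)
```

Passing the official gismu list as known_gismu gives cleaner counts
than the shape check alone, and writing the results to their own file
keeps them out of the gismu list, as suggested above.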
-Robin
--
http://singinst.org/ : Our last, best hope for a fantastic future.
Lojban (http://www.lojban.org/): The language in which "this parrot
is dead" is "ti poi spitaki cu morsi", but "this sentence is false"
is "na nei". My personal page: http://www.digitalkingdom.org/rlp/