Re: [lojban] lojban.org transfer, reprise.



On Fri, 15 Mar 2002, Jim Carter wrote:

> On Thu, 14 Mar 2002, Jay Kominek wrote:
> > Just a heads up, mail archiving software sucks.
>
> Glimpse produces indices that are about 10% the size of the indexed corpus.

Hmm. I had some other complaint with Glimpse. Wish I could remember it
now.

> I think htdig has similar characteristics, but I only tried it out once as
> a demo.  It's easy to set up, but it's oriented to a corpus that's already
> directly web accessible, vs. files to be spit out by a CGI script.

I use htDig for the old list archives.

jkominek@balance ~/public_html/lojban $ du -sh db*
42M     db.docdb
1.3M    db.docs.index
64M     db.wordlist
58M     db.words.db
=
165.3MB

Whereas the data being indexed is 62.3MB. (Sorry, misremembered.)
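
To put that next to the roughly 10% figure quoted above for Glimpse, here is a rough sketch of the arithmetic. It assumes the du figures are megabytes and only as precise as du's rounding; the variable names are just for illustration.

```python
# Sizes in MB as reported by `du -sh db*` above (du rounds, so approximate).
index_sizes_mb = {
    "db.docdb": 42.0,
    "db.docs.index": 1.3,
    "db.wordlist": 64.0,
    "db.words.db": 58.0,
}
corpus_mb = 62.3  # size of the data being indexed

# Total size of the htDig database files, and how it compares to the corpus.
total_index_mb = sum(index_sizes_mb.values())
ratio = total_index_mb / corpus_mb

print(f"index total: {total_index_mb:.1f} MB")
print(f"index/corpus: {ratio:.2f}x ({ratio:.0%} of the corpus)")
```

So the htDig databases here come to about 2.65 times the size of the corpus, versus about a tenth of it for the Glimpse figure quoted above.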


It'd be sort of nice to have access to something like Thunderstone Texis
for this, even though I hear it is a complete pain.

- Jay Kominek <jay.kominek@colorado.edu>
  Plus ça change, plus c'est la même chose