Received: from VMS.DC.LSOFT.COM (vms.dc.lsoft.com [205.186.43.2])
    by locke.ccil.org (8.6.9/8.6.10) with ESMTP id RAA07023 for ;
    Tue, 27 Feb 1996 17:07:09 -0500
Message-Id: <199602272207.RAA07023@locke.ccil.org>
Received: from PEACH.EASE.LSOFT.COM (205.186.43.4)
    by VMS.DC.LSOFT.COM (LSMTP for OpenVMS v1.0a) with SMTP id ED77F7A3 ;
    Tue, 27 Feb 1996 16:05:50 -0500
Date: Mon, 26 Feb 1996 21:30:51 +0200
Reply-To: Veijo Vilva
Sender: Lojban list
From: Veijo Vilva
Subject: ADM: Test database of Lojban List postings
X-To: lojban@cuvmb.cc.columbia.edu
To: John Cowan
X-Mozilla-Status: 0001
Content-Length: 1354
X-From-Space-Date: Wed Feb 28 13:57:36 1996
X-From-Space-Address: -

I have built a searchable database of the nearly 4000 postings to the
Lojban List between Sep 1994 and Feb 1996, some 10Mb. The postings are
not threaded and the HTML format indexes are prohibitively large (1995
is 315kb, don't use Netscape with less than 32Mb :-).

Keyword and logical searches are OK, unless you try for all postings
by Jorge: the resulting index may cause a crash or lock-up on lesser
systems.

I'm using Glimpse as a search engine. The full text searches are not
lightning fast: 'Derzhanski' took about 23s, 'veion' about 45s. Actually
the search itself is quite fast, of the order of 1-2s, but the
preparation of the output seems to take lots of time when there are
many matching postings.

Note: the logical searches are of limited utility, at least ANDs,
because the words involved must fall on the same line of text. This is
a limitation of Glimpse. I'll try to modify the scripts so that there
are two sets of files: one set for indexing and one for display. This
is necessary because the mailings cannot be reformatted.

The database is not presently accessible from the home page. The URL is

    http://xiron.pc.helsinki.fi/lojban/hma/

------------------------
co'o mi'e veion
---------------------------------
.i mi du la'o sy. Veijo Vilva sy.
---------------------------------
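
To illustrate the same-line limitation on AND searches noted above, and why a separate set of files for indexing might help, here is a minimal sketch in Python. It does not call Glimpse and is not the script used for the database. The function names (`and_match_per_line`, `and_match_whole_posting`, `index_copy`) and the sample posting are hypothetical, and treating a blank line as a paragraph break is an assumption.

```python
import re


def words(text):
    """Lower-cased set of words in a piece of text."""
    return set(re.findall(r"[a-z']+", text.lower()))


def and_match_per_line(text, terms):
    """True only if some single line contains every term
    (the per-line AND behaviour described in the note)."""
    wanted = {t.lower() for t in terms}
    return any(wanted <= words(line) for line in text.splitlines())


def and_match_whole_posting(text, terms):
    """True if every term occurs anywhere in the posting."""
    wanted = {t.lower() for t in terms}
    return wanted <= words(text)


def index_copy(text):
    """Hypothetical indexing copy: join the lines of each paragraph into
    one long line, leaving the display copy of the posting untouched."""
    paragraphs = re.split(r"\n\s*\n", text.strip())
    return "\n".join(" ".join(p.split()) for p in paragraphs)


# Made-up posting whose first paragraph wraps across two lines.
posting = "I'm using Glimpse as a\nsearch engine.\n\nThe database is not threaded."

print(and_match_per_line(posting, ["glimpse", "engine"]))
print(and_match_whole_posting(posting, ["glimpse", "engine"]))
print(and_match_per_line(index_copy(posting), ["glimpse", "engine"]))
```

With the wrapped text, the per-line AND misses a pair of words that straddle a line break; on the joined indexing copy the same per-line test finds them, while words in different paragraphs still do not match each other.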