Return-Path: <@SEGATE.SUNET.SE:LOJBAN@CUVMB.BITNET>
Received: from SEGATE.SUNET.SE by xiron.pc.helsinki.fi with smtp
        (Linux Smail3.1.28.1 #1) id m0sXaSe-0000ZFC; Sun, 16 Jul 95 23:38 EET DST
Received: from segate.sunet.se by SEGATE.SUNET.SE (LSMTP for OpenVMS v0.1a)
        with SMTP id 4C3C2E0A ; Sun, 16 Jul 1995 22:38:12 +0200
Date: Sun, 16 Jul 1995 21:29:21 EDT
Reply-To: Csg0070@QUEENS-BELFAST.AC.UK
Sender: Lojban list
From: "Ciarán Ó Duibhín"
Subject: Vocabulary structure
X-To: lojban@cuvmb.columbia.edu
To: Veijo Vilva
Content-Length: 1193
Lines: 21

I am (possibly) interested in Lojban, to find out if its vocabulary can
be a useful way of indexing text in a database. I mean, could one take
a sentence of text in a natural language (English?) and think up a small
number of Lojban words which represent its meaning (the words of a
Lojban translation of the sentence, possibly), and use an index of these
Lojban words to find sentences in the English text which contain a
particular concept?

The problem with an index based directly on the English words (or on a
translation into another natural language) is not just with inflection
etc but more importantly with the large number of synonyms and
near-synonyms, and terms more or less semantically related, i.e. there
are so many ways to say nearly the same thing in an NL. Can anyone say
if the vocabulary of Lojban simplifies or systematises the semantic
relationships between words in such a way that they would be more
effective as an index?

A second question: in my efforts to find out something about Lojban
vocabulary I downloaded a file called lf1293.zip (I think) from
ftp.cs.yale.edu. What program do I need to use the files contained in
this zip archive?

Ciara/n O/ Duibhi/n.
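
The indexing idea in the first question can be sketched as a small inverted index. This is only an illustration under stated assumptions: the English sentences are made up, the Lojban tags (gerku "dog", citka "eat", barda "big", zdani "home", klama "go") are assigned by hand, and the names TAGGED_SENTENCES, build_index and lookup are hypothetical. It shows how synonyms such as "dog"/"hound" or "house"/"home" would collapse onto one index key once a human has chosen the tags; it does not show whether Lojban vocabulary makes choosing them any easier, which is the open question.

```python
from collections import defaultdict

# Made-up sentences with hand-assigned Lojban concept tags (an assumption:
# a human indexer, not a program, would pick these).
TAGGED_SENTENCES = [
    ("The dog devoured its meal.", {"gerku", "citka"}),
    ("A hound was eating scraps.", {"gerku", "citka"}),
    ("She purchased a large house.", {"barda", "zdani"}),
    ("They went to a big home.", {"klama", "barda", "zdani"}),
]


def build_index(tagged):
    """Map each concept tag to the positions of the sentences carrying it."""
    index = defaultdict(set)
    for position, (_sentence, tags) in enumerate(tagged):
        for tag in tags:
            index[tag].add(position)
    return index


def lookup(index, tagged, *concepts):
    """Return the sentences tagged with every one of the given concepts."""
    if not concepts:
        return []
    # index.get avoids creating empty entries for unknown concepts.
    hits = set.intersection(*(index.get(c, set()) for c in concepts))
    return [tagged[i][0] for i in sorted(hits)]


INDEX = build_index(TAGGED_SENTENCES)

if __name__ == "__main__":
    for sentence in lookup(INDEX, TAGGED_SENTENCES, "barda", "zdani"):
        print(sentence)
```

Searching on "gerku" finds both the "dog" and the "hound" sentence even though they share no English content word, which is the behaviour an index on the English words alone would not give.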