From: Jay Kominek (jay.kominek@colorado.edu)
Date: Tue, 24 Apr 2001 23:02:23 -0600 (MDT)
Subject: web dictionary development (was: Re: [lojban] NickFest 2)

On Tue, 24 Apr 2001, Bob LeChevalier (lojbab) wrote:

> Fine. But I have to admit that when I do this work, I do it several
> hundred words at a time, and I myself would hate to fill out forms on a
> word by word basis. Maybe that is why I am skeptical - the only way these
> projects have gotten done in the past has been for people to put in hours
> doing chunks of a couple hundred words at a shot.

I'll happily develop a bulk entry method if people will use it. It would likely be easier than the web form, actually. A straight ASCII file with a bunch of copies of a single form in it would be quite easy.

This kind of software is sufficiently trivial that the limiting factor in development is how fast I can type (and how long I can go without being distracted). If anyone has any feature requests or wishes for this system, by all means, let me know.
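As a rough illustration of the bulk-entry idea (a plain ASCII file holding many copies of one form), here is a minimal sketch of how such a file could be parsed. Everything about the format is an assumption made for this example, not the format jbovlaste uses: the "field: value" lines, the blank line between forms, the field names (word, type, definition), and the function name parse_bulk_entries. The sample definitions are abbreviated glosses.

```python
def parse_bulk_entries(text):
    """Split a plain-ASCII bulk file into one dict per form.

    Assumed (hypothetical) format: "field: value" lines, with a blank
    line between forms and "#" starting a comment line.
    """
    entries = []
    current = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            # A blank line closes the form that is currently open, if any.
            if current:
                entries.append(current)
                current = {}
            continue
        if line.startswith("#"):
            continue
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"expected 'field: value', got {line!r}")
        current[key.strip().lower()] = value.strip()
    if current:
        entries.append(current)
    return entries


# Hypothetical sample file with two forms.
SAMPLE = """\
# two gismu, one form each
word: klama
type: gismu
definition: x1 comes/goes to x2 from x3 via x4 using x5

word: mlatu
type: gismu
definition: x1 is a cat
"""

if __name__ == "__main__":
    for entry in parse_bulk_entries(SAMPLE):
        print(entry["word"], "-", entry["type"])
```

A contributor working through a couple hundred words at a time could fill in such a file in any text editor and submit it in one go, which is the workflow the quoted message describes.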
Again, the URL for the system currently under development is:

http://wiw.org/~jkominek/jbovlaste/

It's very rough around the edges, the database is mostly empty, and it can't yet generate pretty forms of the dictionary containing only the parts you want, but it shows what I'm planning. One feature that can't be seen is that it can maintain a log of changes for words. (Sadly not in a space-efficient manner like CVS. Oh well.)

> >For the record, it's the lujvo that Jay's site was originally designed
> >for, although I think we were both under the impression that you wanted
> >more of them, which doesn't appear to be the case.

Actually, I intend my system to be usable for all categories of words, since it can also handle definitions in an arbitrary number of languages. This system does not only help with the preparation of an English/Lojban dictionary; it will help with the preparation of Lojban to English, French, German, Russian, Elbonian, Esperanto, etc. dictionaries. (And a Lojban dictionary in Lojban.)

(For reference, the word categories that it supports are: gismu, cmavo, lujvo, fu'ivla, cmene (the dictionary I used as a kid listed famous people in the back), nonstandard gismu, and experimental cmavo.)

> I have always been on record that what we want is to see the words that
> actually get used in the language defined. We do need some more semantic
> coverage in some areas, but it is premature to figure what areas these are
> when we can't determine what words we already have.

Hrm. Maybe I should incorporate a feature to specify which gismu each word has a semantic connection to. I think a couple of people looking over and discussing the construction of a lujvo is as good as grabbing it from a single person's in-context usage.
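To make the features discussed above concrete (a word category, definitions in any number of languages, a change log, and the proposed links to related gismu), here is a minimal sketch of how one dictionary entry might be modelled. The class names, field names, and the full-copy history are assumptions for illustration only and do not describe jbovlaste's actual schema. The example gloss is an informal one.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class WordType(Enum):
    """The word categories listed in the message."""
    GISMU = "gismu"
    CMAVO = "cmavo"
    LUJVO = "lujvo"
    FUHIVLA = "fu'ivla"
    CMENE = "cmene"
    NONSTANDARD_GISMU = "nonstandard gismu"
    EXPERIMENTAL_CMAVO = "experimental cmavo"


@dataclass
class Entry:
    """One dictionary word (hypothetical model)."""
    word: str
    word_type: WordType
    # Language code -> definition text, for any number of languages.
    definitions: Dict[str, str] = field(default_factory=dict)
    # Gismu this word is semantically connected to (the proposed feature).
    related_gismu: List[str] = field(default_factory=list)
    # Change log: a full copy of the definitions before each change.
    # Simple, but not space efficient the way CVS deltas are.
    history: List[Dict[str, str]] = field(default_factory=list)

    def set_definition(self, language: str, text: str) -> None:
        """Record the previous state, then store the new definition."""
        self.history.append(dict(self.definitions))
        self.definitions[language] = text


if __name__ == "__main__":
    entry = Entry("jbovlaste", WordType.LUJVO,
                  related_gismu=["lojbo", "valsi", "liste"])
    entry.set_definition("en", "a dictionary of Lojban words")
    entry.set_definition("en", "x1 is a dictionary of Lojban words")
    print(entry.word, entry.word_type.value, len(entry.history))
```

Storing a whole snapshot per change keeps the code short at the cost of disk space, which matches the trade-off the message mentions.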
> > > The new searchable
> > > archive of Lojban List is proving extremely helpful in making the
> > > context searches, and I am hoping that the yahoogroups archives can be
> > > added in to that archive somehow to make things even easier for newly
> > > made words.

I can't incorporate all the yahoogroups stuff into my searchable archive. The disk space required is really amazing, and the time needed to process all those messages is prohibitive. (If I could run it on a box with a gig or two of RAM, it'd be a bit more reasonable. Or I could just write my own message archiving software, I suppose. That's on my master todo, anyways.)

I could at least produce a list of all the lujvo in the archive I made, with links to each occurrence (or the first, anyways).

Just let me know what would be constructive to do.

- Jay Kominek
  The grass is always greener,
  On the other side of the dimensional barrier.