
Re: [lojban] Volunteering for dictionary work

To: lojban@egroups.com
Subject: Re: [lojban] Volunteering for dictionary work
From: "Bob LeChevalier (lojbab)" <lojbab@lojban.org>
Date: Mon, 25 Sep 2000 16:06:42 -0400
In-reply-to: <3.0.5.32.20000925123823.0096b1e0@pop.stud.ntnu.no>

At 12:38 PM 09/25/2000 +0200, Arnt Richard Johansen wrote:

I've looked through
<http://www.lojban.org/files/draft-dictionary/Working/>, and I'm
considering volunteering for preparing lujvo for the dictionary.  I have a
few questions though, as to what needs to be done, and how.

1. Is the task to write keywords and place structures of new lujvo,

I'd like to start with keywords for all the words, with place structures slightly lower in priority but still desired. People might be unsure of how to do the place structures (which takes experience in order to do with confidence, and even then might have problems given that we have done little cross-checking of place structures by different authors to see if we are doing them consistently). If we have keywords then we can semantically group similar concepts, which will help in that place structure checking as well as allow us to decide which words are worth including in the dictionary.

 in the
same format as the current computerized lujvo list
(http://www.lojban.org/files/draft-dictionary/NORALUJV.txt)?

Yes. The closer you come to the current format, the more automated will be the process of putting it in some other form later if needed.

Or should all
lujvo, the new ones as well as the one already in the list, be written in a
new format, specifically for the paper dictionary?

We don't know what such a format would be. I think that people would rather see a dictionary come out sooner with consistent definitional forms that take a little decoding rather than have us delay a long while in order to have very English-idiomatic definitional forms. I would rather include more words defined accurately but less prettily, rather than fewer words with optimal definitions. With people coining new lujvo at a rate much faster than we can define them, speed in getting a good look-up dictionary to help people find a word if it has already been coined would be a blessing (it would also greatly enhance glossers to have glosses for a number of lujvo, which requires keywording more than place structures).

One would think that
lujvo definitions should be written out in full (as is done in the
computerized gismu list), instead of summarily referring to gismu places.

The computerized gismu list took years and many review passes to get where it is today. It was an incredibly time-consuming process, and there are already several times as many lujvo proposals as there are gismu.

For instance, in the lujvo list, "cabdei" occurs like this:
cabdei cabna+djedi: today: x1 = djedi1 (full day) = cabna1 (now),
x2 = cabna2 (co-occurred with), x3 = djedi3 (full day standard)

But shouldn't it be changed to look like this in an "ordinary" dictionary:
cabdei cabna+djedi today x1 is the day that is simultaneous with x2, by
standard x3

We might adopt the policy of rewriting those lujvo that exceed a certain threshold of usage (cabdei would be a likely candidate, as would brivla), but we aren't ready to decide.

Ideally, I would like the coding that the Book uses in presenting place structures as analyzed (which we can process automatically into the form in the lujvo list if it is done in a consistent format). See Nick's lujvo list to find a mass of words in the brief coded form. This makes it easier to check what someone else has done. The second form you present has lost the analysis information, thus requiring someone checking your work to look up the place structures of the source gismu and perform the analysis independently without your work as a clue, in order to check to see if s/he agrees with what you have come up with. And that checking will have to be done at least a couple of times before we put the word in the dictionary. So save the pretty wording for later (if ever).
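
As a rough illustration of why the coded form is easier to process and check, here is a minimal Python sketch that parses an entry laid out like the "cabdei" example quoted above. It assumes that layout only (lujvo, source gismu joined by "+", keyword, then "xN = gismuM (gloss)" clauses); the real lujvo list file may differ in detail. The names parse_entry and stray_sources are hypothetical, invented for this sketch.

```python
import re

# Entry text copied from the "cabdei" example in this message
# (whitespace normalised). The layout is an assumption based on
# that single example, not on the full lujvo list file.
ENTRY = (
    "cabdei cabna+djedi: today: x1 = djedi1 (full day) = cabna1 (now), "
    "x2 = cabna2 (co-occurred with), x3 = djedi3 (full day standard)"
)

# One source place such as "djedi1 (full day)":
# gismu, place number, optional gloss in parentheses.
SOURCE_PLACE = re.compile(r"([a-z']+)(\d+)\s*(?:\(([^)]*)\))?")


def parse_entry(line):
    """Split a coded entry into lujvo, source gismu, keyword and places."""
    head, keyword, places = (part.strip() for part in line.split(":", 2))
    lujvo, source = head.split()
    structure = {}
    # Split only on commas that introduce the next "xN =" clause,
    # so a comma inside a gloss would not break the entry apart.
    for clause in re.split(r",\s*(?=x\d+\s*=)", places):
        slot, _, rhs = clause.partition("=")
        structure[slot.strip()] = [
            (m.group(1), int(m.group(2)), m.group(3))
            for m in SOURCE_PLACE.finditer(rhs)
        ]
    return {
        "lujvo": lujvo,
        "gismu": source.split("+"),
        "keyword": keyword,
        "places": structure,
    }


def stray_sources(entry):
    """Return gismu used in the places that are not source gismu of the lujvo."""
    used = {g for refs in entry["places"].values() for g, _, _ in refs}
    return sorted(used - set(entry["gismu"]))


if __name__ == "__main__":
    parsed = parse_entry(ENTRY)
    print(parsed["keyword"], parsed["places"]["x1"])
    print("stray gismu:", stray_sources(parsed))
```

Because each place keeps its link back to a numbered gismu place, a checker can confirm mechanically that the analysis refers only to the lujvo's own source gismu before a human looks at the harder semantic questions. The written-out English form carries no such links.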

2. How should we find out (ma ve djuno) the meanings of the lujvo that
haven't been defined yet?

If you KNOW what it means, as in this case because you used it, say what your intent/understanding was in using it, and feel free to note in what you submit that you actually used the word that way. How a word has actually been used is more valuable a guideline than the analytical opinion of someone who is doing a chunk of 100 words that he never saw before he looked at the lujvo list. You might have made some mistakes in your coinings, but then by your annotation I would expect a higher standard of argument to justify a different meaning than you intended.

  As an example, take the word "vlatai", which has
occurred relatively often in the text corpus (34 times).  I distinctly
remember using that particular word in a conversation with Jorge on the
list, intending it to mean "x1 is an inflected form of word/lexeme x2,
yielding meaning x3".

Then put that down with a note saying that this was your intent when using it, with keyword "inflected form". Later place structure analysis may come up with a different result, but if you used it a certain way, then that should guide the place structure analysis.

 Now, since I only have the eGroups archives handy,
it is difficult for me to find enough usage of it, so that I can be sure
that my interpretation of the word is indeed the most correct.

Correctness is a relative thing when we as yet have no standard (the point is to make a standard). I am not expecting everyone to do an archive search for each word. For keyword analysis, I would be happy to have a best guess for all the words. Then people can look at others' proposed keywords and see if they agree. We can do an archive search later for the words for which there is some uncertainty (and there is enough usage that we are likely to be able to have usage resolve the issue).

In any event it will be a multi-pass analysis. Nora has already found that it is impossible to maintain consistency over an analysis of even 1000 words, and we are getting closer to 10,000. So I want to build multiple passes by multiple people into the approach to defining the words, so as to catch the most consistency errors possible with the least effort.

If you do 50 words superbly, you are unlikely to notice any consistency errors. If you do 500 words in multiple passes that take less time on each word as you go, you will end up correcting yourself sometimes on a later pass, but you will feel more productive and your result will be far more useful. And if you have to quit after doing a large chunk of words partially, someone else can take over and do the next step, performing a consistency check as THEY go.


lojbab
--
lojbab                                             lojbab@lojban.org
Bob LeChevalier, President, The Logical Language Group, Inc.
2904 Beau Lane, Fairfax VA 22031-1303 USA                    703-385-0273
Artificial language Loglan/Lojban:                 http://www.lojban.org

References:
- [lojban] Volunteering for dictionary work
  - From: Arnt Richard Johansen <arntrich@stud.ntnu.no>
