From lojbab@lojban.org Tue Apr 24 11:09:45 2001
Return-Path: <lojbab@lojban.org>
X-Sender: lojbab@lojban.org
X-Apparently-To: lojban@yahoogroups.com
Received: (EGP: mail-7_1_2); 24 Apr 2001 18:09:44 -0000
Received: (qmail 694 invoked from network); 24 Apr 2001 18:09:44 -0000
Received: from unknown (10.1.10.26) by l7.egroups.com with QMQP; 24 Apr 2001 18:09:44 -0000
Received: from unknown (HELO stmpy-5.cais.net) (205.252.14.75) by mta1 with SMTP; 24 Apr 2001 18:09:43 -0000
Received: from bob.lojban.org (72.dynamic.cais.com [207.226.56.72]) by stmpy-5.cais.net (8.11.1/8.11.1) with ESMTP id f3OI9bh20707 for <lojban@yahoogroups.com>; Tue, 24 Apr 2001 14:09:37 -0400 (EDT)
Message-Id: <4.3.2.7.2.20010424020703.00bba800@127.0.0.1>
X-Sender: vir1036/pop.cais.com@127.0.0.1 (Unverified)
X-Mailer: QUALCOMM Windows Eudora Version 4.3.2
Date: Tue, 24 Apr 2001 04:52:26 -0400
To: lojban@yahoogroups.com
Subject: NickFest 2
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
From: Bob LeChevalier-Logical Language Group <lojbab@lojban.org>

Nick Nicholas visited Nora and myself last weekend, and as with NickFest 1 
last December, we had an eruption of Lojban activity that is rarely seen 
even in the active community of today. A lot got done on several backed-up 
projects, and some tentative decisions were made, which we hope that the 
community will agree with. We also have several tasks that we would like 
to farm out to volunteers in the community. Don't hesitate to volunteer.

As he has reported, Nick has taken over the "level 0 book" project that has 
long been languishing, and in the process it has become a significantly 
more robust if ambitious project. The original point of the project was to 
prepare a new form of level 0 (new Lojbanist) package which could either be 
printed relatively cheaply in mass-market book form, or which could be 
prepared via print-on-demand.

The existing package 0 consists of various Xeroxed documents: the brochure, 
the Overview, the Diagrammed Summary and the latest issue of le lojbo 
karni. None of these documents have been revised in several years, and 
they contain minor out-of-date bits while having a second-rate appearance 
given modern technology. The now ancient newsletter, last put out in 1994, 
makes the project look moribund just by its date (people have complained 
because the official web site has not been updated in several months, but 
imagine the impression when our latest "news"letter is 7 years old). And 
finally, while the Diagrammed Summary can get people started in learning 
the language, we have always wanted to include the Lojban minilesson, 
originally written by Athelstan in 1991, but never revised because his 
near-fatal auto accident rendered him incapable of such Lojban work. The 
final straw in paralyzing our outreach to new Lojbanists has been that this 
hodgepodge of documents is expensive to produce and mail, and somewhat 
time-consuming to continually Xerox and assemble into "packages" for 
shipping, and we seldom took in the estimated $5 it costs to produce and 
ship them to USA correspondents, much less prospective Lojbanists from 
overseas.

The new level 0 package was to be a single book containing all of the 
relevant material except a newsletter, and as of last December, including a 
mildly revised version of Robin Turner's introductory Lojban lessons, which 
the community seems to feel are at least as good as the old 
minilesson. John Cowan, our proven author, was going to take charge of 
this, but hasn't had the time. Enter Nick Nicholas, our fluent and again 
active Lojbanist who visited us in December and in a couple of days had 
moved the project a giant step forward.

Nick had to take a few months to finish his own book, and as he has 
announced in the last couple of weeks, has now launched into one of his 
patented high productivity efforts on Lojban that in the past produced the 
existing lujvo list, the conventions for place structures in the Book, and 
an enormous quantity of good Lojban text back in the era when few could 
produce a good paragraph in Lojban. Not to mention becoming the first 
recognized fluent Lojban speaker, though he has recently claimed that Goran 
Topic was fluent before he was.

Nick took advantage of a visit to the East Coast for other reasons, to drop 
in last weekend. Hence this note.

Nick has done a lot of work on the level 0 package, produced revisions of 
the first 4 of Robin Turner's lessons while identifying what will probably 
total around 12 lessons in the final package, rewritten the brochure, 
identified other items to be included in the book and assembled them. Nora 
has reviewed some of Nick's work as well as all of Robin's prior work, and 
the two of them worked hard on moving the effort along, with lojbab trying 
to stay out of the way (as dictated by the membership last LogFest, since 
they rightly felt that I was too over-committed to take on this effort).

Looking at the 4 lessons completed, however, led us to realize that the 12 
lessons when printed in a small "mass-market" paperback size as we 
intended, would come close to 200 pages by themselves, and the rest of the 
level 0 package while less predictable because it is not well formatted, 
will likely be almost as long. A 400 page paperback is NOT the sort of 
thing we want to send to casual inquirers about the language, and would 
likely be too expensive. So the first tentative decision is to split the 
level 0 package into two books, one of which describes the language and the 
project, and the other being the introductory lessons, which standing alone 
will serve as a "light" textbook of the sort made famous in the "Teach 
Yourself" language series.

Nick's style of intense activity, and a rather short window that he has 
available to work on the project, make it likely that these two books will 
be completed in draft form before LogFest 2001, which right now is looking 
like it will be the last weekend in July (but more on this in another 
message), with publication soon after that. This will depend on people 
being speedy with their comments on the material Nick puts up on his site.

That will be the new level 0 package (we will likely be changing the names 
of these packages to ensure that there is no confusion).

The level 1 package has traditionally been a set of wordlists and the 
E-BNF. We pretty much have all the pieces needed to edit these lists into 
another book, which will be a "pocket dictionary" of Lojban. We aren't 
seeking a lot of new material (no more lujvo), though I want to do 
something to improve the cmavo list. Depending on time and money, this 
could come out by the end of this year.

The LogFest members meeting will decide whether the pocket dictionary and 
the intro lessons will constitute the baseline dictionary and textbook, 
starting the infamous "5 year freeze period", or whether we should wait 
until we publish a full dictionary and more thorough textbook, which 
projects may start moving along once these other books are done. I will 
admit to wanting to have the full package for the 5 year period myself, but 
it is not solely my decision. You should speak up before LogFest.

The level 2 package has been supplanted by the reference grammar, which has 
now hit 360 sales. There is a good chance of paying off the printer loan 
on the book this year (allowing publication of other books), and perhaps 
reaching the break-even point on that book next year - probably at around 
500-600 books depending on how one determines the break even sales level.

The level 3 package will be the full dictionary, which was the other major 
project moved forward during Nick's visit. I'll summarize that status by 
section.

The gismu list is of course baselined, and the English-Lojban version of 
that list is nearly done and available as the draft dictionary file on the 
website "ENGDICT.GIS". I have a few English words (the most common ones) 
left unfinished, and the file needs to be weeded of duplicate entries, but 
these tasks can be completed with another good several-week burst of 
activity on my part.

A major hold up on the dictionary has been deciding what to do about cmavo, 
and this also affects the pocket dictionary. The existing cmavo list does 
not actually define most of the words, and the keywords were designed for 
LogFlash, not as proper definitions. We thus have three tasks, and I am 
willing to farm these out to volunteers in whole or in part.

For the Lojban-English side, we need for each cmavo and compound, English 
keywords if appropriate, a short definition if the word is definable, the 
selma'o (already there), and a pointer to the reference grammar section(s) 
discussing the word or the selma'o. Even a Lojban beginner with a copy of 
the book can work on the latter (and you might learn a lot about the 
language in the process), since it mostly involves looking stuff up. But 
don't volunteer unless you think you can commit enough time to do most or 
all of either the lookup or the definition task within the next few 
months on your own - we can't afford a coordinator, and even the CVS option 
that Robin is working on seems inappropriate for this because consistency 
in style and look-up strategy/coverage is important and editing that sort 
of thing done by several people might take as long as doing it ourselves.

The second task is to go through the rather large accumulation of cmavo 
compounds that have actually been used and decide which of them have a 
simple English definition, and prepare them as per the above with keywords, 
definitions and book references. The existing set of compounds in the 
cmavo list was determined arbitrarily when we set up LogFlash and there are 
hundreds of other compounds to be considered. We'll concentrate on the 
most frequently used. A compound like "lenu" will either be skipped, or 
defined simply as le + nu possibly with a refgrammar reference. I can do a 
first cut at weeding these, or someone can volunteer, but there is no sense 
in starting this while the current cmavo list remains unfinished.

The third task will be to prepare English to Lojban entries for 
cmavo. Part of this job will use the results of the above tasks - using 
keyword processing as we used for the gismu list, and formatting the 
resulting entries to look like the others.
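
As a rough illustration of what "keyword processing" amounts to, here is a 
minimal sketch that inverts a Lojban-to-keyword list into English-to-Lojban 
entries. This is not the actual tool used for the gismu list; the function 
name, the data layout and the simplified sample keywords are all assumptions 
made for the example.

```python
from collections import defaultdict


def invert_keywords(lojban_to_english):
    """Turn {lojban word: [English keywords]} into {keyword: [lojban words]}."""
    index = defaultdict(list)
    for word, keywords in lojban_to_english.items():
        for keyword in keywords:
            # Normalize case so "And" and "and" land in the same entry.
            index[keyword.lower()].append(word)
    # Sort both keywords and word lists so the output is stable.
    return {kw: sorted(words) for kw, words in sorted(index.items())}


# Simplified, illustrative keywords only (not the official cmavo list text).
sample_cmavo = {
    ".e": ["and"],
    "je": ["and"],
    "ja": ["or"],
}

english_to_lojban = invert_keywords(sample_cmavo)
```

Each English keyword then collects every cmavo that lists it, which is the 
raw material that still has to be formatted to look like the other entries.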

In addition we will seek a list of English "function words" which most 
likely will map to cmavo, and for each one, prepare the following: a list 
of any Lojban cmavo or other valsi or phrases that might be considered in 
translating the function word as well as any exact Lojban equivalents, and 
a pointer to the refgrammar if relevant. We may use JCB's prior TLI Loglan 
work to get ideas for English words to be included. This will be a manual, 
fairly time-consuming process, but is really necessary for the full 
dictionary though not vital for the pocket dictionary.

The next area to be worked on is fu'ivla, which have hardly been 
tackled. Nick advocates our collecting a fairly large set of easily made 
fu'ivla for plants and animals (and perhaps other common international 
science words and foods), using the Linnean genus for each animal in the 
Latin ablative case (which gives a consistent vowel ending).

Nick believes that in most cases this can be made trivially by finding out 
Gode's Interlingua word for the plant or animal. So we are seeking people 
willing to do some word mining in the Interlingua dictionary(s) (I believe 
known as the IED), especially people who are willing to do a little 
checking to make sure the words are indeed the genus names. I asked Sunday 
night for volunteers from the Lojban community familiar with Interlingua 
and its dictionary, and failing that will seek help from the Interlingua 
community itself.

We will also systematically create cultural fu'ivla for all countries in 
the world, all languages that are distinct from country names having 
greater than N speakers (N ~ 1 million to 10 million, probably); for these 
we will have to make the perhaps difficult effort to find out what the names 
of these languages and countries are in the native language, which will 
take some research. Multiple people can work on this and it can be done 
using the CVS approach.

We will then add in any scientific words from the Interlingua wordmining, 
again assuming that the Latinate "prototype" wordform that defines the word 
in that language is probably the most international form we can find. 
Making the words into valid fu'ivla and coming up with a consistent format 
for definitions will be the final step, but these will be relatively easy 
steps once the words are assembled because of the limited and regular 
semantics.

The lujvo will be the hardest project. The biggest accomplishment of the 
weekend is that Nick and Nora and Shawn Lasseter went through the entire 
list of lujvo used prior to January 2000 and either assigned keywords, or 
marked for research every one of them. About 30% of the words need to be 
looked up for context, which means that we have around 3500 more lujvo 
keyworded than we had before with 1500 potentially to be looked up. I also 
have around 1000 more lujvo used for the first time during 2000 which have 
not been done. These additional words will be keyworded and looked up in 
decreasing frequency of usage order "until we are sick of it", probably 
cutting off at the 10 or 5 usages level. The new searchable archive of 
Lojban List is proving extremely helpful in making the context searches, 
and I am hoping that the yahoogroups archives can be added in to that 
archive somehow to make things even easier for newly made words.
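
The ordering rule described above (decreasing frequency of usage, with a 
cutoff at some minimum number of usages) can be sketched as follows. This is 
only an illustration: the function name is hypothetical and the usage counts 
in the sample are made up, not taken from the archive.

```python
def prioritize(usage_counts, min_usages=5):
    """Return words with at least min_usages, most frequently used first."""
    kept = [(word, n) for word, n in usage_counts.items() if n >= min_usages]
    # Highest count first; ties are broken alphabetically for a stable order.
    kept.sort(key=lambda item: (-item[1], item[0]))
    return [word for word, _ in kept]


# Made-up counts, for illustration only.
sample_counts = {
    "brivla": 40,
    "selbri": 120,
    "fu'ivla": 40,
    "nunklama": 3,
}

worklist = prioritize(sample_counts, min_usages=5)
```

Raising or lowering min_usages moves the point at which we declare ourselves 
"sick of it".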

The hard task will of course be place structures. We already have 
Nick's prior efforts at place-structure making from 1994, as well as an 
automated effort to build place structures for conversion lujvo using se, 
te, etc. plus a gismu. Nick has suggested using a similar automated 
procedure to generate lujvo and place structures for the special cases 
based on nu, ka, ni, mau, tol, nau, gau, sim, etc. This may take care of a 
good chunk of the lujvo already made.
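
For the conversion case, the automation is simple because se, te, ve and xe 
each swap the x1 place with the x2, x3, x4 or x5 place respectively. The 
following is a minimal sketch of that one step, not the actual program used; 
the function name and the short place labels for klama are assumptions for 
the example. The other special cases (nu, ka, ni, mau, etc.) would need 
their own rules and are not covered here.

```python
# Which place each conversion cmavo swaps with x1.
CONVERSIONS = {"se": 2, "te": 3, "ve": 4, "xe": 5}


def convert_places(places, conversion):
    """Return the place structure of <conversion> + gismu.

    places is the gismu's place list, x1 first.
    """
    n = CONVERSIONS[conversion]
    if n > len(places):
        raise ValueError(f"{conversion} needs at least {n} places")
    swapped = list(places)  # copy, so the gismu's own list is untouched
    swapped[0], swapped[n - 1] = swapped[n - 1], swapped[0]
    return swapped


# Abbreviated labels for the five places of klama (illustrative wording).
klama = ["goer", "destination", "origin", "route", "means"]

te_klama = convert_places(klama, "te")
```

Here te_klama has the origin as its x1 and the goer as its x3, with the other 
places unchanged.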

For the remaining place structures, Nick feels that we need to abandon our 
attempt at perfection and careful analysis for each word in the 
dictionary. Instead, we should have a series of code symbols or font 
coding to indicate the level of confidence that we have in the place 
structure, and also to include a code for the place structure writer, who 
will have more or less credibility based on his perceived knowledge of the 
language and amount of lujvo analysis done. We also will not try to have 
complete place structures for every word we put in the dictionary, using 
the symbol codes to show the level of incompleteness.

The keywords, of course, typically define the x1 of any lujvo. For those 
lujvo that are generally namelike - used only as concrete references in 
sumti, we expect to stop at x1, though I myself would like to have an 
attempt to determine an x2 for each where it makes sense, since we have few 
brivla that are only one-place, and I think that one-place brivla will tend 
to damage the predicate nature of the language unless the words really are 
one-placers conceptually. But even I agree that we should start with the 
single place for these.

Those lujvo that are verblike, or which can translate as English verbs, 
should have complete place structures worked out. These lujvo will be the 
workhorses of the bridi structures in the language. We'll start with at 
least two places, and add oblique places if they are identifiable.

As with keywording, place structure work will be prioritized to emphasize 
words with higher frequency counts. We will use whatever net-based tools, 
such as CVS, people think are best suited.

BUT, we will put extremely low priority on including new proposed 
lujvo. We simply have to draw the line somewhere. Proposals that have 
complete place structures may be considered, if they have been compiled in 
a single place, such as Arnt's collection in the lojban list file 
section. But we aren't going to be looking for these.

We also aren't going to spend a lot of time worrying about whether or not 
these place structures are correct and complete; the author's name will 
determine the credibility. I will cite the discussion (which I didn't read 
closely) of which of several words to use for computer program and/or 
compiler as the sort of thing we will likely ignore. Technical jargon is of 
lower interest for the dictionary anyway, and what seems like an argument 
between computer theorists and pragmatists looks like the makings of a 
"religious war" like Unix vs Windows or Emacs vs vi that we simply do not 
want to adjudicate. Again, completeness of the place structure will be 
relatively unimportant unless the word is seen to be verblike.

All lujvo will be marked to indicate whether they are complete or 
incomplete place structures, and we will also mark which place structures 
have been checked independently, and which place structures are from 1994 
because those are both complete and have stood the test of time.

As with the gismu, we will use automated tools as much as possible to 
generate the English-Lojban entries for lujvo and fu'ivla.

All this is still a bundle of work, but having broken it down into several 
smaller tasks and having set relative priorities, the NickFest participants 
at least now think that the dictionary project is do-able.

Other tasks to be worked on:

I intend by the time the level 0 book comes out to have a new issue of le 
lojbo karni to be shipped with it. If I get the address lists updated, 
then I will send it to everyone on the list, but le lojbo karni WILL resume 
publication with the level 0 package, regardless of whether we can properly 
support it as a subscription 'zine. It will also be published 
simultaneously on the net, and people who ask will be able to specify that 
they do not want to receive a printed copy when we are able to send them 
out (but don't tell me now). Frequency is yet to be determined, but it 
will probably be every 3 or 6 months.

Resuming JL will be a little harder, but we need to figure out what to do 
with it. We have some 130 subscribers to whom we owe at least one prepaid 
subscription issue, most of those having 3 or 4 issues left.

JL used to be composed primarily of Lojban List writings edited lightly, 
and mixed with some snailmail stuff that was sent directly to me. These 
days, almost all material appears on the net, and the most active 
Lojbanists, who read the list, have read more of it than I have. This 
makes it difficult to determine what content to put into a JL that will 
probably be read mostly by the people who are on Lojban List.

We can produce a few issues by mining the archives for the 8 years since 
the last JL issue - most list subscribers were not on the list more than a 
year or two ago, and the archives are so immense that a good editorial 
selection of the "best of the old list" will serve as a good core for JL, 
especially if edited for currency. We can also ask for articles and Lojban 
texts that have not been posted to the list once we get close to 
publication, and we will undoubtedly get some, if not enough to fill an 
issue. Probably after 3 or 4 issues, we will run low on good archive 
material, but by then there will be more sense in the community of what kind 
of stuff to write for publication. We'll have to play it by ear.

But resuming JL will probably not take place this year. Too many higher 
priorities. Next year seems more plausible.

Finally, Nick has asked that we again call for a volunteer who can take his 
old "Adventure" (colossal cave) text translation and generate a Lojban 
version of the game. I have in the past noted that there is a system 
called "Inform" which is now used to generate adventure games easily for 
all platforms, and it has specific instructions for creating 
language-specific versions. There is an entire newsgroup dedicated to 
interactive-fiction writing using Inform and other tools. Someone can find 
a version of the colossal cave adventure using Inform and use Nick's 
translation to complete it. I also have an old hardcoded version of the 
program, I believe in some form of Basic, that can be modified by a good 
programmer; I started several years ago, and don't remember how far I got, 
but I was hampered by the need to figure out what the code was doing - I 
think it was poorly commented. This old version had some portions of the 
adventure hard-coded into the program (including the command set, which 
should be in Lojban with a "ko " prompt so that the user is forced to enter 
imperative commands), and that hardcoded part has not been translated, but 
the data files for the Basic program are what Nick completed. The Inform 
version probably has all the hard coded stuff in data files, but I haven't 
studied this.

The job should not be that difficult for anyone with programming 
experience, and it might not take much Lojban expertise. The person who 
does this will likely learn how to use the Inform tool and can then 
coordinate translations of other adventure games (there are hundreds of 
them out there in a single repository file site in Germany and it has 
become a significant creative genre with annual competitions - plenty of 
good Lojban-learning experiences) or can attempt to write new ones.

Because this task has languished so long and because the potential is high 
for getting lots of good stuff going, I would like at least two people to 
volunteer to work on this either together or independently (the Inform work 
can easily be done independently of any attempt to fix the Basic program).

Lots done. Lots to do. Hope some people are inspired to do some work by 
this note.

lojbab
--
lojbab lojbab@lojban.org
Bob LeChevalier, President, The Logical Language Group, Inc.
2904 Beau Lane, Fairfax VA 22031-1303 USA 703-385-0273
Artificial language Loglan/Lojban: http://www.lojban.org