NickFest 2



Nick Nicholas visited Nora and myself last weekend, and as with NickFest 1 last December, we had an eruption of Lojban activity that is rarely seen even in the active community of today. A lot got done on several backed-up projects, and some tentative decisions were made, which we hope that the community will agree with. We also have several tasks that we would like to farm out to volunteers in the community. Don't hesitate to volunteer.

As he has reported, Nick has taken over the "level 0 book" project that has long been languishing, and in the process it has become a significantly more robust if ambitious project. The original point of the project was to prepare a new form of level 0 (new Lojbanist) package which could either be printed relatively cheaply in mass-market book form, or which could be prepared via print-on-demand.

The existing package 0 consists of various Xeroxed documents: the brochure, the Overview, the Diagrammed Summary and the latest issue of le lojbo karni. None of these documents has been revised in several years, and they contain minor out-of-date bits while having a second-rate appearance given modern technology. The now ancient newsletter, last put out in 1994, makes the project look moribund just by its date (people have complained because the official web site has not been updated in several months, but imagine the impression when our latest "news"letter is 7 years old). And finally, while the Diagrammed Summary can get people started in learning the language, we have always wanted to include the Lojban minilesson, originally written by Athelstan in 1991, but never revised because his near-fatal auto accident rendered him incapable of such Lojban work. The final straw in paralyzing our outreach to new Lojbanists has been that this hodgepodge of documents is expensive to produce and mail, and somewhat time-consuming to continually Xerox and assemble into "packages" for shipping, and we seldom took in the estimated $5 it costs to produce and ship them to USA correspondents, much less prospective Lojbanists from overseas.

The new level 0 package was to be a single book containing all of the relevant material except a newsletter, and as of last December, including a mildly revised version of Robin Turner's introductory Lojban lessons, which the community seems to feel are at least as good as the old minilesson. John Cowan, our proven author, was going to take charge of this, but hasn't had the time. Enter Nick Nicholas, our fluent and again active Lojbanist, who visited us in December and in a couple of days had moved the project a giant step forward.

Nick had to take a few months to finish his own book, and as he has announced in the last couple of weeks, has now launched into one of his patented high productivity efforts on Lojban that in the past produced the existing lujvo list, the conventions for place structures in the Book, and an enormous quantity of good Lojban text back in the era when few could produce a good paragraph in Lojban. Not to mention becoming the first recognized fluent Lojban speaker, though he has recently claimed that Goran Topic was fluent before he was.

Nick took advantage of a visit to the East Coast for other reasons, to drop in last weekend. Hence this note.

Nick has done a lot of work on the level 0 package, produced revisions of the first 4 of Robin Turner's lessons while identifying what will probably total around 12 lessons in the final package, rewritten the brochure, identified other items to be included in the book and assembled them. Nora has reviewed some of Nick's work as well as all of Robin's prior work, and the two of them worked hard on moving the effort along, with lojbab trying to stay out of the way (as dictated by the membership last LogFest, since they rightly felt that I was too over-committed to take on this effort).

Looking at the 4 completed lessons, however, led us to realize that the 12 lessons, when printed in a small "mass-market" paperback size as we intended, would come close to 200 pages by themselves, and the rest of the level 0 package, while less predictable because it is not well formatted, will likely be almost as long. A 400 page paperback is NOT the sort of thing we want to send to casual inquirers about the language, and would likely be too expensive. So the first tentative decision is to split the level 0 package into two books, one of which describes the language and the project, and the other being the introductory lessons, which standing alone will serve as a "light" textbook of the sort made famous in the "Teach Yourself" language series.

Nick's style of intense activity, and a rather short window that he has available to work on the project, make it likely that these two books will be completed in draft form before LogFest 2001, which right now is looking like it will be the last weekend in July (but more on this in another message), with publication soon after that. This will depend on people being speedy with their comments on the material Nick puts up on his site.

That will be the new level 0 package (we will likely be changing the names of these packages to ensure that there is no confusion).

The level 1 package has traditionally been a set of wordlists and the E-BNF. We pretty much have all the pieces needed to edit these lists into another book, which will be a "pocket dictionary" of Lojban. We aren't seeking a lot of new material (no more lujvo), though I want to do something to improve the cmavo list. Depending on time and money, this could come out by the end of this year.

The LogFest members meeting will decide whether the pocket dictionary and the intro lessons will constitute the baseline dictionary and textbook, starting the infamous "5 year freeze period", or whether we should wait until we publish a full dictionary and more thorough textbook, which projects may start moving along once these other books are done. I will admit to wanting to have the full package for the 5 year period myself, but it is not solely my decision. You should speak up before LogFest.

The level 2 package has been supplanted by the reference grammar, which has now hit 360 sales. There is a good chance of paying off the printer loan on the book this year (allowing publication of other books), and perhaps reaching the break-even point on that book next year - probably at around 500-600 books depending on how one determines the break even sales level.

The level 3 package will be the full dictionary, which was the other major project moved forward during Nick's visit. I'll summarize that status by section.

The gismu list is of course baselined, and the English Lojban version of that list is nearly done and available as the draft dictionary file on the website "ENGDICT.GIS". I have a few English words (the most common ones) left unfinished, and the file needs to be weeded of duplicate entries, but these tasks can be completed with another good several-week burst of activity on my part.

A major hold up on the dictionary has been deciding what to do about cmavo, and this also affects the pocket dictionary. The existing cmavo list does not actually define most of the words, and the keywords were designed for LogFlash, not as proper definitions. We thus have three tasks, and I am willing to farm these out to volunteers in whole or in part.

For the Lojban-English side, we need for each cmavo and compound: English keywords if appropriate, a short definition if the word is definable, the selma'o (already there), and a pointer to the reference grammar section(s) discussing the word or the selma'o. Even a Lojban beginner with a copy of the book can work on the latter (and you might learn a lot about the language in the process), since it mostly involves looking stuff up. But don't volunteer unless you think you can commit enough time to do most or all of either the lookup or the definition task within the next few months on your own - we can't afford a coordinator, and even the CVS option that Robin is working on seems inappropriate for this, because consistency in style and look-up strategy/coverage is important, and editing that sort of thing done by several people might take as long as doing it ourselves.

The second task is to go through the rather large accumulation of cmavo compounds that have actually been used and decide which of them have a simple English definition, and prepare them as per the above with keywords, definitions and book references. The existing set of compounds in the cmavo list was determined arbitrarily when we set up LogFlash and there are hundreds of other compounds to be considered. We'll concentrate on the most frequently used. A compound like "lenu" will either be skipped, or defined simply as le + nu possibly with a refgrammar reference. I can do a first cut at weeding these, or someone can volunteer, but there is no sense in starting this while the current cmavo list remains unfinished.

The third task will be to prepare English to Lojban entries for cmavo. Part of this job will use the results of the above tasks - using keyword processing as we used for the gismu list, and formatting the resulting entries to look like the others.

In addition we will seek a list of English "function words" which most likely will map to cmavo, and for each one, prepare the following: a list of any Lojban cmavo or other valsi or phrases that might be considered in translating the function word as well as any exact Lojban equivalents, and a pointer to the refgrammar if relevant. We may use JCB's prior TLI Loglan work to get ideas for English words to be included. This will be a manual, fairly time-consuming process, but is really necessary for the full dictionary though not vital for the pocket dictionary.

The next area to be worked on is fu'ivla, which have hardly been tackled. Nick advocates our collecting a fairly large set of easily made fu'ivla for plants and animals (and perhaps other common international science words and foods), using the Linnaean genus for each animal in the Latin ablative case (which gives a consistent vowel ending).

Nick believes that in most cases this can be made trivially by finding out Gode's Interlingua word for the plant or animal. So we are seeking people willing to do some word mining in the Interlingua dictionary(s) (I believe known as the IED), especially people who are willing to do a little checking to make sure the words are indeed the genus names. I asked Sunday night for volunteers from the Lojban community familiar with Interlingua and its dictionary, and failing that will seek help from the Interlingua community itself.

We will also systematically create cultural fu'ivla for all countries in the world, and for all languages that are distinct from country names and have more than N speakers (N ~ 1 million to 10 million, probably); for these we will have to make the perhaps difficult effort to find out what the names of these languages and countries are in the native language, which will take some research. Multiple people can work on this and it can be done using the CVS approach.

We will then add in any scientific words from the Interlingua wordmining, again assuming that the Latinate "prototype" wordform that defines the word in that language is probably the most international form we can find. Making the words into valid fu'ivla and coming up with a consistent format for definitions will be the final step, but these will be relatively easy steps once the words are assembled because of the limited and regular semantics.

The lujvo will be the hardest project. The biggest accomplishment of the weekend is that Nick and Nora and Shawn Lasseter went through the entire list of lujvo used prior to January 2000 and either assigned keywords, or marked for research every one of them. About 30% of the words need to be looked up for context, which means that we have around 3500 more lujvo keyworded than we had before with 1500 potentially to be looked up. I also have around 1000 more lujvo used for the first time during 2000 which have not been done. These additional words will be keyworded and looked up in decreasing frequency of usage order "until we are sick of it", probably cutting off at the 10 or 5 usages level. The new searchable archive of Lojban List is proving extremely helpful in making the context searches, and I am hoping that the yahoogroups archives can be added in to that archive somehow to make things even easier for newly made words.
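
The frequency-ordered pass with a usage cutoff can be sketched roughly as follows. This is only an illustration: the function name `words_to_process`, the `min_usages` parameter and the usage counts are hypothetical placeholders, not an existing tool or real data.

```python
def words_to_process(usage_counts, min_usages=5):
    """Return lujvo in decreasing order of usage, dropping rarely used ones.

    usage_counts: dict mapping a lujvo to the number of times it was used.
    min_usages:   cutoff below which a word is skipped (5 or 10 in the plan).
    """
    kept = [(word, n) for word, n in usage_counts.items() if n >= min_usages]
    # Sort by count (highest first); break ties alphabetically for stability.
    kept.sort(key=lambda item: (-item[1], item[0]))
    return [word for word, _ in kept]


# Placeholder counts, invented purely for the example.
sample_counts = {"selklama": 12, "nunklama": 7, "tolmelbi": 3, "zmasutra": 7}
print(words_to_process(sample_counts))
# ['selklama', 'nunklama', 'zmasutra']
```

Raising `min_usages` to 10 gives the stricter of the two cutoffs mentioned.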

The hard task will of course be place structures. We have of course got Nick's prior efforts at place-structure making from 1994, as well as an automated effort to build place structures for conversion lujvo using se, te, etc. plus a gismu. Nick has suggested using a similar automated procedure to generate lujvo and place structures for the special cases based on nu, ka, ni, mau, tol, nau, gau, sim, etc. This may take care of a good chunk of the lujvo already made.
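
For the conversion lujvo, the kind of automated procedure meant here might look something like the sketch below. It assumes only the standard behavior of se/te/ve/xe (each swaps x1 with x2/x3/x4/x5 respectively); the function name, the data layout and the one-word place labels are hypothetical, and the sketch ignores the cases where a hyphen letter is needed between the rafsi and the gismu.

```python
# rafsi of the conversion cmavo -> the place that is swapped with x1
CONVERSIONS = {"sel": 2, "ter": 3, "vel": 4, "xel": 5}


def conversion_lujvo(rafsi, gismu, places):
    """Build a conversion lujvo and its place structure from a gismu.

    places is the gismu's place structure as a list [x1, x2, ...].
    Returns (lujvo, new_places) with x1 and the converted place swapped.
    """
    n = CONVERSIONS[rafsi]
    if n > len(places):
        raise ValueError(f"{gismu} has no x{n} place to convert")
    new_places = list(places)  # copy so the gismu's list is left untouched
    new_places[0], new_places[n - 1] = new_places[n - 1], new_places[0]
    return rafsi + gismu, new_places


# klama: x1 goes to x2 from x3 via x4 by means x5 (labels abbreviated here).
klama = ["goer", "destination", "origin", "route", "means"]
print(conversion_lujvo("sel", "klama", klama))
# ('selklama', ['destination', 'goer', 'origin', 'route', 'means'])
```

The special cases based on nu, ka, ni and the rest would each need their own rule for how the places of the source gismu carry over, so they are not attempted here.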

For the remaining place structures, Nick feels that we need to abandon our attempt at perfection and careful analysis for each word in the dictionary. Instead, we should have a series of code symbols or font coding to indicate the level of confidence that we have in the place structure, and also to include a code for the place structure writer, who will have more or less credibility based on his perceived knowledge of the language and amount of lujvo analysis done. We also will not try to have complete place structures for every word we put in the dictionary, using the symbol codes to show the level of incompleteness.

The keywords, of course, typically define the x1 of any lujvo. For those lujvo that are generally namelike - used only as concrete references in sumti, we expect to stop at x1, though I myself would like to have an attempt to determine an x2 for each where it makes sense, since we have few brivla that are only one-place, and I think that one-place brivla will tend to damage the predicate nature of the language unless the words really are one-placers conceptually. But even I agree that we should start with the single place for these.

Those lujvo that are verblike, or which can translate as English verbs, should have complete place structures worked out. These lujvo will be the workhorses of the bridi structures in the language. We'll start with at least two places, and add oblique places if they are identifiable.

As with keywording, place structure work will be prioritized to emphasize words with higher frequency counts. We will use whatever net-based tools, such as CVS, people think are best suited.

BUT, we will put extremely low priority on including new proposed lujvo. We simply have to draw the line somewhere. Proposals that have complete place structures may be considered, if they have been compiled in a single place, such as Arnt's collection in the lojban list file section. But we aren't going to be looking for these.

We also aren't going to spend a lot of time worrying about whether or not these place structures are correct and complete; the author's name will determine the credibility. I will cite the discussion (which I didn't read closely) of which of several words to use for computer program and/or compiler as being the sort of thing we will likely ignore. Technical jargon is of lower interest for the dictionary anyway, and what seems like an argument between computer theorists and pragmatists looks like the makings of a "religious war" like Unix vs Windows or Emacs vs vi that we simply do not want to adjudicate. Again, completeness of the place structure will be relatively unimportant unless the word is seen to be verblike.

All lujvo will be marked to indicate whether they are complete or incomplete place structures, and we will also mark which place structures have been checked independently, and which place structures are from 1994 because those are both complete and have stood the test of time.
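
One way to picture such markings is as a small record per lujvo. The sketch below is purely illustrative: no symbols or field names have been decided, so `Confidence`, `LujvoEntry`, the flag letters and the author code are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Confidence(Enum):
    """Hypothetical confidence symbols; the real codes are not yet chosen."""
    HIGH = "+"
    MEDIUM = "~"
    LOW = "?"


@dataclass
class LujvoEntry:
    word: str
    keyword: str                 # typically defines x1
    places: list                 # place structure, possibly partial
    complete: bool               # is the place structure complete?
    author: str                  # code for the place-structure writer
    confidence: Confidence
    checked_independently: bool = False
    from_1994: bool = False      # taken from the 1994 place-structure work

    def marks(self):
        """Render the entry's status flags as a compact string."""
        flags = self.confidence.value
        flags += "C" if self.complete else "I"
        if self.checked_independently:
            flags += "V"
        if self.from_1994:
            flags += "94"
        return f"{self.word} [{flags}] ({self.author})"


entry = LujvoEntry("selklama", "destination", ["destination", "goer"],
                   complete=False, author="xx", confidence=Confidence.LOW)
print(entry.marks())
# selklama [?I] (xx)
```

In a printed dictionary the same information could equally be carried by font coding rather than letters.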

As with the gismu, we will use automated tools as much as possible to generate the English-Lojban entries for lujvo and fu'ivla.
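
The core of that generation step is inverting a Lojban-to-keyword list into an English-keyed index. The following is a minimal sketch assuming the keywords are available as a simple mapping; it is not the actual processing used for the gismu list, and the function name and sample data are hypothetical.

```python
from collections import defaultdict


def english_index(lojban_to_keywords):
    """Invert {valsi: [keywords]} into {keyword: [valsi, ...]}.

    Keywords are lowercased so that entries differing only in case merge;
    both keywords and the valsi under each keyword come out sorted.
    """
    index = defaultdict(list)
    for valsi, keywords in lojban_to_keywords.items():
        for keyword in keywords:
            index[keyword.lower()].append(valsi)
    return {kw: sorted(words) for kw, words in sorted(index.items())}


# Placeholder keyword data, invented for the example.
sample = {
    "selklama": ["destination"],
    "terklama": ["origin"],
    "klama": ["come", "go"],
}
for keyword, words in english_index(sample).items():
    print(f"{keyword}: {', '.join(words)}")
```

Formatting each resulting entry to match the rest of the dictionary would be a separate pass over this index.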

All this is still a bundle of work, but having broken it down into several smaller tasks and having set relative priorities, the NickFest participants at least now think that the dictionary project is do-able.

Other tasks to be worked on:

I intend by the time the level 0 book comes out to have a new issue of le lojbo karni to be shipped with it. If I get the address lists updated, then I will send it to everyone on the list, but le lojbo karni WILL resume publication with the level 0 package, regardless of whether we can properly support it as a subscription 'zine. It will also be published simultaneously on the net, and people who ask will be able to specify that they do not want to receive a printed copy when we are able to send them out (but don't tell me now). Frequency is yet to be determined, but it will probably be every 3 or 6 months.

Resuming JL will be a little harder, but we need to figure out what to do with it. We have some 130 subscribers to whom we owe at least one prepaid subscription issue, most of those having 3 or 4 issues left.

JL used to be composed primarily of Lojban List writings edited lightly, and mixed with some snailmail stuff that was sent directly to me. These days, almost all material appears on the net, and the most active Lojbanists, who read the list, have read more of it than I have. This makes it difficult to determine what content to put into a JL that will probably be read mostly by the people who are on Lojban List.

We can produce a few issues by mining the archives for the 8 years since the last JL issue - most list subscribers were not on the list more than a year or two ago, and the archives are so immense that a good editorial selection of the "best of the old list" will serve as a good core for JL, especially if edited for currency. We can also ask for articles and Lojban texts that have not been posted to the list once we get close to publication, and we will undoubtedly get some, if not enough to fill an issue. Probably after 3 or 4 issues we will run low on good archive material, but by then there will be more sense in the community of what kind of stuff to write for publication. We'll have to play it by ear.

But resuming JL will probably not take place this year. Too many higher priorities. Next year seems more plausible.

Finally, Nick has asked that we again call for a volunteer who can take his old "Adventure" (Colossal Cave) text translation and generate a Lojban version of the game. I have in the past noted that there is a system called "Inform" which is now used to generate adventure games easily for all platforms, and it has specific instructions for creating language-specific versions. There is an entire newsgroup dedicated to interactive-fiction writing using Inform and other tools. Someone can find a version of the Colossal Cave adventure using Inform and use Nick's translation to complete it. I also have an old hardcoded version of the program, I believe in some form of Basic, that can be modified by a good programmer; I started several years ago, and don't remember how far I got, but I was hampered by the need to figure out what the code was doing - I think it was poorly commented. This old version had some portions of the adventure hard-coded into the program (including the command set, which should be in Lojban with a "ko " prompt so that the user is forced to enter imperative commands), and that hardcoded part has not been translated, but the data files for the Basic program are what Nick completed. The Inform version probably has all the hard-coded stuff in data files, but I haven't studied this.

The job should not be that difficult for anyone with programming experience, and it might not take much Lojban expertise. The person who does this will likely learn how to use the Inform tool and can then coordinate translations of other adventure games (there are hundreds of them out there in a single repository file site in Germany and it has become a significant creative genre with annual competitions - plenty of good Lojban-learning experiences) or can attempt to write new ones.

Because this task has languished so long and because the potential is high for getting lots of good stuff going, I would like at least two people to volunteer to work on this either together or independently (the Inform work can easily be done independently of any attempt to fix the Basic program).

Lots done. Lots to do. Hope some people are inspired to do some work by this note.

lojbab
--
lojbab                                             lojbab@lojban.org
Bob LeChevalier, President, The Logical Language Group, Inc.
2904 Beau Lane, Fairfax VA 22031-1303 USA                    703-385-0273
Artificial language Loglan/Lojban:                 http://www.lojban.org