[lojban] Dictionary. Stage 2. Variable types interactions and anti-hermeneutics

We assume that Stage 0 was the publication by the LLG of the initial gismu.txt and cmavo.txt wordlists.

Stage 1 of writing the Dictionary, which followed (link; most of the discussion is by xorxes and gleki), showed:

A. which te sumti variable types should exist in Lojban

B. how they interact.

Here, at Stage 2, we deal with the following tasks:

1. Rethink the variable type system in light of the drawbacks of the one from Stage 1. Find rules for resolving variable type conflicts (assigning a value of one type to a te sumti of another type, aka “sumti-raising/lowering” etc.).

2. Polish the specification of te sumti interactions within every given brivla.

3. Rewrite the definitions of the most important cmavo (ignoring less-used cmavo). Ignore rarely used sumti, based on the assumption that “an omitted sumti is {zo’e ja zi’o}, not {zo’e}”. Add useful place keywords (translating them as nouns or adjectives).

4. Provide usage examples for EVERY SUMTI of EVERY BRIVLA and for cmavo.

5. Using Google Spreadsheet formulae, implement autogeneration of a print-ready dictionary from the spreadsheet. Make the spreadsheet friendlier for those who will later translate it into other languages.

Stage 2 result: http://mw.lojban.org/lmw/La_Bangu:_one-page_vlaste

Stage 2 explanation: http://mw.lojban.org/lmw/La_Bangu:_dictionary

In detail:

1. The “object” vs. “event” distinction did not prove useful in the brivla type system. It is gone. Instead, “entity” vs. “event” is used, which isn’t strictly semantic. “Apple” is an “entity”. “Waterfall” is an “entity” even if it is described as {lo nu lo djacu cu farlu}. Thus the philosophical issues of the object/event/property distinction are avoided here.

1a. “Event” is a te sumti type that accepts only an abstraction. Conflicts are resolved as described on the “La Bangu: dictionary” page. Even though {lo plise} could be described as a motion of elementary particles and thus as a process, {mi djica lo plise} can never mean “I want a process that we call ‘apple’ ”. This is because {djica2} is of the “event” type, so autocorrection takes place according to the rule of “putting an entity sumti into an event te bridi”. The sentence is thus assumed to mean {mi djica lo nu lo plise cu co’e} (this is the most common example of resolving type conflicts; this particular rule is otherwise known as “dealing with sumti-raising”). In particular, this rule, together with the entity/event type system, also solves the problem of implied raising in dunda2 as opposed to vecnu2.

1b. For pragmatic purposes, other minor types are used. Among them are “proposition” (du’u place), “property” (ka+ce’u place), “taxon”, “sound”, “text”, “number”, and “cmavo class”. An orthogonal type is “plural”.

1c. No place can take more than one type. If a place appears to take more than one (e.g. both “property” and “entity”), it actually takes only “property”, and “entity” is the result of sumti-lowering. Example: {mi cirko lo ckiku} vs. {mi cirko lo ka ce’u kanro}.
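The raising rule from 1a can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Sumti` and `fill_place` are hypothetical, only the two types “entity” and “event” are modelled, and only the one rule spelled out above is implemented. All other conflicts are left to the rules on the “La Bangu: dictionary” page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sumti:
    text: str  # the sumti as written, e.g. "lo plise"
    type: str  # "entity" or "event" in this sketch


def fill_place(place_type: str, sumti: Sumti) -> str:
    """Return the sumti text as it is understood in a place of the given type."""
    if place_type == sumti.type:
        # No conflict: the sumti is taken as written.
        return sumti.text
    if place_type == "event" and sumti.type == "entity":
        # Rule from 1a: an entity sumti in an event place is read as
        # an event involving that entity, with an unspecified selbri.
        return f"lo nu {sumti.text} cu co'e"
    # Other conflicts are not modelled in this sketch.
    raise ValueError(f"no rule for {sumti.type} in a {place_type} place")
```

With these assumptions, `fill_place("event", Sumti("lo plise", "entity"))` gives `lo nu lo plise cu co'e`, which is the reading of djica2 described in 1a.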

2. The dictionary now explains how te sumti interact within the te bridi array; this mostly happens via “ka+ce’u” places. If {kau} is assumed in a place, this is mentioned.

“Nonce property” places are those whose {ce’u} refers to a sumti that is not part of the place structure. E.g. in {mi pensi lo ka ce’u broda} the {ce’u} refers neither to pensi1 nor to any other known place of pensi.

3. Cmavo definitions have been rewritten according to common sense, removing cryptic words (as well as JCB’s pseudo-English legacy). BPFK definitions from the tiki have been taken into account.

4. Anti-hermeneutics mechanism. Lojban is a lost language, as shown by the endless discussions on IRC and on these mriste about what this or that word really means (a hermeneutics situation). Such discussions end either in “this is the most useful interpretation” or in “this makes no sense”. What the authors of the gismu places really meant can probably no longer be known. Here at Stage 2 a usage example has been provided for every place of every brivla. Where no usage example existed for a te bridi array element at the time of Stage 2, one was written on purpose. The Korpora Zei Sisku tool, FrameNet, the British National Corpus, Tatoeba.org, and help from various Lojbanists here in the three mriste and in the IRC channel have helped a lot in completing this task.

4a. No place in a usage example should be filled with a KOhA or LA/ZO sumti; this is a requirement for an example to count as successful. If this requirement is not fulfilled, this might be an indication that the place can’t be filled with anything else.

4b. Exception: “taxon” te sumti don’t have usage examples, since they mark names of taxa and thus {la} is applicable there (Lojbanized Linnaean names).
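The requirement in 4a could be checked mechanically, at least roughly. The sketch below is a crude word scan, not a parser: it flags any word from a small sample list, so it cannot tell which place a word fills and would also flag words inside abstractions. The function name `offending_words` and both word lists are assumptions made for illustration; the lists are deliberately partial and the real selma'o contain more members.

```python
# Partial, illustrative samples only (assumption): the real KOhA and LA
# selma'o are larger, and this check does not parse the sentence.
KOHA_SAMPLE = {"mi", "do", "ko", "ti", "ta", "tu", "ri", "ra", "ru", "ko'a", "zo'e"}
LA_ZO_SAMPLE = {"la", "zo"}


def offending_words(example: str) -> list[str]:
    """Return the words of a usage example that break requirement 4a."""
    # Normalise typographic apostrophes and drop the {...} quoting braces.
    cleaned = example.replace("’", "'").strip("{} ").lower()
    return [w for w in cleaned.split() if w in KOHA_SAMPLE or w in LA_ZO_SAMPLE]
```

An empty result means the example passes this rough check; a non-empty result lists the words that would need to be replaced by a description such as a {lo} sumti.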

5. As of now the source is in a Google spreadsheet. Definitions are assembled from pieces such as “x1”, “x2”, ..., the text between them, and the type declarations of each place (e.g. “(entity)”). Examples and place keywords, however many there are, are joined with their translations and attached to the definitions. The same is done for cmavo. The result is then displayed on a separate sheet in a mediawiki-friendly format, so that it can easily be pasted into a mediawiki page as shown in the link above.

5a. Luckily, no macros or scripts are needed. The built-in spreadsheet functions are enough. CONCATENATE, IF, OR, VLOOKUP and REGEXREPLACE are among the functions used most often in generating the Dictionary.

5b. A special URL can be generated showing the latest version of the Dictionary.
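The assembly described in 5 is done with spreadsheet formulae, but the same idea can be shown as a short Python sketch. Everything here is hypothetical: the function name `assemble_definition`, the shape of the input rows, the sample definition text, and the exact wiki markup are assumptions chosen for illustration, not the layout of the actual spreadsheet or of the published page.

```python
def assemble_definition(word, places, examples=()):
    """Join place pieces and examples into mediawiki-style text.

    places:   list of (type, text following the place) pairs, in order
    examples: (Lojban sentence, translation) pairs, any number of them
    """
    pieces = []
    for index, (place_type, text_after) in enumerate(places, start=1):
        # Each piece is the place variable plus its type declaration.
        piece = f"x{index} ({place_type})"
        if text_after:
            piece += " " + text_after
        pieces.append(piece)
    lines = [f"* '''{word}''': " + " ".join(pieces)]
    # Examples are attached below the definition, one line each.
    for lojban, translation in examples:
        lines.append(f"** ''{lojban}'': {translation}")
    return "\n".join(lines)
```

For instance, `assemble_definition("djica", [("entity", "wants"), ("event", "")], [("mi djica lo plise", "I want an apple.")])` produces a bold headword line followed by one indented example line, ready to paste into a wiki page.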

Future work:

This dictionary isn’t an official project; it is a trade-off between the official wordlists, CLL, later BPFK work, live usage in the IRC community, and the level of coverage of the semantic space.

01. For 99% of the language we now have at least one opinion, so that any clarification of or rival opinion on a given usage example, te bridi array element, glossword, definition etc. can now be heard and incorporated into the dictionary.

02. New output formats besides mediawiki, e.g. LaTeX, can now be suggested. Improvements to the existing output can be suggested as well.