From: Gleki Arxokuna
Date: Mon, 2 Feb 2015 13:16:45 +0300
Subject: [lojban] Dictionary. Stage 2. Variable types interactions and anti-hermeneutics
To: "lojban@googlegroups.com"
Reply-To: lojban@googlegroups.com

Stage 2 result: http://mw.lojban.org/lmw/La_Bangu:_one-page_vlaste
Stage 2 explanation: http://mw.lojban.org/lmw/La_Bangu:_dictionary

We assume that Stage 0 was the publication by LLG of the initial gismu.txt and cmavo.txt wordlists.
Stage 1 of writing the Dictionary, which followed (link; most of the discussion is by xorxes and gleki), showed
A. which te sumti variable types should exist in Lojban
B. how they interact.

Here, at Stage 2, we deal with the following tasks:
1. Rethink the variable type system based on the drawbacks of the one from Stage 1. Find rules for resolving variable type conflicts (assigning a value of one type to a te sumti of another type; aka “sumti-raising/lowering” etc.)
2. Polish the specification of te sumti interactions within every given brivla.
3. Rewrite the definitions of the most important cmavo (ignoring less used cmavo). Ignore rarely used sumti based on the “omitted sumti is {zo’e ja zi’o}, not {zo’e}” assumption; add useful place keywords (translating them as nouns or adjectives).
4. Provide usage examples for EVERY SUMTI of EVERY BRIVLA and for cmavo.
5. Using Google Spreadsheet formulae, implement autogeneration of a print-ready dictionary from the spreadsheet. Make the spreadsheet friendlier for future translators of it into other languages.

In detail:
1. The “object” vs. “event” distinction didn’t prove to be useful in the brivla type system. It is gone. Instead, “entity” vs. “event” is used, which isn’t strictly semantic. “Apple” is an “entity”. “Waterfall” is an “entity” even if it is described as {lo nu lo djacu cu farlu}. Thus the philosophical issues of the object/event/property distinction are avoided here.
1a. “Event” is a te sumti type that can accept only an abstraction. Conflicts are resolved as described on the “La Bangu: dictionary” page. Even if {lo plise} can be described as a motion of elementary particles and thus as a process, {mi djica lo plise} can never mean “I want a process that we call ‘apple’”. This is because {djica2} is of “event” type, so autocorrection according to the rule of “putting an entity sumti into an event te bridi” takes place. Thus it is assumed to mean {mi djica lo nu lo plise cu co’e} (this is the most common example of resolving type conflicts; this particular rule is otherwise known as “dealing with sumti-raising”). In particular, this together with the entity/event type system also solves the problem of implied raising in dunda2 as opposed to vecnu2.
1b. For pragmatic purposes other minor types are used. Among them are “proposition” (du’u place), “property” (ka+ce’u place), “taxon”, “sound”, “text”, “number”, “cmavo class”. An orthogonal type is “plural”.
1c. No place can take more than one type. If a place appears to take two (e.g. both “property” and “entity”), it means it can take only “property”, and “entity” is the result of sumti-lowering. Example: {mi cirko lo ckiku} vs. {mi cirko lo ka ce’u kanro}.
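
The entity-into-event rule from 1a can be sketched in Python as follows. This is only an illustration under stated assumptions: the PLACE_TYPES table, the resolve function and the plain-string representation of sumti are made up for the example and are not part of the dictionary. Only the “event” type of {djica2} is taken from the text above. Because the types are not strictly semantic (see the waterfall example), the caller states the type of the sumti instead of the code guessing it. The other conflict rules live on the “La Bangu: dictionary” page and are not reproduced here.

```python
# Hypothetical table of te sumti types, keyed by (brivla, place number).
# Only djica2 = "event" is stated in the message.
PLACE_TYPES = {
    ("djica", 2): "event",
}


def resolve(brivla, place, sumti, sumti_type):
    """Return the sumti as it is assumed to be read in the given place.

    sumti_type is supplied by the caller ("entity" or "event").
    """
    expected = PLACE_TYPES[(brivla, place)]
    if expected == sumti_type:
        # No conflict: the sumti is used as written.
        return sumti
    if expected == "event" and sumti_type == "entity":
        # Rule from 1a: an entity put into an event place is read as
        # {lo nu <entity> cu co'e}.
        return f"lo nu {sumti} cu co'e"
    # Any other conflict is outside this sketch.
    raise ValueError(f"no rule sketched for {sumti_type} in a {expected} place")


print("mi djica", resolve("djica", 2, "lo plise", "entity"))
```

With the values above, {mi djica lo plise} is expanded to {mi djica lo nu lo plise cu co'e}, matching the reading given in 1a.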
2. The dictionary now explains how te sumti interact within the te bridi array; this mostly happens via “ka+ce’u” places. If {kau} is assumed in a place, it is mentioned.
“Nonce property” places are places that have a {ce’u} referring to a sumti that is not part of the place structure. E.g. in {mi pensi lo ka ce’u broda} the {ce’u} refers neither to pensi1 nor to any other known place of pensi.
3. cmavo definitions have been rewritten according to common sense, removing cryptic words (as well as JCB’s pseudo-English legacy). BPFK definitions from the tiki have been taken into account.
4. Anti-hermeneutics mechanism. Lojban is a lost language, as shown by endless discussions in IRC and on these mriste about what this or that word really means (a hermeneutic situation). Such discussions end either in “this is the most useful interpretation” or “this makes no sense”. What the authors of the gismu places really meant can probably no longer be known. Here at Stage 2 a usage example has been provided for every place of every brivla. Usage examples for te bridi array elements that were missing at the time of Stage 2 were forced to appear. The Korpora Zei Sisku tool, FrameNet, the British National Corpus, Tatoeba.org, and help from various Lojbanists here on the three mriste and in the IRC channel have helped a lot to complete this task.
4a. No place in a usage example should be filled with a KOhA or LA/ZO sumti; this is a requirement for an example to be successful. If this requirement is not fulfilled, this might be an indication that such a place can’t be filled with anything else.
4b. Exception: “taxon” te sumti don’t have usage examples since they mark names of taxa and thus {la} is applicable there (Lojbanized Linnaean names).
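
The requirement in 4a could be checked mechanically along these lines. This is a rough Python sketch, not the procedure actually used for the dictionary: the flag_places function is hypothetical, the KOhA set is a deliberately partial sample of that selma'o, and a sumti is treated as a plain whitespace-separated string rather than being parsed.

```python
# Partial, illustrative word lists; the real KOhA selma'o is much larger.
KOHA = {"mi", "do", "ko", "ti", "ta", "tu", "ri", "ra", "ru", "zo'e", "ko'a", "da"}
LA = {"la", "lai", "la'i"}
ZO = {"zo"}


def flag_places(places):
    """Return the numbers of the places filled with a KOhA, LA or ZO sumti.

    places maps a place number to the sumti text used in the example.
    """
    flagged = []
    for number, sumti in sorted(places.items()):
        words = sumti.split()
        if not words:
            continue  # an empty place is not judged here
        first = words[0]
        is_koha = len(words) == 1 and first in KOHA
        if is_koha or first in LA or first in ZO:
            flagged.append(number)
    return flagged


# {mi djica lo plise}: djica1 is filled with the KOhA word {mi}.
print(flag_places({1: "mi", 2: "lo plise"}))
```

A flagged place only means that the example does not yet demonstrate that place with a full description; per 4b, “taxon” places would have to be exempted from such a check.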
5. As of now the source is in a Google spreadsheet. Definitions are assembled from such pieces as “x1”, “x2”, ..., the text between them, and the type declarations of each place (e.g. “(entity)”). Examples and place keywords, no matter how many of them there are, are joined with their translations and attached to the definitions. The same is done for cmavo. The result is then displayed on a separate sheet in a mediawiki-friendly format so that it can be easily pasted into a mediawiki page, as shown in the link above.
5a. Luckily, no macros/scripts are needed. The built-in spreadsheet functions are enough. CONCATENATE, IF, OR, VLOOKUP, REGEXREPLACE are among the most frequently used functions generating the Dictionary.
5b. A special URL can be generated showing the latest version of the Dictionary.
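
For readers who prefer code to spreadsheet formulae, the assembly described in 5 can be approximated in Python. This is a sketch of the idea only; the real dictionary is generated by spreadsheet functions, not by this code. The assemble_definition function, its arguments and the sample row are hypothetical, and the output uses MediaWiki's definition-list syntax (“; term : definition”) as an assumed target layout, since the exact layout of the generated page is not given here.

```python
def assemble_definition(word, pieces, types, examples=()):
    """Join the pieces of one entry into a single MediaWiki line.

    pieces   -- text fragments placed after the place labels
                (pieces[0] follows x1, pieces[1] follows x2, and so on)
    types    -- type declaration of each place, e.g. "entity" or "event"
    examples -- (lojban, translation) pairs attached after the definition
    """
    parts = []
    for i, place_type in enumerate(types, start=1):
        parts.append(f"x{i} ({place_type})")
        # Add the text that follows this place label, if there is any.
        if i - 1 < len(pieces) and pieces[i - 1]:
            parts.append(pieces[i - 1])
    line = f"; {word} : " + " ".join(parts)
    joined = "; ".join(f"{lojban} = {english}" for lojban, english in examples)
    if joined:
        line += f" ({joined})"
    return line


# Sample row; the "entity" type for x1 and the wording are illustrative only.
print(assemble_definition(
    "djica",
    pieces=["wants"],
    types=["entity", "event"],
    examples=[("mi djica lo plise", "I want an apple")],
))
```

Keeping the place labels, type declarations and connecting text as separate fields, as the spreadsheet does, is what lets a translator replace only the connecting text while the structure stays untouched.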
Future work:
This dictionary isn’t an official project; it is a trade-off between the official wordlists, CLL, later BPFK work, live usage in the IRC community, and the level of coverage of the semantic space.
01. For 99% of the language we now have at least one opinion, so that any clarification or rival opinion on a given usage example, te bridi array element, glossword, definition etc. can now be listened to and pushed into the dictionary.
02. New output formats apart from mediawiki can now be suggested, e.g. LaTeX. Improvements to the existing output can also be suggested.
