Date: Tue, 24 Jun 2014 10:35:16 -0700 (PDT)
From: la durka
To: lojban@googlegroups.com
Subject: [lojban] Re: jbovlaste updated with camxes-morphology

FYI, this broke vlasisku's import. I've fixed it in the latest revision at github.com/lojban/vlasisku (and my Vlasisku instance is running with an updated export from yesterday).

As for camxes.lojban.org, I believe it is updated, but I could be wrong. For instance, it rejects {bliardo} but accepts {bliiardo} and {bolrbliardo}. And that leads to my question -- how is {bolrbliardo} legal but {bliardo} illegal? What is the difference, besides the prefix?

mi'e la durka mu'o

On Tuesday, June 24, 2014 at 09:16:26 UTC-4, Riley Martinez-Lynch wrote:

coi jbopre

jbovlaste has been updated to apply camxes morphology when new words are entered. The new morphological classifier, "vlatai.py", is part of the camxes-py Python parser, and replaces "vlatai", which is bundled with the jbofihe parser.

vlatai.py adds two types: "bu-letterals" (previously classified as "cmavo" or "cmavo cluster") and "zei-lujvo" (previously classified as "lujvo"). These new types are subject to camxes parser rules: Invalid constructs such as {bu bu} and {zei zei lujvo} are rejected.

Other "magic words" such as {zo} and {zoi} are not currently supported in combination with {bu} and {zei}. This is an oversight rather than a design choice, so please feel free to file a bug report if you find this is needed.

The 21,940 valsi currently registered in jbovlaste were verified with the new classifier: 21,829 reported no change, 10 were reclassified as bu-letterals, 26 were reclassified as zei-lujvo, 1 was reclassified from fu'ivla to lujvo, and 74 valsi were marked as "obsolete": cmevla (22), fu'ivla (51) and zei-lujvo (1).
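
As a quick sanity check on the tally above, the reported counts can be added up. This is only a sketch: the variable names are made up for illustration, and it assumes the five outcomes are mutually exclusive, which the message implies but does not state outright.

```python
# Counts copied from the message; names are illustrative, not from jbovlaste.
total_valsi = 21_940

outcomes = {
    "no change": 21_829,
    "reclassified as bu-letteral": 10,
    "reclassified as zei-lujvo": 26,
    "reclassified from fu'ivla to lujvo": 1,
    "marked obsolete": 74,
}

# Breakdown of the "obsolete" group by word type, as reported.
obsolete_by_type = {"cmevla": 22, "fu'ivla": 51, "zei-lujvo": 1}

# Assumption: each registered valsi falls into exactly one outcome.
assert sum(outcomes.values()) == total_valsi
# The per-type obsolete counts should add up to the obsolete total.
assert sum(obsolete_by_type.values()) == outcomes["marked obsolete"]

# Number of valsi whose classification changed in any way.
changed = total_valsi - outcomes["no change"]
print(f"{changed} of {total_valsi} valsi changed ({changed / total_valsi:.2%})")
```

Both assertions pass with the figures as given, so the numbers in the message are internally consistent: 111 valsi were affected in total, roughly half a percent of the dictionary.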

Details of the reclassified words can be found here:

https://github.com/lojban/jbovlaste/issues/47

https://github.com/lojban/jbovlaste/issues/39

https://github.com/lojban/jbovlaste/issues/40

https://github.com/lojban/jbovlaste/issues/43

https://github.com/lojban/jbovlaste/issues/44

The new "obsolete" valsi types are currently treated like the "experimental" types in XML and PDF exports: They are marked with a warning.

la gleki raised the issue that some words (e.g. {relmast}) that don't conform to this version of camxes ought in fact to be valid. xorxes noted that only older versions of the camxes/BPFK morphology prohibit such words.

I checked {relmast} against the Java/Rats! version of camxes which is linked on the "Issues With The Lojban Formal Grammar" page: It was not accepted. It was also not accepted by camxes.js or either the standard or experimental ilmentufa grammars. I also checked python-camxes, but it uses the same version of the Java jar that was described above.

I built a new camxes Java/Rats! jar using the latest morphology on the tiki, and I can confirm that according to this version of the grammar, {relmast} is valid. However, it's not clear whether such a jar is currently distributed anywhere.

Based on all of this, my inclination is to update camxes-py as soon as possible to use the newest BPFK morphology (where "newest" may mean n years old). However, if I do this, it will no longer be in sync with most other implementations of camxes currently distributed. Thoughts, anyone?

Thanks to rlpowell and tene for their assistance in getting the new software installed.

mi'e la mukti mu'o
