Date: Tue, 24 Jun 2014 06:16:26 -0700 (PDT)
From: Riley Martinez-Lynch
To: lojban@googlegroups.com
Subject: [lojban] jbovlaste updated with camxes-morphology

coi jbopre

jbovlaste has been updated to apply camxes morphology when new words are entered. The new morphological classifier, "vlatai.py", is part of the camxes-py Python parser and replaces "vlatai", which is bundled with the jbofihe parser.

vlatai.py adds two types: "bu-letterals" (previously classified as "cmavo" or "cmavo cluster") and "zei-lujvo" (previously classified as "lujvo"). These new types are subject to camxes parser rules: Invalid constructs such as {bu bu} and {zei zei lujvo} are rejected.

Other "magic words" such as {zo} and {zoi} are not currently supported in combination with {bu} and {zei}. This is an oversight rather than a design choice, so please feel free to file a bug report if you find this is needed.

The 21,940 valsi currently registered in jbovlaste were verified with the new classifier: 21,829 reported no change, 10 were reclassified as bu-letterals, 26 were reclassified as zei-lujvo, 1 was reclassified from fu'ivla to lujvo, and 74 valsi were marked as "obsolete": cmevla (22), fu'ivla (51) and zei-lujvo (1).
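
As a quick sanity check on the figures above, here is a minimal Python sketch that re-adds them. The numbers are copied from this message; the names (TOTAL_VALSI, outcomes, obsolete_by_type, tallies_agree) are made up for illustration and are not part of jbovlaste or camxes-py.

```python
# Figures copied from the message; all identifiers here are hypothetical.
TOTAL_VALSI = 21_940

# Outcome of re-running the classifier over every registered valsi.
outcomes = {
    "no change": 21_829,
    "reclassified as bu-letteral": 10,
    "reclassified as zei-lujvo": 26,
    "reclassified from fu'ivla to lujvo": 1,
    "marked obsolete": 74,
}

# Breakdown of the "obsolete" group by valsi type.
obsolete_by_type = {"cmevla": 22, "fu'ivla": 51, "zei-lujvo": 1}


def tallies_agree(total, outcomes, obsolete_by_type):
    """Return True if both the outcome counts and the obsolete breakdown add up."""
    outcomes_ok = sum(outcomes.values()) == total
    obsolete_ok = sum(obsolete_by_type.values()) == outcomes["marked obsolete"]
    return outcomes_ok and obsolete_ok


if __name__ == "__main__":
    print(tallies_agree(TOTAL_VALSI, outcomes, obsolete_by_type))
```

Both sums come out even: the five outcome counts total 21,940, and the three obsolete types total 74.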

Details of the reclassified words can be found here:

https://github.com/lojban/jbovlaste/issues/47

https://github.com/lojban/jbovlaste/issues/39

https://github.com/lojban/jbovlaste/issues/40

https://github.com/lojban/jbovlaste/issues/43

https://github.com/lojban/jbovlaste/issues/44

The new "obsolete" valsi types are currently treated like the "experimental" types in XML and PDF exports: They are marked with a warning.

la gleki raised the issue that some words (e.g. {relmast}) which don't conform to this version of camxes ought in fact to be valid. xorxes noted that only older versions of the camxes/BPFK morphology prohibit such words.

I checked {relmast} against the Java/Rats! version of camxes which is linked on the "Issues With The Lojban Formal Grammar" page: It was not accepted. It was also not accepted by camxes.js or either the standard or experimental ilmentufa grammars. I also checked python-camxes, but it uses the same version of the Java jar that was described above.

I built a new camxes Java/Rats! jar using the latest morphology on the tiki, and I can confirm that according to this version of the grammar, {relmast} is valid. However, it's not clear whether such a jar is currently distributed anywhere.

Based on all of this, my inclination is to update camxes-py as soon as possible to use the newest BPFK morphology (where "newest" may mean n years old). However, if I do this, it will no longer be in sync with most other implementations of camxes currently distributed. Thoughts, anyone?

Thanks to rlpowell and tene for their assistance in getting the new software installed.

mi'e la mukti mu'o
