From: gleki.is.my.name@gmail.com
To: lojban
Reply-To: lojban@googlegroups.com
Date: Sun, 8 Jan 2017 01:23:53 -0800 (PST)
Subject: [lojban] Re: la cmaxes, a minimal morphology parser
In-Reply-To: <4a174e16-b53a-46dd-9e06-130b27c52fe0@googlegroups.com>
References: <4a174e16-b53a-46dd-9e06-130b27c52fe0@googlegroups.com>

On Tuesday, 20 December 2016 at 18:05:21 UTC+3, cogas uasanbon wrote:
>
> On Friday, 25 December 2015 at 21:02:19 UTC+9, Gleki Arxokuna wrote:
>>
>> Here is a short peg.js parser of the morphology of Lojban words.
>>
>> Features:
>> 1. only checks the morphology of words; the rest is thrown away. Hence
>> you don't need much prettification; a simple
>>
>> '[["cmavo","coi"],["cmavo","do"],["cmavo","mi"],["gismu","tavla"],["cmavo","do"]]'
>>
>> is returned.
>> 2. useful when you need a parser of minimal size: morfologi.js, the
>> compiled parser ready for use by JavaScript-compatible apps, is under 30
>> kilobytes of uncompressed (but minified) JavaScript.
>> 3. can help you study Lojban morphology from the PEG, which is easier to
>> grasp when everything else is removed.
>> 4. can help restore omitted spaces within compound cmavo and similar (so
>> that you can apply your own writing conventions).
>> 5. somewhat faster than the full grammar parser when you run numerous
>> queries. E.g. this parser is now used in the la sutysisku app to
>> automatically determine which word class a given dictionary entry
>> belongs to.
>> 6. doesn't support zoi ... zoi quotations (a separate preprocessor is
>> needed).
>> 7. http://mw.lojban.org/extensions/ilmentufa/morfologi.html
>>
>
> {zarrja}, {zallja}, {zammja}, {zannja} are grammatical in this ilmentufa.
> Is that a bug?
>

It detects the word classes of grammatically correct words.
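For anyone wondering what to do with the output quoted in feature 1, here is a minimal sketch of consuming it. It assumes the result arrives as the JSON string shown in the announcement; the parser's actual entry point isn't given in this thread, so the sample string is hard-coded and the helper names (`readWords`, `countByClass`, `respace`) are made up for illustration and are not part of morfologi.js.

```javascript
// Sample output copied from the announcement; in real use it would come
// from the morphology parser (entry point not shown in this thread).
const parserOutput =
  '[["cmavo","coi"],["cmavo","do"],["cmavo","mi"],["gismu","tavla"],["cmavo","do"]]';

// Hypothetical helper: turn the JSON string into { wordClass, word } records.
function readWords(json) {
  return JSON.parse(json).map(([wordClass, word]) => ({ wordClass, word }));
}

// Hypothetical helper: count how many words fall into each word class.
function countByClass(words) {
  const counts = {};
  for (const { wordClass } of words) {
    counts[wordClass] = (counts[wordClass] || 0) + 1;
  }
  return counts;
}

// Hypothetical helper: re-join the words with single spaces, in the spirit
// of feature 4 (restoring omitted spaces between words).
function respace(words) {
  return words.map(({ word }) => word).join(" ");
}

const words = readWords(parserOutput);
console.log(countByClass(words)); // { cmavo: 4, gismu: 1 }
console.log(respace(words));      // coi do mi tavla do
```

Since each entry is just a [class, word] pair, nothing beyond `JSON.parse` is needed to work with it, which fits the "not much prettification" point above.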