From: Gleki Arxokuna
Date: Fri, 25 Dec 2015 15:01:38 +0300
Subject: [lojban] la cmaxes, a minimal morphology parser
To: "lojban@googlegroups.com"

Here is a short peg.js parser for the morphology of Lojban words.

Features:
1. It only checks the morphology of words; everything else is thrown away. Hence the output needs little prettification: a simple
'[["cmavo","coi"],["cmavo","do"],["cmavo","mi"],["gismu","tavla"],["cmavo","do"]]'
is returned.
2. It is useful when you need a parser of minimal size. The morfologi.js file, the compiled parser ready for use by JavaScript-compatible apps, is under 30 kilobytes of uncompressed (but minified) JavaScript.
3. It can help you study Lojban morphology from the PEG, which is easier to grasp when everything else is removed.
4. It can help restore omitted spaces within compound cmavo and similar (so that you can apply your own writing conventions).
5. It is somewhat faster than the full grammar parser when you run numerous queries. E.g. this parser is now used in the la sutysisku app to automatically determine which word class a given dictionary entry belongs to.
6. It doesn't support zoi ... zoi quotations (a separate preprocessor is needed).
7. http://mw.lojban.org/extensions/ilmentufa/morfologi.html
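
The result format quoted in item 1 can be consumed along the following lines. This is only a sketch: the sample string is copied from the message, while the helper names (toPairs, respace, countClasses) are hypothetical and not part of morfologi.js. Whether the parser returns a JSON string or a ready-made array is an assumption, so the sketch accepts both.

```javascript
// Sample result copied from item 1 of the message. It is assumed to be a
// JSON-serialised list of [word class, word] pairs.
const sample =
  '[["cmavo","coi"],["cmavo","do"],["cmavo","mi"],["gismu","tavla"],["cmavo","do"]]';

// Hypothetical helper: accept either the JSON string or an already parsed
// array and always hand back an array of [class, word] pairs.
function toPairs(result) {
  return typeof result === "string" ? JSON.parse(result) : result;
}

// Hypothetical helper: rejoin the recognised words with single spaces,
// in the spirit of item 4 (restoring omitted spaces).
function respace(result) {
  return toPairs(result)
    .map(([, word]) => word)
    .join(" ");
}

// Hypothetical helper: count how many words fall into each word class,
// similar in spirit to the dictionary classification mentioned in item 5.
function countClasses(result) {
  const counts = {};
  for (const [wordClass] of toPairs(result)) {
    counts[wordClass] = (counts[wordClass] || 0) + 1;
  }
  return counts;
}

console.log(respace(sample)); // prints "coi do mi tavla do"
console.log(countClasses(sample)); // prints { cmavo: 4, gismu: 1 }
```

The helpers only touch the documented output shape, so they should keep working regardless of how the compiled parser itself is loaded or called.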

--
You received this message because you are subscribed to the Google Groups "lojban" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lojban+unsubscribe@googlegroups.com.
To post to this group, send email to lojban@googlegroups.com.
Visit this group at https://groups.google.com/group/lojban.
For more options, visit https://groups.google.com/d/optout.