
Re: [lojban] la cmaxes, a minimal morphology parser





2015-12-25 17:14 GMT+03:00 Jorge Llambías <jjllambias@gmail.com>:


On Fri, Dec 25, 2015 at 10:59 AM, Gleki Arxokuna <gleki.is.my.name@gmail.com> wrote:
bgv = [bgv] hgu

jz = [jz] hgu

cs = [cs] hgv !cs !x
Oops, the website wasn't updated. I will fix it later. Or you can just clear the appcache for it.
The "!x" isn't necessary here at all. I removed it:
http://mw.lojban.org/extensions/ilmentufa/morfologi.js.peg

But then you allow "bacxa".
 
pf = [pf] hgv

Unfortunately, you can't do this. The !x after cs is wrong because it will reject, for example, "vasxu". But more importantly, no consonant follows the same rules as any other consonant. You removed the restriction against double consonants, so "babba" will parse as a gismu.

The only two letters that share identical rules are e and o.
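
The "babba" point can be illustrated with a minimal sketch. It assumes a deliberately simplified gismu test (only the CVCCV/CCVCV letter shape plus a doubled-consonant check) and is not the grammar from morfologi.js.peg. The names `GISMU_SHAPE`, `looksLikeGismuLoose` and `looksLikeGismuStrict` are hypothetical.

```javascript
// Hypothetical, simplified sketch: real gismu rules restrict many more
// consonant pairs than the doubled-consonant case checked here.
const C = "[bcdfgjklmnprstvxz]"; // Lojban consonants
const V = "[aeiou]";             // vowels allowed in a gismu

// Letter shape only: CVCCV or CCVCV.
const GISMU_SHAPE = new RegExp(
  `^(?:${C}${V}${C}${C}${V}|${C}${C}${V}${C}${V})$`
);

// Any consonant immediately followed by itself.
const DOUBLED_CONSONANT = new RegExp(`(${C})\\1`);

// Without the restriction, only the shape is checked.
function looksLikeGismuLoose(word) {
  return GISMU_SHAPE.test(word);
}

// With the restriction, doubled consonants are rejected as well.
function looksLikeGismuStrict(word) {
  return GISMU_SHAPE.test(word) && !DOUBLED_CONSONANT.test(word);
}

console.log(looksLikeGismuLoose("babba"));  // true
console.log(looksLikeGismuStrict("babba")); // false
console.log(looksLikeGismuStrict("vasxu")); // true
```

In this sketch "vasxu" passes both checks, while "babba" passes only when the doubled-consonant restriction is left out.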

Indeed, thanks for noticing. I need to explain this parser better, because it is based on a different approach.

Namely, it preprocesses the input using a bunch of regexes.
So {zk} turns into {zyk}, {bb} into {byb}, etc.
The idea is that the parser expects correct language as its input and determines word classes, but does not report mistakes in the input.
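
A minimal sketch of such a preprocessing step, covering only the two cases named above (doubled consonants and the pair {zk}). The actual regexes used by the parser are not shown in this thread, so the patterns and the function name `preprocess` are assumptions for illustration.

```javascript
// Hypothetical sketch: not the actual preparser, only the two
// rewrites mentioned in the message ({bb} -> {byb}, {zk} -> {zyk}).
const CONSONANTS = "bcdfgjklmnprstvxz";

// A consonant that is immediately followed by the same consonant.
const DOUBLED = new RegExp(`([${CONSONANTS}])(?=\\1)`, "g");

// "z" immediately followed by "k".
const ZK = /z(?=k)/g;

// Insert "y" between the two consonants of each matched pair.
function preprocess(text) {
  return text.replace(DOUBLED, "$1y").replace(ZK, "zy");
}

console.log(preprocess("zk"));    // zyk
console.log(preprocess("bb"));    // byb
console.log(preprocess("babba")); // babyba
```

With a step like this, "babba" would reach the grammar as "babyba", so the grammar itself never sees the doubled consonant.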

If only correct language is expected as input, then why have any restrictions at all? Why is the !cs needed, for example?

Yes, it isn't needed either. 


And what's the point of using a preparser to handle things that PEG can handle just fine? It seems that you're making the morphology harder, not easier, to grasp by hiding some things in the preparser.

Well, then this parser can simply be forked into another one meant for people learning the morphology.
The current one is mostly meant for quickly recovering the word classes of words that are assumed to be grammatical, and for restoring the spaces in cmavo compounds.


mu'o mi'e xorxes

--
You received this message because you are subscribed to the Google Groups "lojban" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lojban+unsubscribe@googlegroups.com.
To post to this group, send email to lojban@googlegroups.com.
Visit this group at https://groups.google.com/group/lojban.
For more options, visit https://groups.google.com/d/optout.
