On 1 June 2012 22:13, Robin Lee Powell <rlpowell@digitalkingdom.org> wrote:

Besides camxes, what have people gotten the Lojban PEG running in?

I've been working intermittently on a Lua[1] version using the LPeg library[2], originally out of plain curiosity. This very lightweight combination (the Lua compiler, byte code interpreter, VM and basic libraries total about 160 kB, and the LPeg library is about 39 kB) allows running a re-notated PEG as a normal Lua program: there is no parser generator and no specifically written parser program. There is only one drawback: a parser can be relatively slow, as the library doesn't employ the Packrat methodology. This is because it was primarily designed for pattern matching even in very large, mainly linear data sets, which would choke a Packrat-based parser.[3]

After defining the non-terminals for LPeg, which is a necessary step so that Lua knows which operators to overload, the LPeg notation is a rather simple transformation of the original PEG code. Here is an example:

final_syllable = onset * -y * -stressed * nucleus * -cmene * #post_word,
stressed_syllable = #stressed * syllable + syllable * #stress,
stressed_diphthong = #stressed * diphthong + diphthong * #stress,
stressed_vowel = #stressed * vowel + vowel * #stress,
unstressed_syllable = -stressed * syllable * -stress + consonantal_syllable,
unstressed_diphthong = -stressed * diphthong * -stress,
unstressed_vowel = -stressed * vowel * -stress,
stress = consonant^0 * y^-1 * syllable * pause,
stressed = onset * comma^0 * S"AEIOU",
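For readers unfamiliar with LPeg, the correspondence between the PEG operators and the notation used above can be sketched roughly as follows. This is only a toy illustration, assuming the lpeg module is installed; the names vowel, consonant, cv and so on are made up for the example and are not the actual morphology rules.

```lua
local lpeg = require("lpeg")
local S = lpeg.S

-- Toy terminals (hypothetical, not taken from the Lojban morphology PEG).
local vowel     = S("aeiou")
local consonant = S("bcdfgjklmnprstvxz")

-- PEG: consonant vowel     -> a sequence is written with *
local cv = consonant * vowel
-- PEG: vowel / consonant   -> ordered choice is written with +
local letter = vowel + consonant
-- PEG: consonant* vowel?   -> e^0 is "zero or more", e^-1 is "at most one"
local tail = consonant^0 * vowel^-1
-- PEG: &vowel              -> #e looks ahead without consuming input
local before_vowel = #vowel
-- PEG: !vowel consonant    -> -e is negative lookahead
local not_vowel = -vowel * consonant

-- match returns the position just after the match, or nil on failure.
print(cv:match("ba"))            -- 3
print(letter:match("x"))         -- 2
print(tail:match("str"))         -- 4
print(before_vowel:match("a"))   -- 1
print(not_vowel:match("a"))      -- nil
```

Since unary minus and # bind more tightly than * and +, the rules can be written without extra parentheses, which is what keeps the transformation from the PEG so mechanical.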

In order to handle recursion, these statements are put inside an associative array (a Lua table) definition, which then serves as the grammar. The left-hand sides are used as indices and the right-hand sides as the element values. This way the Lua interpreter doesn't need to know anything about the recursion; everything is handled behind the scenes by the LPeg library, which starts from the first element in the array and traverses it using the non-terminal names in the right-hand sides as indices to access the corresponding rules. This is quite an ingenious system utilizing the built-in meta-mechanisms of Lua.
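A minimal sketch of such a grammar table, assuming the lpeg module is installed, might look like this. The rules expr, term and number are a made-up toy grammar chosen only to show the recursion; they have nothing to do with the Lojban PEG.

```lua
local lpeg = require("lpeg")
local P, R, V = lpeg.P, lpeg.R, lpeg.V

-- Declare the non-terminals up front so that the rule bodies can be
-- written with bare names, as in the fragment above.
local expr, term, number = V("expr"), V("term"), V("number")

-- The grammar is an ordinary Lua table. Index 1 names the initial rule;
-- every other key is a rule name and its value is the rule body.
local grammar = P({
  "expr",
  expr   = term * (P("+") * term)^0,
  term   = number + P("(") * expr * P(")"),   -- recursion back into expr
  number = R("09")^1,
})

-- Require end of input after the grammar so partial matches are rejected.
local whole = grammar * -1

print(whole:match("1+(2+3)"))   -- 8
print(whole:match("1+(2+"))     -- nil
```

The V("name") values are just placeholders; LPeg resolves them against the table keys when the table is turned into a pattern, so the rules may refer to each other in any order.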

I'm just testing the morphology PEG, including the classification of cmavo, and my present version seems to work quite decently unless fed lots of somewhat nasty strings like "rafytestudine". A three-year-old, quite average office PC handles "Alice" in 20 seconds, and the original Asus EeePC (with an 800 MHz Celeron) needs slightly less than 2 minutes, which is still quite decent for many purposes. The morphology test sentence data set, with a lot of nasty words, takes 4.5 minutes on the office PC. The source text can be fed to the PEG in arbitrary slices, even the whole test sentence data set as one block.

I made three small changes in the morphology PEG, two of which ought not to matter in the parser context even in theory, and one which might but did not change the output even for the test sentence data set. These changes resulted in a speedup of about 100%, but might not matter when using a Packrat parser.

1) removed !cmene from the rule for cmavo

2) removed !gismu !fuhivla !cmavo from the rule for lujvo

3) moved !cmavo from the rule for brivla-head to the rule for fuhivla-head

The PEG script is compiled for each run, but that doesn't really matter, as the compilation takes only about 50 ms on the office PC. The Lua interpreter is also available during the execution of the program and can be used to run internally generated scripts, which often makes things much simpler. A very advanced LuaJIT compiler[4] is also available but doesn't really help at the PEG stage. It can, however, offer a substantial speedup in other parts of the program system.
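As a rough illustration of running an internally generated script, the following sketch builds a small piece of Lua source as text at run time, compiles it and times the step with os.clock. The word list and the names compile and is_cmavo are hypothetical and not part of my program; loadstring is the Lua 5.1/LuaJIT call, and load takes its place in later Lua versions.

```lua
-- loadstring exists in Lua 5.1 and LuaJIT; Lua 5.2+ uses load for strings.
local compile = loadstring or load

-- Build a small script as text at run time (hypothetical example:
-- a lookup table generated from a word list).
local words = { "mi", "do", "ti" }
local lines = { "return {" }
for _, w in ipairs(words) do
  lines[#lines + 1] = string.format("  [%q] = true,", w)
end
lines[#lines + 1] = "}"
local source = table.concat(lines, "\n")

local t0 = os.clock()
local chunk = assert(compile(source))  -- compile the generated text
local is_cmavo = chunk()               -- run it to obtain the table
local elapsed = os.clock() - t0

print(is_cmavo["mi"], is_cmavo["klama"])
print(string.format("compiled and ran in %.3f s", elapsed))
```

The same os.clock bracket can be put around the construction of the grammar to check the compilation time on a given machine.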

I must still check the conversion and do some tidying up before moving on to the syntax PEG and the glue between the processing stages.

    Veijo

[1] http://www.lua.org
[2] http://www.inf.puc-rio.br/~roberto/lpeg/
[3] http://www.inf.puc-rio.br/~roberto/docs/peg.pdf
[4] http://luajit.org

--
You received this message because you are subscribed to the Google Groups "lojban" group.
To post to this group, send email to lojban@googlegroups.com.
To unsubscribe from this group, send email to lojban+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/lojban?hl=en.