Envelope-to: lojban-list-archive@lojban.org
Delivery-date: Mon, 25 Jan 2021 21:45:49 -0800
Sender: lojban@googlegroups.com
Date: Mon, 25 Jan 2021 21:45:43 -0800 (PST)
From: "gleki.is...@gmail.com" <gleki.is.my.name@gmail.com>
To: lojban <lojban@googlegroups.com>
Message-Id: <29e01fda-fe26-4fde-959f-119dadd9d82fn@googlegroups.com>
In-Reply-To: <86h7n66pji.fsf@cmarib.ramside>
References: <86h7n66pji.fsf@cmarib.ramside>
Subject: [lojban] Re: questions about camxes PEG grammar
MIME-Version: 1.0
Content-Type: multipart/mixed; 
	boundary="----=_Part_234_517993338.1611639943856"
Reply-To: lojban@googlegroups.com
Precedence: list
Mailing-list: list lojban@googlegroups.com; contact lojban+owners@googlegroups.com
X-Spam_score: -2.6
X-Spam_score_int: -25
X-Spam_bar: --

------=_Part_234_517993338.1611639943856
Content-Type: multipart/alternative; 
	boundary="----=_Part_235_1671223889.1611639943856"

------=_Part_235_1671223889.1611639943856
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable


On Monday, 25 January 2021 at 00:35:34 UTC+3,
scope845hlang343jbo@icebubble.org wrote:

> doi la camxes .a la masatos.xagiuaras .a la gleki .a la ilmen .e zo'e
>
> I'm looking at the PEG grammar for Lojban in the repository at
> https://github.com/lojban/ilmentufa and have some questions...
>
>
> le preti xi pa pi'e pa zo'u:
>
> {za'a} The build system is written in Javascript, and doesn't use a
> traditional build system, i.e. "make". {.uanai} Why


There are also implementations in Java (the original camxes), in Python
and, iirc, in Haskell.

>
> {.e'u} I think the system would be much clearer if it used standard
> tools like make, sed, awk, and/or diff/patch. Because PEG files are
> line-oriented, awk would probably be preferable to sed. Most systems
> that have git (the format of the repository) will already have these
> tools. Providing the alternative grammars as diffs would probably be
> ideal, because diffs are easily read and understood by human beings,
> and can be applied manually, without even needing any extra software
> at all. (Any text editor would suffice.)
>
> If I actually did use the JS script to build the alternative grammars,
> the next thing I'd do would be to run a diff on them, anyway, to see
> what changes are made in each of the grammars. So, using diff & patch
> would seem to be most sensible.
>
>
> le preti xi pa pi'e re zo'u:
>
> {za'a} The README.md is rather confusing and written in broken
> English. {ru'a} Without intending any offense, it appears as if
> English is not the author's native language.
>
> {.e'u} I would be happy to discuss the README, off-list, to figure out
> what the author means, and help re-work the README to make it clearer
> and more understandable.
>
>
> le preti xi pa pi'e ci zo'u:
>
> {za'a} The .peg files contain a brief description of the syntax of
> parsing expressions:
>
>     ...
>     # 3) Concatenation is expressed by juxtaposition with no operator symbol.
>     # 4) / represents *ORDERED* alternation (choice). If the first
>     #    option succeeds, the others will never be checked.
>     # 5) ? indicates that the element to the left is optional.
>     # 6) * represents optional repetition of the construct to the left.
>     # 7) + represents one_or_more repetition of the construct to the left.
>     ...
>
> {.uanai} However, it does not specify the order of operations or
> relative precedence of each of the operators. For example, does
> "a <- b c / d" mean "a <- (b c) / d" or "a <- b (c / d)"? And
> does "x <- y / z+" mean "x <- (y / z)+" or "x <- y / (z+)"?
>
> {ja'o} One can intuit the intended order of operations by reading the
> file, with a knowledge of Lojban grammar, but {.e'u} more specificity
> would be helpful.
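
On the precedence question quoted just above: as far as I know, the
.peg files follow the usual PEG conventions, where the suffix operators
(?, *, +) bind tighter than juxtaposition, and juxtaposition binds
tighter than /. Under that assumption, "a <- b c / d" reads as
"a <- (b c) / d" and "x <- y / z+" as "x <- y / (z+)". Here is a
minimal Python sketch of the difference between the two readings. The
helper names (lit, seq, choice, plus, full) are made up for this
illustration and are not part of camxes or ilmentufa.

```python
# Tiny PEG combinators: a parser maps (text, pos) to a new pos, or None.
def lit(s):
    return lambda t, i: i + len(s) if t.startswith(s, i) else None

def seq(*ps):
    # Concatenation: every part must match, one after another.
    def run(t, i):
        for p in ps:
            i = p(t, i)
            if i is None:
                return None
        return i
    return run

def choice(*ps):
    # Ordered choice: the first alternative that succeeds wins.
    def run(t, i):
        for p in ps:
            j = p(t, i)
            if j is not None:
                return j
        return None
    return run

def plus(p):
    # One or more repetitions, greedy.
    def run(t, i):
        i = p(t, i)
        if i is None:
            return None
        while True:
            j = p(t, i)
            if j is None or j == i:
                return i
            i = j
    return run

def full(p, t):
    # True when p consumes the whole of t.
    return p(t, 0) == len(t)

b, c, d, y, z = (lit(ch) for ch in "bcdyz")

a_std = choice(seq(b, c), d)   # a <- (b c) / d   (assumed standard reading)
a_alt = seq(b, choice(c, d))   # a <- b (c / d)
x_std = choice(y, plus(z))     # x <- y / (z+)    (assumed standard reading)
x_alt = plus(choice(y, z))     # x <- (y / z)+
```

With these definitions, full(a_std, "d") and full(x_std, "zz") are True,
while full(a_alt, "d") and full(x_std, "yz") are False, so the two
readings really do accept different inputs.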
>
>
> le preti xi re pi'e pa zo'u:
>
> {za'a} I noticed that you handle elidible terminators differently in
> the PEG than in the EBNF. For example, in the EBNF, one of the
> alternatives for sumti-6_97 is:
>
>     LI # mex /LOhO#/
>
> But in the PEG, it's:
>
>     li_clause <- LI_clause free* mex LOhO_elidible free*
>
> {ja'o} In the EBNF, the free* are only allowed if the terminator is
> present. But, in the PEG, it appears that free* can follow an elided
> terminator.
>
> {.uanai} What's the rationale for doing this? How do you determine to
> which construct the free modifiers apply when the terminator is
> omitted?
>
>
> le preti xi re pi'e re zo'u:
>
> {za'a} There is at least one place in the PEG where the PEG grammar
> differs from the EBNF (and the YACC). For example:
>
>     #stag = simple_tense_modal ((jek / joik) simple_tense_modal)*
>     stag <- simple_tense_modal ((jek / joik) simple_tense_modal)* / tense_modal (joik_jek tense_modal)*
>
> {ja'o} The PEG rule which would comply with the EBNF grammar is
> commented-out, and replaced with a rule which allows full "tag"s
> (i.e., full {fi'o} clauses) anywhere that only simple tags ("stag"s)
> used to be allowed. This changes (by expanding it) the set of
> utterances that would represent grammatical Lojban text.
>
> {.uanai} What's the rationale for doing this?
>
> {.e'u} If the intent is to implement an extension to Lojban's grammar,
> shouldn't this be split-out into a separate "experimental" PEG, like
> the others?
>
>
> le preti xi re pi'e ci zo'u:
>
> {za'a} The PEG grammar, in several places, uses constructs like:
>
>     pehe_sa <- PEhE_clause (!PEhE_clause (sa_word / SA_clause !PEhE_clause))* SA_clause
>
>     cehe_sa <- CEhE_clause (!CEhE_clause (sa_word / SA_clause !CEhE_clause))* SA_clause
>
> This is an idiom which appears repeatedly in the PEG, but there is no
> explanation for what this is doing or why.


The explanation must be buried somewhere in this mriste (mailing list) or
on the wiki. It all comes out of BPFK discussions.
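
One possible reading of the quoted idiom, offered as a guess rather
than the documented rationale: {sa} erases back to the most recent word
of the same selma'o as the word that follows it, so a rule such as
pehe_sa appears to match the stretch that gets thrown away. It starts
at a PEhE clause, runs over anything that is not another PEhE clause,
and ends at a {sa} that is immediately followed by a PEhE clause. Below
is a rough token-level sketch in Python. It assumes that sa_word
matches any single word that is not an erasure word, and it flattens
every clause to a single selma'o label. The function match_x_sa and the
sample tokens are hypothetical and do not come from the grammar.

```python
ERASERS = {"SI", "SA", "SU"}   # assumption: sa_word never matches these

def match_x_sa(tokens, pos, x):
    """Sketch of: X_sa <- X (!X (sa_word / SA !X))* SA, over selma'o labels.

    Returns the position just past the closing SA, or None on failure.
    """
    def at(i):
        return tokens[i] if i < len(tokens) else None

    if at(pos) != x:                               # leading X clause
        return None
    i = pos + 1
    while at(i) is not None and at(i) != x:        # the !X guard
        if at(i) not in ERASERS:                   # sa_word
            i += 1
        elif at(i) == "SA" and at(i + 1) != x:     # SA !X
            i += 1
        else:
            break                                  # neither alternative fits
    return i + 1 if at(i) == "SA" else None        # the closing SA
```

For the tokens PEhE KOhA SA PEhE KOhA, match_x_sa(tokens, 0, "PEhE")
returns 3: the first three tokens form the stretch to discard, and
parsing would resume at the second PEhE. If no PEhE follows the {sa},
the match fails.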

>
>
> {.e'u} Some higher-level explanation of how erasure words are handled
> would be helpful.
>
>
> le preti xi re pi'e vo zo'u:
>
> {za'a} The PEG contains many non-terminals of the form
> "<someword>_clause", "<someword>_pre", "<someword>_post",
> "<someword>_sa", "pre_clause", and "post_clause", but there is no
> explanation of what this is doing or why.
>
> {.e'u} Some higher-level explanation of the conventions used for
> naming the non-terminals, and how they interact, would be helpful.
>
> .i ki'esai fa'o
>

--
You received this message because you are subscribed to the Google Groups "lojban" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lojban+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/lojban/29e01fda-fe26-4fde-959f-119dadd9d82fn%40googlegroups.com.

------=_Part_235_1671223889.1611639943856--

------=_Part_234_517993338.1611639943856--