From: "gleki.is...@gmail.com"
To: lojban
Date: Mon, 25 Jan 2021 21:45:43 -0800 (PST)
Subject: [lojban] Re: questions about camxes PEG grammar
Message-Id: <29e01fda-fe26-4fde-959f-119dadd9d82fn@googlegroups.com>
In-Reply-To: <86h7n66pji.fsf@cmarib.ramside>

On Monday, 25 January 2021 at 00:35:34 UTC+3, scope845hlang343jbo@icebubble.org wrote:
doi la camxes .a la masatos.xagiuaras .a la gleki .a la ilmen .e zo'e

I'm looking at the PEG grammar for Lojban in the repository at
https://github.com/lojban/ilmentufa and have some questions...


le preti xi pa pi'e pa zo'u:

{za'a} The build system is written in JavaScript, and doesn't use a
traditional build system, i.e. "make". {.uanai} Why

There are implementations in Java (the original camxes), in Python and, if I recall correctly, in Haskell.

{.e'u} I think the system would be much clearer if it used standard
tools like make, sed, awk, and/or diff/patch. Because PEG files are
line-oriented, awk would probably be preferable to sed. Most systems
that have git (the format of the repository) will already have these
tools. Providing the alternative grammars as diffs would probably be
ideal, because diffs are easily read and understood by human beings,
and can be applied manually, without even needing any extra software
at all. (Any text editor would suffice.)

If I actually did use the JS script to build the alternative grammars,
the next thing I'd do would be to run a diff on them, anyway, to see
what changes are made in each of the grammars. So, using diff & patch
would seem to be most sensible.
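
A minimal sketch of the diff idea, assuming two grammar variants are available as plain text. The two tiny grammars and the file names `base.peg` and `variant.peg` are hypothetical and are not taken from the ilmentufa repository. It uses Python's standard `difflib` rather than the `diff` command, purely for illustration:

```python
import difflib


def grammar_diff(base: str, variant: str,
                 base_name: str = "base.peg",
                 variant_name: str = "variant.peg") -> str:
    """Return a unified diff between two grammar texts (names are hypothetical)."""
    lines = difflib.unified_diff(
        base.splitlines(keepends=True),
        variant.splitlines(keepends=True),
        fromfile=base_name,
        tofile=variant_name,
    )
    return "".join(lines)


# Hypothetical two-rule grammars, used only to show the shape of the output.
BASE = "a <- b c\nb <- 'x'\n"
VARIANT = "a <- b c / d\nb <- 'x'\n"

if __name__ == "__main__":
    print(grammar_diff(BASE, VARIANT), end="")
```

Because PEG rules tend to sit one per line, a line-based diff like this would show each changed rule as one removed line and one added line.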


le preti xi pa pi'e re zo'u:

{za'a} The README.md is rather confusing and written in broken
English. {ru'a} Without intending any offense, it appears as if
English is not the author's native language.

{.e'u} I would be happy to discuss the README, off-list, to figure out
what the author means, and help re-work the README to make it clearer
and more understandable.


le preti xi pa pi'e ci zo'u:

{za'a} The .peg files contain a brief description of the syntax of
parsing expressions:

...
# 3) Concatenation is expressed by juxtaposition with no operator symbol.
# 4) / represents *ORDERED* alternation (choice). If the first
# option succeeds, the others will never be checked.
# 5) ? indicates that the element to the left is optional.
# 6) * represents optional repetition of the construct to the left.
# 7) + represents one_or_more repetition of the construct to the left.
...

{.uanai} However, it does not specify the order of operations or
relative precedence of each of the operators. For example, does
"a <- b c / d" mean "a <- (b c) / d" or "a <- b (c / d)"? And
does "x <- y / z+" mean "x <- (y / z)+" or "x <- y / (z+)"?

{ja'o} One can intuit the intended order of operations by reading the
file, with a knowledge of Lojban grammar, but {.e'u} more specificity
would be helpful.
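
For reference, in the usual PEG notation (as defined by Bryan Ford, and assuming the .peg files follow it) the suffix operators `?`, `*` and `+` bind tighter than concatenation, and concatenation binds tighter than `/`. Under that assumption, "a <- b c / d" reads as "(b c) / d" and "x <- y / z+" reads as "y / (z+)". The sketch below is not camxes code. It uses a few hypothetical hand-written combinators (`lit`, `seq`, `choice`, `plus`) over single-letter terminals to show that the two readings accept different inputs:

```python
from typing import Callable, Optional

# A parser takes (text, position) and returns the new position, or None on failure.
Parser = Callable[[str, int], Optional[int]]


def lit(s: str) -> Parser:
    """Match the literal string s."""
    def p(text: str, pos: int) -> Optional[int]:
        return pos + len(s) if text.startswith(s, pos) else None
    return p


def seq(*parsers: Parser) -> Parser:
    """Concatenation: every parser must match, one after another."""
    def p(text: str, pos: int) -> Optional[int]:
        for q in parsers:
            pos = q(text, pos)
            if pos is None:
                return None
        return pos
    return p


def choice(*parsers: Parser) -> Parser:
    """Ordered choice: the first alternative that matches wins."""
    def p(text: str, pos: int) -> Optional[int]:
        for q in parsers:
            result = q(text, pos)
            if result is not None:
                return result
        return None
    return p


def plus(q: Parser) -> Parser:
    """One-or-more repetition, greedy, with no backtracking."""
    def p(text: str, pos: int) -> Optional[int]:
        result = q(text, pos)
        if result is None:
            return None
        while True:
            nxt = q(text, result)
            if nxt is None or nxt == result:
                return result
            result = nxt
    return p


def full(p: Parser, text: str) -> bool:
    """True if p consumes the whole text."""
    return p(text, 0) == len(text)


b, c, d, y, z = (lit(ch) for ch in "bcdyz")

a_std = choice(seq(b, c), d)   # (b c) / d   <- the conventional reading
a_alt = seq(b, choice(c, d))   # b (c / d)
x_std = choice(y, plus(z))     # y / (z+)    <- the conventional reading
x_alt = plus(choice(y, z))     # (y / z)+

if __name__ == "__main__":
    print(full(a_std, "d"), full(a_alt, "d"))     # the readings disagree on "d"
    print(full(a_std, "bd"), full(a_alt, "bd"))   # and on "bd"
    print(full(x_std, "yz"), full(x_alt, "yz"))   # and on "yz"
```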


le preti xi re pi'e pa zo'u:

{za'a} I noticed that you handle elidible terminators differently in
the PEG than in the EBNF. For example, in the EBNF, one of the
alternatives for sumti-6_97 is:

LI # mex /LOhO#/

But in the PEG, it's:

li_clause <- LI_clause free* mex LOhO_elidible free*

{ja'o} In the EBNF, the free* are only allowed if the terminator is
present. But, in the PEG, it appears that free* can follow an elided
terminator.

{.uanai} What's the rationale for doing this? How do you determine to
which construct the free modifiers apply when the terminator is
omitted?
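
To make the observation concrete, here is a rough token-level sketch of the quoted PEG rule. It rests on several assumptions: that `LOhO_elidible` amounts to an optional `LOhO_clause`, that a mex can be stood in for by a single number word, and that free modifiers can be stood in for by a couple of attitudinals. None of this is taken from the camxes source. It only shows that, read this way, the trailing free* is attempted whether or not the terminator is present:

```python
from typing import Dict, List, Optional, Tuple

FREE = {"ui", "ua"}            # stand-ins for free modifiers (assumption)
NUMBERS = {"pa", "re", "ci"}   # stand-in for a full mex (assumption)


def many_free(tokens: List[str], pos: int) -> Tuple[List[str], int]:
    """free* : greedily consume free modifiers; never fails."""
    start = pos
    while pos < len(tokens) and tokens[pos] in FREE:
        pos += 1
    return tokens[start:pos], pos


def li_clause(tokens: List[str], pos: int = 0) -> Optional[Tuple[Dict, int]]:
    """Simplified: li_clause <- LI free* mex LOhO? free*"""
    if pos >= len(tokens) or tokens[pos] != "li":
        return None
    pos += 1
    frees_after_li, pos = many_free(tokens, pos)
    if pos >= len(tokens) or tokens[pos] not in NUMBERS:
        return None
    mex = tokens[pos]
    pos += 1
    # Elidible terminator: consume lo'o if present, otherwise carry on.
    has_terminator = pos < len(tokens) and tokens[pos] == "lo'o"
    if has_terminator:
        pos += 1
    # The trailing free* runs in both cases.
    trailing, pos = many_free(tokens, pos)
    node = {
        "frees_after_li": frees_after_li,
        "mex": mex,
        "terminator": has_terminator,
        "trailing_frees": trailing,
    }
    return node, pos


if __name__ == "__main__":
    print(li_clause(["li", "pa", "lo'o", "ui"]))
    print(li_clause(["li", "pa", "ui"]))
```

In the real grammar the mex itself may already consume free modifiers, in which case the greedy inner rule would take them first; this sketch does not model that.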


le preti xi re pi'e re zo'u:

{za'a} There is at least one place in the PEG where the PEG grammar
differs from the EBNF (and the YACC). For example:

#stag = simple_tense_modal ((jek / joik) simple_tense_modal)*
stag <- simple_tense_modal ((jek / joik) simple_tense_modal)* / tense_modal (joik_jek tense_modal)*

{ja'o} The PEG rule which would comply with the EBNF grammar is
commented-out, and replaced with a rule which allows full "tag"s
(i.e., full {fi'o} clauses) anywhere that only simple tags ("stag"s)
used to be allowed. This changes (by expanding it) the set of
utterances that would represent grammatical Lojban text.

{.uanai} What's the rationale for doing this?

{.e'u} If the intent is to implement an extension to Lojban's grammar,
shouldn't this be split-out into a separate "experimental" PEG, like
the others?


le preti xi re pi'e ci zo'u:

{za'a} The PEG grammar, in several places, uses constructs like:

pehe_sa <- PEhE_clause (!PEhE_clause (sa_word / SA_clause !PEhE_clause))* SA_clause

cehe_sa <- CEhE_clause (!CEhE_clause (sa_word / SA_clause !CEhE_clause))* SA_clause

This is an idiom which appears repeatedly in the PEG, but there is no
explanation for what this is doing or why.
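
Read literally, the idiom matches a clause of some selma'o X, then any run of material that does not start another X clause, and finally an SA clause. Inside that run, an SA is only consumed when it is not directly followed by an X clause. The token-level sketch below mirrors that shape and nothing more. It assumes that `sa_word` can be approximated as "any single token that is not SA" and it represents each word by its selma'o name. Both are simplifications, not the camxes definitions:

```python
from typing import List, Optional


def x_sa(tokens: List[str], pos: int, x: str) -> Optional[int]:
    """Simplified: X_sa <- X (!X (sa_word / SA !X))* SA

    Returns the position just after the match, or None if there is no match.
    """
    if pos >= len(tokens) or tokens[pos] != x:
        return None
    pos += 1
    # Greedy repetition with no backtracking, as in PEG.
    while pos < len(tokens):
        if tokens[pos] == x:            # !X fails: stop repeating
            break
        if tokens[pos] != "SA":         # sa_word (assumed: any non-SA token)
            pos += 1
        elif pos + 1 >= len(tokens) or tokens[pos + 1] != x:
            pos += 1                    # SA that is not followed by X
        else:
            break                       # SA followed by X: leave it for the final SA
    # Final SA, which by construction is followed by an X clause.
    if pos < len(tokens) and tokens[pos] == "SA":
        return pos + 1
    return None


if __name__ == "__main__":
    # "PEhE A SA" is matched as one unit, leaving the second PEhE unconsumed.
    print(x_sa(["PEhE", "A", "SA", "PEhE"], 0, "PEhE"))
    # An SA followed by some other selma'o does not complete the match.
    print(x_sa(["PEhE", "A", "SA", "B"], 0, "PEhE"))
```

Under these assumptions the rule only succeeds when an SA is immediately followed by a new X clause, and the match cannot reach back past a more recent X clause. That fits the usual description of {sa} as erasing back to the most recent word of the same selma'o as the word that follows it. What the grammar does with the matched span is not shown in the quoted rules.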



The explanation must be buried in this mriste (mailing list) or the wiki. It's all BPFK discussions.


{.e'u} Some higher-level explanation of how erasure words are handled
would be helpful.


le preti xi re pi'e vo zo'u:

{za'a} The PEG contains many non-terminals of the form
"<someword>_clause", "<someword>_pre", "<someword>_post",
"<someword>_sa", "pre_clause", and "post_clause", but there is no
explanation of what this is doing or why.

{.e'u} Some higher-level explanation of the conventions used for
naming the non-terminals, and how they interact, would be helpful.

.i ki'esai fa'o
