Sender: lojban@googlegroups.com
Date: Mon, 27 Apr 2020 16:31:29 -0700 (PDT)
From: mukti <shunpiker@gmail.com>
To: lojban <lojban@googlegroups.com>
Message-Id: <de38a6eb-d41c-4275-a293-3e9a47aefd28@googlegroups.com>
In-Reply-To: <b1628c0f-f40b-4f65-90b2-594dbfee1652@googlegroups.com>
References: <86zhbyh1om.fsf@cmarib.ramside>
 <54430312-17f8-bbcc-eb95-c6f3aedfc046@gmail.com>
 <868sjeoga3.fsf@cmarib.ramside>
 <d7fc972d-696c-2e77-5e24-af41830645ef@lojban.org>
 <86k12m7ohg.fsf@cmarib.ramside>
 <33fb11ad-6aa7-47be-adc5-049d9f6670a9@googlegroups.com>
 <86o8rvbdd1.fsf@cmarib.ramside>
 <331e6b40-73bc-4597-bccd-2e7b1028cba7@googlegroups.com>
 <867dyikkxw.fsf@cmarib.ramside>
 <693d3c80-9001-0a4d-051a-dfee64f8a984@lojban.org>
 <86tv181vbz.fsf@cmarib.ramside>
 <b1628c0f-f40b-4f65-90b2-594dbfee1652@googlegroups.com>
Subject: [lojban] Re: Where is the latest/official PEG grammar?
MIME-Version: 1.0
Content-Type: multipart/mixed; 
	boundary="----=_Part_2241_1552266057.1588030289657"
Reply-To: lojban@googlegroups.com
Precedence: list
Mailing-list: list lojban@googlegroups.com; contact lojban+owners@googlegroups.com
X-Spam_score: -2.6
X-Spam_score_int: -25
X-Spam_bar: --

------=_Part_2241_1552266057.1588030289657
Content-Type: multipart/alternative; 
	boundary="----=_Part_2242_257847238.1588030289658"

------=_Part_2242_257847238.1588030289658
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

I want to address a few of scope845's questions and follow-ups.

First, the question of what is "official" is a sensitive one. There has=20
been a tendency in the lojban community to reject linguistic prescription=
=20
(making assertions about how people *should* use the language) in favor of=
=20
description: How are people actually using the language?

The YACC parser which is represented in the first edition of The Complete=
=20
Lojban Language has been formally recognized by LLG. It has not, to the=20
best of my knowledge, been actively maintained in some time, and does not=
=20
accept some more recent developments in lojban usage.

For the last 15 years or so, there has been more development activity in=20
the camxes line of parsers. This started with a Java/PEG parser by Robin=20
Lee Powell that included morphological reforms spearheaded by Jorge=20
Llambias. Ten years later, another significant step was taken when Masato=
=20
Hagiwara ported the PEG grammar to JavaScript. The best maintained=20
descendant of this parser is the Ilmentufa parser linked above. It fixes=20
bugs in the original parser and adds support for some varieties of usage.

One of the challenges in developing parsers-- and please someone correct me=
=20
if this is no longer true-- is that while camxes established a corpus of=20
texts which it expects to be acceptable, this practice hasn't always been=
=20
followed in subsequent parsers, and work to establish a parser-independent=
=20
AST for lojban has yet to be done: Even when you can verify that parsers=20
accept or reject the same texts, it's less certain that they are analyzing=
=20
those texts in the same way.
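
For what it's worth, here is a rough Python sketch of the two checks I
have in mind. It is not taken from camxes or any other existing parser.
It assumes each parser has been wrapped as a function that returns None
for rejected text and otherwise a hashable tree in some shared,
parser-independent form, which is exactly the piece that doesn't exist
yet. The names trees, split_on_acceptance and split_on_analysis are
hypothetical.

```python
def trees(parsers, text):
    """Return each parser's result for one text.

    Assumed interface: a parser is a function that returns None when it
    rejects the text, and a hashable, comparable tree when it accepts.
    """
    return [parse(text) for parse in parsers]


def split_on_acceptance(parsers, text):
    """True if some parsers accept the text and others reject it."""
    return len({tree is None for tree in trees(parsers, text)}) > 1


def split_on_analysis(parsers, text):
    """True if every parser accepts the text but the trees differ."""
    # Parses twice for brevity; a real harness would cache the results.
    return (None not in trees(parsers, text)
            and len(set(trees(parsers, text))) > 1)
```

Filtering a test corpus with split_on_acceptance finds the disagreements
that can already be detected today. split_on_analysis only becomes
meaningful once the parsers can emit comparable trees.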

Anyway, scope845, if you are interested in pushing lojban parsers forward,=
=20
there's a lot of work to be done, and I think you'll find that people=20
are receptive to help with that work.

On Saturday, April 25, 2020 at 3:33:45 AM UTC-4, Gleki Arxokuna wrote:
>
>
>
> On Saturday, April 25, 2020 at 02:11:28 UTC+3,
> scope845h...@icebubble.org wrote:
>>
>> Bob LeChevalier <loj...@lojban.org> writes:=20
>>
>> > On 4/14/2020 1:59 PM, scope845h...@icebubble.org wrote:=20
>> >> Gleki Arxokuna <gleki....@gmail.com> writes:=20
>>
>> >> None of what you have written here makes any sense to me.  What do=20
>> >> you mean?=20
>> >=20
>> > As I said in my other answer (which I seem to have been sending only=
=20
>> > to you and not to the list, so I will continue that way), the official=
=20
>>
>> Hm.  That reply of mine wasn't addressed to you; it was addressed to=20
>> Gleki Arxokuna <gleki....@gmail.com>.=20
>>
>> >> yet we still don't have a complete grammar=20
>> >=20
>> > The official YACC grammar in CLL is considered complete.=20
>>
>> I realize that the YACC is "considered" authoritative, but it is not=20
>> complete.  For starters, it requires a separate lexer.  Neither the=20
>> lexer nor the parser is usable unless you're in an environment where you=
=20
>> can run code written in C.  And, if you do get them to run, the results=
=20
>> are not correct.  Neither elidable terminators nor magic words are=20
>> handled correctly, and there is no formal specification (just narrative=
=20
>> descriptions in the CLL) for how they should work.  For example, I have=
=20
>> yet to see a parser which handles SA correctly.=20
>>
>> > I don't understand any PEG grammar; it is gobbledygook to me.=20
>>
>> PEG is fairly straightforward.  You just have to learn the operators=20
>> used in the parsing expressions, and their precedences.=20
>>
>> > If a PEG formalization cannot be easily used by a real human being to=
=20
>> > learn and use the language, more easily than the official YACC=20
>> > version, the PEG formalization is pretty much useless.=20
>>
>> > But there is little real value in a PEG grammar that is merely=20
>> > identical to the YACC specification, with no added functionality,=20
>> > which is why provable equivalence isn't important enough to bother=20
>> > with.=20
>>
>> No, no, there would be HUGE value in it!  A PEG formalization would be=
=20
>> useful because (1) it would, finally, be a complete specification of=20
>> Lojban orthography, morphology, and grammar;
>
>
> Lojban grammar can NOT be expressed via PEG. PEG is not powerful enough.=
=20
>
>> (2) it would, finally,=20
>> provide proof that Lojban is unambiguous; (3) it would be readily=20
>> portable to any computing system, using any programming language; and=20
>> (4) it would provide parse trees that could be used to implement a=20
>> variety of useful tools for processing Lojban text.=20
>>
>> Proving equivalence between the PEG and the YACC is vitally important=20
>> because (A) there should be some way to be sure that PEG-based tools are=
=20
>> designed and implemented correctly; and (B) if a PEG formulation is ever=
=20
>> adopted as the official grammar, we would want to make sure it's fully=
=20
>> compatible with the historical YACC version of the grammar.=20
>>
>> > It might be nice to have a lexer/parser that can operate based on=20
>> > an official formal grammar but not at the expense of someone being=20
>> > able to actually use the formal grammar to learn the language.=20
>>
>> > I don't even like E-BNF, which many people apparently prefer to the=20
>> > YACC grammar.=20
>>
>> The E-BNF is quite readable, although the E-BNF in the CLL has MANY=20
>> errors in it.  I find the YACC almost completely unintelligible.  I only=
=20
>> refer to the YACC when verifying or making corrections to the E-BNF.=20
>>
>> > There have been attempts to formalize the morphology as an algorithm,=
=20
>> > which my wife worked on with a couple other people.=20
>>
>> Yes, I know.  I remember talking with her about it at Logfest in 2006.=
=20
>> Now 14 years later, I still haven't figured out what Lojban's morphology=
=20
>> rules are supposed to be.  That's actually why I'm reading the PEG: to=
=20
>> figure out Lojban's morphology rules.=20
>>
>> > started playing with PEG grammars.  Again, Nora's algorithm was "good=
=20
>> > enough" in that it completely specified the rules, even if it didn't=
=20
>> > match any formalization scheme.=20
>>
>> What we have isn't good enough, because it's an incomplete specification=
=20
>> of Lojban morphology.  Aside from the PEG, there is no way to=20
>> distinguish fu'ivla from lujvo, for instance.  There are a lot of=20
>> constructs which could be classified either way, and the CLL doesn't=20
>> provide enough rules to disambiguate those cases.=20
>>
>> > In the early oughts, we started trying to formalize the morphology in=
=20
>> > a fixed algorithm, NOT in any schema such as YACC or PEG or even BNF,=
=20
>> > and we reached a more or less satisfactory conclusion, though the=20
>>
>> Where might this algorithm be documented?  (If you're referring to the=
=20
>> lujvo-making algorithm printed in the CLL, it's not complete.)=20
>>
>> > But no one was ever satisfied with any particular formalization, and=
=20
>> > it has never been a big priority.=20
>>
>> I don't understand how formalizing the morphology CAN'T be an important=
=20
>> priority; it's essential to proving the unambiguity of the language.=20
>>
>> > result was never officially approved because people were pursuing the=
=20
>> > PEG approach by then.  Nora wrote a simplistic Turbo-Pascal program to=
=20
>> > verify that algorithm matched human understanding (which is the=20
>>
>> Pascal code is not readily usable in modern computing environments, and=
=20
>> can't readily be translated into rules which ARE useful in modern=20
>> software.  Nor is it particularly readable, if one is trying to learn=20
>> (decipher) the Lojban morphology rules.=20
>>
>> > Who is waiting?  There's probably no real market for anything more=20
>> > sophisticated than we have now.  And the approval of "dotside" would=
=20
>>
>> Everyone, I think?  That's why there's so much interest in PEG=20
>> formalizations.  What we have now is a collection of toys.  What we want=
=20
>> is a collection of tools.  So far, all of our "tools" are really just=20
>> assorted collections of hacks: cobbled-together bits of software which=
=20
>> implement approximations of Lojban, each implemented for/in its own very=
=20
>> specific computing environment.=20
>>
>> BTW, thanks for your post RE: Jeff Prothero.=20
>>
>

--=20
You received this message because you are subscribed to the Google Groups "=
lojban" group.
To unsubscribe from this group and stop receiving emails from it, send an e=
mail to lojban+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/=
lojban/de38a6eb-d41c-4275-a293-3e9a47aefd28%40googlegroups.com.

------=_Part_2242_257847238.1588030289658--

------=_Part_2241_1552266057.1588030289657--