From: mukti
To: lojban
Date: Mon, 27 Apr 2020 16:31:29 -0700 (PDT)
Subject: [lojban] Re: Where is the latest/official PEG grammar?

I want to address a few of scope845's questions and follow-ups.

First, the question of what is "official" is a sensitive one. There has been a tendency in the lojban community to reject linguistic prescription (making assertions about how people *should* use the language) in favor of description: How are people actually using the language?

The YACC parser which is represented in the first edition of The Complete Lojban Language has been formally recognized by LLG. It has not, to the best of my knowledge, been actively maintained in some time, and does not accept some more recent developments in lojban usage.

For the last 15 years or so, there has been more development activity in the camxes line of parsers. This started with a Java/PEG parser by Robin Lee Powell that included morphological reforms spearheaded by Jorge Llambias. Ten years later, another significant step was taken when Masato Hagiwara ported the PEG grammar to JavaScript. The best maintained descendant of this parser is the Ilmentufa parser linked above. It fixes bugs in the original parser and adds support for some varieties of usage.

One of the challenges in developing parsers -- and please someone correct me if this is no longer true -- is that while camxes established a corpus of texts which it expects to be acceptable, this practice hasn't always been followed in subsequent parsers, and work to establish a parser-independent AST for lojban has yet to be done: Even when you can verify that parsers accept or reject the same texts, it's less certain that they are analyzing those texts in the same way.
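
To make that concrete, here is a minimal sketch of a comparison harness. It assumes each parser can be wrapped as a function that returns a parse tree, or None when it rejects a text. The names `compare_parsers`, `toy_flat` and `toy_nested` are made up for illustration; the two toy functions are stand-ins, not real lojban parsers, and comparing trees with `==` presumes exactly the shared AST that has yet to be defined.

```python
from typing import Callable, Dict, Iterable, List, Optional

# Assumption: a "parser" is any callable returning a tree, or None on rejection.
Parser = Callable[[str], Optional[object]]


def compare_parsers(corpus: Iterable[str], parser_a: Parser,
                    parser_b: Parser) -> Dict[str, List[str]]:
    """Sort each corpus text by how the two parsers treat it."""
    report: Dict[str, List[str]] = {
        "same_tree": [],        # both accept, identical analysis
        "different_tree": [],   # both accept, analyses differ
        "both_reject": [],
        "accept_mismatch": [],  # one accepts, the other rejects
    }
    for text in corpus:
        tree_a, tree_b = parser_a(text), parser_b(text)
        if tree_a is None and tree_b is None:
            report["both_reject"].append(text)
        elif tree_a is None or tree_b is None:
            report["accept_mismatch"].append(text)
        elif tree_a == tree_b:
            report["same_tree"].append(text)
        else:
            report["different_tree"].append(text)
    return report


def toy_flat(text: str):
    """Toy stand-in: a flat tuple of words; rejects empty input."""
    words = text.split()
    return tuple(words) if words else None


def toy_nested(text: str):
    """Toy stand-in: nests long inputs and (arbitrarily) rejects digits."""
    words = text.split()
    if not words or any(w.isdigit() for w in words):
        return None
    if len(words) < 3:
        return tuple(words)
    return (words[0], tuple(words[1:]))


report = compare_parsers(["mi klama", "mi klama do", "", "mi 2"],
                         toy_flat, toy_nested)
print(report["different_tree"])  # ['mi klama do']
```

The "different_tree" bucket is the interesting one: both parsers accept the text, so an accept/reject corpus alone would never surface the disagreement.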

Anyway, scope845, if you are interested in pushing lojban parsers forward, there's a lot of work to be done, and I think that you'll find people are receptive to help doing that work.

On Saturday, April 25, 2020 at 3:33:45 AM UTC-4, Gleki Arxokuna wrote:


On Saturday, April 25, 2020 at 02:11:28 UTC+3, scope845h...@icebubble.org wrote:
Bob LeChevalier <loj...@lojban.org> writes:

> On 4/14/2020 1:59 PM, scope845h...@icebubble.org wrote:
>> Gleki Arxokuna <gleki....@gmail.com> writes:

>> None of what you have written here makes any sense to me. What do you
>> mean?
>
> As I said in my other answer (which I seem to have been sending only
> to you and not to the list, so I will continue that way), the official

Hm. That reply of mine wasn't addressed to you, it was addressed to Gleki
Arxokuna <gleki....@gmail.com>.

>> yet we still don't have a complete grammar
>
> The official YACC grammar in CLL is considered complete.

I realize that the YACC is "considered" authoritative, but it is not
complete. For starters, it requires a separate lexer. Neither the
lexer nor the parser is usable unless you're in an environment where you
can run code written in C. And, if you do get them to run, the results
are not correct. Neither elidable terminators nor magic words are
handled correctly, and there is no formal specification (just narrative
descriptions in the CLL) for how they should work. For example, I have
yet to see a parser which handles SA correctly.

> I don't understand any PEG grammar; it is gobbledygook to me.

PEG is fairly straightforward. You just have to learn the operators
used in the parsing expressions, and their precedences.
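
As a rough illustration of those operators (a sketch only, not taken from any Lojban grammar): the hypothetical helpers `lit`, `seq` and `choice` below stand for PEG literals, sequence, and ordered choice "/". Sequence binds tighter than choice, and a choice commits to the first alternative that matches.

```python
def lit(s):
    """Literal: match the string s at pos; return the new position or None."""
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse


def seq(*parsers):
    """Sequence: every part must match, one after the other."""
    def parse(text, pos):
        for p in parsers:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parse


def choice(*parsers):
    """Ordered choice "/": the first alternative that matches wins,
    and later alternatives are never retried afterwards."""
    def parse(text, pos):
        for p in parsers:
            end = p(text, pos)
            if end is not None:
                return end
        return None
    return parse


def matches(p, text):
    """True if p consumes the whole text."""
    return p(text, 0) == len(text)


# ("a" / "ab") "c" : "a" wins on "abc", then "c" fails against "b".
first_wins = seq(choice(lit("a"), lit("ab")), lit("c"))
# ("ab" / "a") "c" : putting the longer alternative first fixes it.
reordered = seq(choice(lit("ab"), lit("a")), lit("c"))
# "a" / "b" "c" is read as "a" / ("b" "c"), never ("a" / "b") "c".
precedence = choice(lit("a"), seq(lit("b"), lit("c")))

print(matches(first_wins, "abc"))  # False
print(matches(reordered, "abc"))   # True
print(matches(precedence, "ac"))   # False
```

The order of alternatives matters in a way it does not in YACC or BNF, which is the main thing to keep in mind when reading a PEG.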

> If a PEG formalization cannot be easily used by a real human being to
> learn and use the language, more easily than the official YACC
> version, the PEG formalization is pretty much useless.

> But there is little real value in a PEG grammar that is merely
> identical to the YACC specification, with no added functionality,
> which is why provable equivalence isn't important enough to bother
> with.

No, no, there would be HUGE value in it! A PEG formalization would be
useful because (1) it would, finally, be a complete specification of
Lojban orthography, morphology, and grammar;

[Gleki Arxokuna replies:] Lojban grammar can NOT be expressed via PEG. PEG is not powerful enough.
(2) it would, finally,
provide proof that Lojban is unambiguous; (3) it would be readily
portable to any computing system, using any programming language; and
(4) it would provide parse trees that could be used to implement a
variety of useful tools for processing Lojban text.

Proving equivalence between the PEG and the YACC is vitally important
because (A) there should be some way to be sure that PEG-based tools are
designed and implemented correctly; and (B) if a PEG formulation is ever
adopted as the official grammar, we would want to make sure it's fully
compatible with the historical YACC version of the grammar.

> It might be nice to have a lexer/parser that can operate based on
> an official formal grammar but not at the expense of someone being
> able to actually use the formal grammar to learn the language.

> I don't even like E-BNF, which many people apparently prefer to the
> YACC grammar.

The E-BNF is quite readable, although the E-BNF in the CLL has MANY
errors in it. I find the YACC almost completely unintelligible. I only
refer to the YACC when verifying or making corrections to the E-BNF.

> There have been attempts to formalize the morphology as an algorithm,
> which my wife worked on with a couple other people.

Yes, I know. I remember talking with her about it at Logfest in 2006.
Now 14 years later, I still haven't figured out what Lojban's morphology
rules are supposed to be. That's actually why I'm reading the PEG: to
figure out Lojban's morphology rules.

> started playing with PEG grammars. Again, Nora's algorithm was "good
> enough" in that it completely specified the rules, even if it didn't
> match any formalization scheme.

What we have isn't good enough, because it's an incomplete specification
of Lojban morphology. Aside from the PEG, there is no way to distinguish
fu'ivla from lujvo, for instance. There are a lot of constructs which
could be classified either way, and the CLL doesn't provide enough rules
to disambiguate those cases.

> In the early oughts, we started trying to formalize the morphology in
> a fixed algorithm, NOT in any schema such as YACC or PEG or even BNF,
> and we reached a more or less satisfactory conclusion, though the

Where might this algorithm be documented? (If you're referring to the
lujvo-making algorithm printed in the CLL, it's not complete.)

> But no one was ever satisfied with any particular formalization, and
> it has never been a big priority.

I don't understand how formalizing the morphology CAN'T be an important
priority; it's essential to proving the unambiguity of the language.

> result was never officially approved because people were pursuing the
> PEG approach by then. Nora wrote a simplistic Turbo-Pascal program to
> verify that algorithm matched human understanding (which is the

Pascal code is not readily usable in modern computing environments, and
can't readily be translated into rules which ARE useful in modern
software. Nor is it particularly readable, if one is trying to learn
(decipher) the Lojban morphology rules.

> Who is waiting? There's probably no real market for anything more
> sophisticated than we have now. And the approval of "dotside" would

Everyone, I think? That's why there's so much interest in PEG
formalizations. What we have now is a collection of toys. What we want
is a collection of tools. So far, all of our "tools" are really just
assorted collections of hacks: cobbled-together bits of software which
implement approximations of Lojban, each implemented for/in its own very
specific computing environment.

BTW, thanks for your post RE: Jeff Prothero.
