From nobody@digitalkingdom.org Tue Jul 11 14:34:35 2006
Received: with ECARTIS (v1.0.0; list lojban-beginners); Tue, 11 Jul 2006 14:34:35 -0700 (PDT)
Received: from nobody by chain.digitalkingdom.org with local (Exim 4.62)	(envelope-from <nobody@digitalkingdom.org>)	id 1G0PsI-00024W-No	for lojban-beginners-real@lojban.org; Tue, 11 Jul 2006 14:34:34 -0700
Received: from ug-out-1314.google.com ([66.249.92.173])	by chain.digitalkingdom.org with esmtp (Exim 4.62)	(envelope-from <jonored@gmail.com>)	id 1G0PsF-00024M-Kn	for lojban-beginners@lojban.org; Tue, 11 Jul 2006 14:34:34 -0700
Received: by ug-out-1314.google.com with SMTP id j40so22218ugd        for <lojban-beginners@lojban.org>; Tue, 11 Jul 2006 14:34:30 -0700 (PDT)
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws;        s=beta; d=gmail.com;        h=received:message-id:date:from:to:subject:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references;        b=b0u3JRnST+YfMOseAmf/8zTuqQsr1bPUOGjRW1/j3mWw/GDs2CJByZJxO/jEVhFrvkTG+n2FB4mIp+1FFLmHzn3B7Gu0cZmoZs7UXdlaQ5kS8r1zINdNLtW+ZrCC6wK+HpiFZdTvMN4uU6zCtz3loxMLjDCQmgcPzGsjRNiZPvU=
Received: by 10.66.220.17 with SMTP id s17mr1500073ugg;        Tue, 11 Jul 2006 14:34:30 -0700 (PDT)
Received: by 10.67.30.12 with HTTP; Tue, 11 Jul 2006 14:34:30 -0700 (PDT)
Message-ID: <e202d93c0607111434y586469b1n6f265c0e6132bdfe@mail.gmail.com>
Date: Tue, 11 Jul 2006 17:34:30 -0400
From: "Jonathan Gibbons" <jonored@gmail.com>
To: lojban-beginners@lojban.org
Subject: [lojban-beginners] Re: Enumerating in Lojban
In-Reply-To: <20060711200739.GK10845@chain.digitalkingdom.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
References: <1684503175.20060710193640@mail.ru>	 <925d17560607100826x2a37ffcfi69c9964cabf0b53@mail.gmail.com>	 <537d06d00607100919v70febc62u93929e72b0041c48@mail.gmail.com>	 <20060710164123.GS3440@chain.digitalkingdom.org>	 <e202d93c0607101027w88e0fa5p858d0694a6375a6b@mail.gmail.com>	 <20060710173540.GV3440@chain.digitalkingdom.org>	 <e202d93c0607101841r2f925d26ve483782380a9ab2e@mail.gmail.com>	 <20060711052439.GC10845@chain.digitalkingdom.org>	 <e202d93c0607111135h90d1cb2t4023fdad74bc7b00@mail.gmail.com>	 <20060711200739.GK10845@chain.digitalkingdom.org>
X-Spam-Score: -2.2 (--)
X-archive-position: 3414
X-ecartis-version: Ecartis v1.0.0
Sender: lojban-beginners-bounce@lojban.org
Errors-to: lojban-beginners-bounce@lojban.org
X-original-sender: jonored@gmail.com
Precedence: bulk
Reply-to: lojban-beginners@lojban.org
X-list: lojban-beginners

> It's split pretty evenly.  In this case, both versions are valid,
> but there are other cases where that's not true.

It would be remarkably helpful if you could provide an example of a
sentence and non-sentence for which the absence or presence of an
elidable terminator means that the string is or is not in Lojban; it
would also be the core of a proof by counterexample of Lojban being
not context-free; a string for with the terminator's inclusion of
elision means that there must not exist or must exist a derivation
from the grammar of that string. I have yet to see a string where it
seems sensible to exclude it from the language on the basis of a
missing terminator.

> If your incredibly vague statements above are referring to PEG, it
> is definately not Turing complete, because there are knows languages
> that it can't express.  The PEG literature gives examples.

Whether or not it has the capability to recognize any language /on
it's normal input/ does not actually determine whether or not it is
turing-complete; it would be sufficient if there exists some way of
arranging start conditions (input and grammar, I would take it, unless
there are more things defining it's behavior) such that a language and
an input are represented in some fashion on the input, and determine
some form of success or failure. Alternately, the ability to emulate
in some fashion any of a rather large proportion of other formalisms
would be quite sufficient.

> I believe that Lojban is not expressible as a BNF, IOW, is not a
> CFG.  I don't have the ability to formalize a proof.  I have,
> however, seen *no* evidence to the contrary.  If you can produce a
> BNF that handles even, say, 5 of Lojban's terminators correctly,
> then I'll have evidence.  Good lick.
>

It would be very nice to know what "The Right Thing" actually means,
given that everything I've read defining it seems to point towards the
use of elidable terminators as overrides for some default grouping,
rather than as elements that are required based on context for some
strings to be in the language. If that is so, then I'll go ahead and
define a related language that /is/ context-free instead of
artificially non-context-free, with as much overlap with Lojban as I
can, and will carry on with that new language.

> > I have also been working on writing a transformation code to go
> > from the EBNF that bnf.300 uses to one that fits bison's input
> > format, with elidable terminators as optional elements.
>
> It's not going to work; I've already tried it.

Now I'm curious; what algorithm was it using? optional elements
certainly would cause shift/reduce and reduce/reduce conflicts all
over the place with bison's default algorithm, which is an LALR
algorithm. But then, if you were trying to cause it to fail some of
the time when something was elided and thereby caused one of the
not-allowed ambiguities then of course it would.

> Precedence and grouping rules might fix it, but then that's outside
> of the BNF formalism, which means it's not a formal specification
> anymore, so why bother?  There are already 2 ad hoc Lojban parsers;
> I don't see that we need another one.

Precedence and grouping cannot change the set of strings in a
language, only the interpretation of said strings. They tend to be
cleaner at doing it than trying to rewrite the grammar directly to be
completely unambiguous, as well. As such, the precedence and grouping
cannot change the answer of "yes, that is lojban" or "no, that's not
lojban."

> > then "le nu broda ku brode" and "le nu broda kei brode" are
> > grammatical, but "le nu broda brode", eliding both terminators, is
> > not.  But if we rewrote the elidable terminators as optional
> > elements, then "le nu broda brode" would be grammatical and
> > ambiguous.
>

I've read this at least three or four times already; given that
grammar and a general context-free language parser, rather than an
LALR(1) parser, it would find the derivation that splits "le nu broda
brode" into "(le nu broda) (brode)" by the start symbol, and then
"((LE) (NU broda)) (brode)" and so on, becuase with the requirement of
a sumti and a selbri, that statement is /not/ ambiguous. If it were,
then you could use a precedence and grouping rule to decide which
derivation is intended, and therefore disambiguate it, the same way
that "1+2*3" is disambiguated in many programming languages. Using
precedence rules to disambiguate does not change whether a language is
context-free or not, because "context-free" is a statement about how
one may determine whether or not a string is in a language or not, not
about how one interprets any given string; it just is a way of
describing a particular derivation as the intended derivation.

> Not to claim that this can be done in general (maybe it can, I don't
> know), I would like to point out that this particular example can
> easily be fixed by re-iterating the 'selbri' rule in sumti, with
> minor modifications:
>
>         start = sumti selbri
>         sumti = LE tanru /KU/
>                 | LE NU tanru /KEI/ /KU/
>                 | LE NU tanru /KU/
>                 | LE NU tanru /KEI/
>         selbri = tanru | NU tanru /KEI/
>         tanru = BRIVLA | tanru BRIVLA
>

That certainly does remove the (unambiguous given the previous
grammar) statement "le nu broda brode", if the elements in "/"s are
now required. Although it does look like there would need to be an
option under selbri for "NU tanru" by itself, to accept all of the
above statements. If, on the other hand, they are optional, then it
accepts the same language as the above. If they're a hack, then
they're still a hack.

That said, I suspect this discussion may be outside the scope of the
beginners' list; would it be more polite to swap to a less
spam-everyone forum?

 -Jonathan