[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[lojban] Re: CFG prize challenge question



On Mon, Feb 9, 2009 at 17:31, Jorge Llambías <jjllambias@gmail.com> wrote:
> On Sun, Feb 8, 2009 at 6:36 PM, Chris Capel <pdf23ds@gmail.com> wrote:
>> I'm currently trying to prove that a CFG can't parse grammars with
>> elidable terminators without a blowup in productions.
>
> By "blowup" you mean the number of rules increases exponentially with
> the number of terminators, right?

Well, more like with the number of allowable parent-child phrase
relationships. E.g., you can nest a sumti in a selbri, so that's one.
You can put a selbri in a bridi, so that's two. You can put a TO-TOI
in lots of places, so that's like ten more. Etc. And I don't assert
that the blowup is exponential--I'd sooner believe it's polynomial.
But either way I'm really not to sure how to go about *proving* it.

>> The set of valid phrases to be parsed, P, is a set of lists of tokens.
>
> First question: Do I understand correctly that a phrase is something
> like "KOhA BRODA LE BRODA KU VAU"?

Well, I'm thinking more abstractly. I don't think KOhA has a
terminator, so I'm just assuming that kind of word is irrelevant. And
I'm representing all terminators, so I think you'd need "BRODA LE
BRODA VAU KU VAU". But that's basically it.

> The number of valid (and finite) such phrases with non-elidable
> terminators is countably infinite, and so is the number with elidable
> terminators. There is no blowup there. It is the number of rules,
> things like "sumti = LE selbri KU", which is finite, that probably
> blows up.

That's the idea.

> Or is a phrase something like "LE selbri KU", where "selbri" is one of
> its tokens?

A selbri has internal structure, including some terminators (I
believe) so it would be represented as various phrase types.

> Is "GI NAI" a phrase? It is not by itself "valid" as in a
> valid text to be parsed, but there is a rule "gik = GI NAI".

I don't think NAI is really a terminator, so "GI NAI" wouldn't be
represented in my formalism.

>> There are two types of tokens: start and terminator. Each token is
>> paired with a number identifying the type of phrase it starts or
>> terminates.
>
> What are the types of phrases? Can you give some examples of tokens,
> some examples of phrases, and some examples of types of phrases?

The concrete syntax I've been using is as follows: "(x" is a start
token of type x, ")x" is a terminator token, non-elided, and "_)x" is
an elided terminator. So an example phrase would be

(1 (2 (3 (4 )4 _)3 _)2 (3 )3 )1

This phrase is invalid, because of the second 3 phrase. Since the 2
terminator is elided, the second 3 phrase would have to be grouped
under the 2 phrase after the first 3 phrase. In other words, if you
removed the elided terminators and reconstructed them, the 2
terminator would go after the second 3 phrase.

I think you can see how a unique number could be assigned to each
selma'o that starts a terminable phrase, and the allowed nestings
declared as assumptions about the mapPhPh function, to make a mapping
between lojban and the formalism.

Chris Capel
-- 
"What is it like to be a bat? What is it like to bat a bee? What is it
like to be a bee being batted? What is it like to be a batted bee?"
-- The Mind's I (Hofstadter, Dennet)


To unsubscribe from this list, send mail to lojban-list-request@lojban.org
with the subject unsubscribe, or go to http://www.lojban.org/lsg2/, or if
you're really stuck, send mail to secretary@lojban.org for help.