[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[lojban] Re: CFG prize challenge question



On Tue, Feb 10, 2009 at 06:11, Jorge Llambías <jjllambias@gmail.com> wrote:
> On Mon, Feb 9, 2009 at 9:42 PM, Chris Capel <pdf23ds@gmail.com> wrote:
>> On Mon, Feb 9, 2009 at 17:31, Jorge Llambías <jjllambias@gmail.com> wrote:
>>> By "blowup" you mean the number of rules increases exponentially with
>>> the number of terminators, right?
>>
>> Well, more like with the number of allowable parent-child phrase
>> relationships. E.g., you can nest a sumti in a selbri, so that's one.
>
> Do you mean things like "ME sumti MEhU" or "BRIVLA BE sumti BEhO"?
>
> I'm still not clear what a "phrase" is here.

Well, I guess you could say a phrase is a start and end token and
everything between them. So both your examples are phrases.

>> You can put a selbri in a bridi, so that's two. You can put a TO-TOI
>> in lots of places, so that's like ten more. Etc. And I don't assert
>> that the blowup is exponential--I'd sooner believe it's polynomial.
>> But either way I'm really not to sure how to go about *proving* it.
>
> The number of nestings allowed is already infinite with terminators.
> It doesn't increase when terminators can be elided.

Right, but I think the blowup might be a function of the number of
*unique* nestings allowed, which determines how complex the grammar
has to be. Many (most?) CFGs generate an infinite number of strings,
but some are more complex than others.

> But do "LE ME KOhA MEhU KU" and "LE NU BRODA KEI KU" count as the same
> phrase or different phrases? They are both "sumti = LE selbri KU". I'm
> still not sure what you are counting as a "phrase".

They're different per the definition above, as long as you're
assigning different tokens to ME and NU.

>> A selbri has internal structure, including some terminators (I
>> believe) so it would be represented as various phrase types.
>
> "various" or "infinite"? A selbri can have an infinite number of
> terminators nested inside. It doesn't have any terminator of its own.

An infinite number of different phrases, I guess.

>>> Is "GI NAI" a phrase? It is not by itself "valid" as in a
>>> valid text to be parsed, but there is a rule "gik = GI NAI".
>>
>> I don't think NAI is really a terminator, so "GI NAI" wouldn't be
>> represented in my formalism.
>
> Right, NAI is not a terminator. So a "phrase" is anything that has a
> terminator? Do "PA BOI" and "PA PA BOI" count as different phrases?

Lerfu strings are an interesting case because they don't really have a
start token. They just have a bunch of contents tokens, any one of
which can start a lerfu phrase. But I think they can still be
represented adequately with start and end tokens. So in this case, the
two examples would count as the same phrase. There doesn't have to be
a direct correspondence between valsi and tokens. A start token can
represent the start of a structure however that start is introduced.

>> The concrete syntax I've been using is as follows: "(x" is a start
>> token of type x, ")x" is a terminator token, non-elided, and "_)x" is
>> an elided terminator. So an example phrase would be
>>
>> (1 (2 (3 (4 )4 _)3 _)2 (3 )3 )1
>>
>> This phrase is invalid, because of the second 3 phrase. Since the 2
>> terminator is elided, the second 3 phrase would have to be grouped
>> under the 2 phrase after the first 3 phrase. In other words, if you
>> removed the elided terminators and reconstructed them, the 2
>> terminator would go after the second 3 phrase.
>
> How can you tell that the second 3 phrase can be grouped under the 2
> phrase? How can you tell it can't be grouped under the first 3 phrase?

Sorry. I'm assuming that 1s can be parents of 2s, 2s of 3s, 3-4, 1-3.
For each grammar you have a list of those pairs defining what's
allowed. I might have been too assuming to leave that implicit.

>> I think you can see how a unique number could be assigned to each
>> selma'o that starts a terminable phrase, and the allowed nestings
>> declared as assumptions about the mapPhPh function, to make a mapping
>> between lojban and the formalism.
>
> Not all terminators are coupled with fixed "starter" selmaho, and some
> terminators can terminate different structures. This may or may not be
> relevant, I'm still not sure I see how this formalism works.

I don't think this matters to the fidelity of the formalism. Really,
more than specific start or trm tokens in Lojban, the start and trm
tokens here correspond to grammatical types in lojban. You'd have
multiple types with the same terminator, which in the formal system
don't need the same number.

Chris Capel
-- 
"What is it like to be a bat? What is it like to bat a bee? What is it
like to be a bee being batted? What is it like to be a batted bee?"
-- The Mind's I (Hofstadter, Dennet)


To unsubscribe from this list, send mail to lojban-list-request@lojban.org
with the subject unsubscribe, or go to http://www.lojban.org/lsg2/, or if
you're really stuck, send mail to secretary@lojban.org for help.