
[lojban] Re: experimental cmavo in lojgloss.





On Thu, Nov 6, 2008 at 1:30 AM, Chris Capel <pdf23ds@gmail.com> wrote:
On Wed, Nov 5, 2008 at 18:07, Daniel Brockman <daniel@gointeractive.se> wrote:
>> > The obvious way to implement {lo'ai .. sa'ai .. le'ai} in a parser is to
>> > just treat it as a self-contained construct that requires
>> > morphologically
>> > correct Lojban inside it, just like {lo'u .. le'u}, and syntactically
>> > correct Lojban before it (just like everything else).
>>
>> How far before it? Up to the beginning of the sentence? The statement?
>
> The {le'ai} construct doesn't care about ANYTHING else.  However your parser
> works, that's how it works before {le'ai}.

I don't understand. You're saying that if there's a lo'ai then
everything before it in the text should get only a syntactical parse,
not a grammatical parse? If not, there has to be some cutoff.

Syntax and grammar are one and the same thing to me, so I don't understand the distinction.
 
>> > Of course it would require extraordinary methods to get things like
>> > {kwama
>> > lo'ai kwama sa'ai klama le'ai} --- or why not {fsen.45ynl5tnerg98ehg4n
>> > su
>> > coi} --- to parse.  It's not practical and not cost-efficient.  The
>> > {kjama}
>> > example falls in this category because {kj} is morphologically invalid.
>>
>> Hmm. I think you overestimate the difference in effort between the two
>> implementations. They both require the same tricks, just at a slightly
>> different level in the grammar.
>
> What are you talking about?  One implementation is self-contained; the other
> requires lots of weird backtracking and re-parsing and weird, weird stuff.

No, both require backtracking (but not reparsing, since this is a
packrat parser) and lots of lookahead that's usually wasted (but
hopefully fast). You have to check every sentence (or whatever) for
lo'ai before the main grammar parse, whether you do it before or after
the morph parse. If you want to see how that's implemented, take a
look at SA. Now, SA has a lot more complicated grammar, so lo'ai would
be easier to implement even using the same technique. (And contrary to
Jorge, I'm not too sure it would introduce any weird interactions with
the SA machinery.)

I'm still not getting through.  We are talking about two different things.
 
> It doesn't matter if it has the same parse tree.  It only matters that it
> PARSES IN ANY WAY.  If it does, then the parser will be able to continue.
> If it doesn't, then the parser will die.

I'm more concerned about interactive parsing where parse errors aren't
a huge deal, especially because you get detailed and helpful error
information, much, much better than jbofi'e, to help you find the
problem.

I think perhaps a better (simple) way to handle lo'ai is to treat it
similarly to a plain old lo'u ... le'u quote. Still have it behave like a
UI, but only morph-parse the words until the le'ai. In fact, I imagine
a number of experimental cmavo that create new selmaho could be
handled cursorily as quotes of this kind. It's not ideal, but it
allows a non-expert user to modify the parser with configuration to
handle text using these cmavo better than before.

Yes, this is what I've been trying to say.  Thank you.  Just handle it like a parenthetical expression.
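
To make the "handle it like a quote" idea concrete, here is a rough sketch in Python. It is not taken from any existing Lojban parser: the function name collapse_quotes, the ("QUOTE", ...) tuple shape, and the assumption that the input is already split into whitespace-separated words are all made up for illustration. It only collapses each {lo'ai .. le'ai} span into one opaque item; the morphological check on the inner words is left out.

```python
def collapse_quotes(words, opener="lo'ai", closer="le'ai"):
    """Collapse each opener..closer span into one opaque ("QUOTE", inner) item.

    Hypothetical helper: `words` is assumed to be a list of already-split
    words. The inner words are kept as-is and never reach the main grammar.
    """
    out = []
    i = 0
    while i < len(words):
        if words[i] == opener:
            try:
                # Find the first closer after the opener.
                end = words.index(closer, i + 1)
            except ValueError:
                raise ValueError(f"unterminated {opener}") from None
            out.append(("QUOTE", words[i + 1:end]))
            i = end + 1  # Skip past the closer.
        else:
            out.append(words[i])
            i += 1
    return out


if __name__ == "__main__":
    text = "kwama lo'ai kwama sa'ai klama le'ai"
    print(collapse_quotes(text.split()))
```

The main grammar would then see the collapsed item as a single token, so nothing before or after it needs special treatment.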

The more complicated implementation that actually replaces at parse time is another discussion (which I've been trying to avoid in order to keep this simple, but by all means continue if it is interesting to you).

I'm not even sure I'd want my parser to erase and replace stuff.  I consider an erasure or a replacement to be an additional utterance that is often best understood as such.  It would even be interesting to make a parser that could parse through errors and resynchronize later (e.g., when {.i} is encountered), and things like that.
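
As a rough illustration of the resynchronizing idea (again only a sketch, not how any real parser does it): split the input at {.i}, parse each piece separately, and record a failure instead of giving up. The names parse_with_resync and toy_parse are invented, the words are assumed to be whitespace-separated, and toy_parse is a stand-in that simply rejects any word containing a non-letter.

```python
def toy_parse(chunk):
    """Stand-in sentence parser (hypothetical): rejects words with non-letters."""
    for w in chunk:
        if not w.replace("'", "").replace(".", "").isalpha():
            raise ValueError(f"cannot parse {w!r}")
    return list(chunk)


def parse_with_resync(words, parse_sentence, separator=".i"):
    """Parse each separator-delimited chunk on its own.

    A chunk that fails is recorded as ("error", message) and parsing
    resumes at the next separator.
    """
    results = []
    chunk = []
    # A trailing separator flushes the final chunk.
    for w in list(words) + [separator]:
        if w == separator:
            if chunk:
                try:
                    results.append(("ok", parse_sentence(chunk)))
                except ValueError as err:
                    results.append(("error", str(err)))
                chunk = []
        else:
            chunk.append(w)
    return results


if __name__ == "__main__":
    text = "coi .i fsen.45 su .i mi klama"
    for status, value in parse_with_resync(text.split(), toy_parse):
        print(status, value)
```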

Anyway, I'm in over my head.  I'm not a parser expert.

--
Daniel Brockman
daniel@brockman.se