From: Jonathan Jones
X-Original-Sender: eyeonus@gmail.com
To: lojban@googlegroups.com
Date: Sat, 16 Oct 2010 16:04:21 -0600
Subject: Re: [lojban] Re: Questions on isolating utterances before completely parsing

On Sat, Oct 16, 2010 at 12:57 PM, .alyn.post. <alyn.post@lodockikumazvati.org> wrote:
On Sat, Oct 16, 2010 at 09:46:00AM -0700, symuyn wrote:
> In reply to Mr. Post, saving and using continuations is a very
> interesting idea, but unfortunately, I don't see how it would be
> practically usable when it comes to editing near the beginning -- or even
> middle! -- of the document. Hypothetically, if you have a long document,
> editing it even in the middle would take a long time to process for
> each re-parse.
>
> The two points that you give at the end to ameliorate continuations'
> problems are interesting but very difficult, as far as I can tell.
> Perhaps you can give some answers--
>
> Providing feedback during parsing of text downstream of the editing is
> impossible as far as I can tell -- every PEG library I know -- including the
> ones that I've written -- is a sealed black box: once you plug something
> in, you must wait until it finishes getting the result out.
>

The PEG parser I'm using requires you to write a generator for token
input, which is the first place I'd try putting continuations:

  http://gazette.call-cc.org/issues/5.html

Meaning, I'd manage my continuations on the input side, rather than
the output side, because you're right--stuff pops out fully formed.

I suspect I'd have to hack at the parser some after this to make
continuations work as I expected, but I already expected I'd be
learning and fixing this particular parser anyway.

If you can save to and seek back to the input position for a
particular parse, save the state variables of the parser, and save
the syntax tree, it really should be possible.
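
The three things listed above (input position, parser state variables, syntax tree) can be sketched in Python as a checkpoint record. This is only an illustration under loud assumptions: the "grammar" is a toy in which a sentence is any run of text ending in ".", and `Checkpoint`, `CheckpointingParser`, and `reparse_after_edit` are hypothetical names, not part of any PEG library mentioned in this thread. A real PEG parser may look ahead past a checkpoint, which this toy never does.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Checkpoint:
    pos: int        # input position where parsing can resume
    state: dict     # copy of the parser's state variables at `pos`
    tree_len: int   # number of tree nodes completed before `pos`


class CheckpointingParser:
    """Toy parser: a 'sentence' is any run of text ending in '.'."""

    def __init__(self):
        self.tree = []
        self.checkpoints = [Checkpoint(0, {"count": 0}, 0)]
        self.sentences_parsed = 0  # total work done, for illustration

    def _run(self, text, cp):
        # Restore: seek the input, reload the state, truncate the tree.
        pos, state = cp.pos, dict(cp.state)
        del self.tree[cp.tree_len:]
        self.checkpoints = [c for c in self.checkpoints if c.pos <= cp.pos]
        while pos < len(text):
            dot = text.find(".", pos)
            end = len(text) if dot == -1 else dot + 1
            state["count"] += 1
            self.tree.append((state["count"], text[pos:end].strip()))
            self.sentences_parsed += 1
            pos = end
            # Save a checkpoint after every completed sentence.
            self.checkpoints.append(Checkpoint(pos, dict(state), len(self.tree)))
        return self.tree

    def parse(self, text):
        return self._run(text, self.checkpoints[0])

    def reparse_after_edit(self, new_text, edit_offset):
        """Resume from the last checkpoint at or before the edit."""
        positions = [c.pos for c in self.checkpoints]
        i = bisect_right(positions, edit_offset) - 1
        return self._run(new_text, self.checkpoints[i])
```

Note that this only saves the work upstream of the edit; everything downstream is still re-parsed, which is the cost symuyn points out for edits near the beginning of a long document.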

While I have extensive experience with parsers, I have very little
experience with PEG parsers, so I accept that something about these
parsers may make that difficult.  I wouldn't try to do something
like this with a recursive descent parser, because it saves state on
the stack.  I might be motivated enough to convert a recursive descent
parser into a continuation passing style parser, which would allow
me to save the stack-based representation of the parse on the heap
if I needed to create a continuation.
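
To make the stack-versus-heap point concrete, here is a small Python sketch of a continuation-passing-style parser. It is an assumption-laden illustration, not the parser discussed in this thread: the grammar (sums of integers with parentheses) and the names `parse_expr`, `parse_term`, `parse_number`, `start`, and `run` are all invented for the example. Every parse function returns a zero-argument closure instead of calling deeper, so the "rest of the parse" is always an ordinary heap object that can be kept and resumed.

```python
def parse_number(text, pos, k):
    """number := digit+ ; passes (value, next_pos) to continuation k."""
    end = pos
    while end < len(text) and text[end].isdigit():
        end += 1
    if end == pos:
        raise SyntaxError(f"expected a digit at offset {pos}")
    return lambda: k(int(text[pos:end]), end)


def parse_term(text, pos, k):
    """term := number | '(' expr ')'"""
    if pos < len(text) and text[pos] == "(":
        def after_expr(value, p):
            if p >= len(text) or text[p] != ")":
                raise SyntaxError(f"expected ')' at offset {p}")
            return lambda: k(value, p + 1)
        return lambda: parse_expr(text, pos + 1, after_expr)
    return lambda: parse_number(text, pos, k)


def parse_expr(text, pos, k):
    """expr := term ('+' term)*"""
    def after_term(left, p):
        if p < len(text) and text[p] == "+":
            return lambda: parse_term(
                text, p + 1,
                lambda right, q: after_term(("+", left, right), q))
        return lambda: k(left, p)
    return lambda: parse_term(text, pos, after_term)


def start(text):
    """Initial step; the final continuation just returns (tree, pos)."""
    return lambda: parse_expr(text, 0, lambda tree, pos: (tree, pos))


def run(step, max_steps=None):
    """Trampoline: call steps until a non-callable result appears.

    With max_steps set, stop early and return the pending step, which
    is a saved continuation living on the heap, not on the call stack.
    """
    taken = 0
    while callable(step):
        if max_steps is not None and taken >= max_steps:
            return step
        step = step()
        taken += 1
    return step
```

Because the closures are never mutated, a step returned by `run(start(text), max_steps=5)` can be passed back to `run` later, and more than once, to finish the same parse.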


> Comparing parse trees and stopping re-parsing when they're
> sufficiently similar is risky, if there is no way to guarantee that
> the syntax tree is exactly the same all the way to the end *without
> re-parsing the whole thing anyway*. As far as I can tell, just because a
> new parse tree starts to look similar to the original tree, the new
> parse tree is not necessarily identical till the end. (Or is that
> actually a property of the Lojban grammar? If it is, only then should
> early stopping by comparison be used.)
>

You're correct.  Sufficiently similar is a heuristic that won't work
for all cases.  I was suggesting that this trade-off was better than
the other suggested trade-offs for solving this problem.  As you're
the one writing this thing, you get to decide which trade-off you
want to deal with.  ;-)
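
The early-stopping trade-off discussed above can be sketched in Python as follows. This is a hedged illustration only: it uses a toy grammar in which each "."-terminated sentence parses independently of everything before it, and the names `split_sentences` and `incremental_reparse` are hypothetical. The stopping rule is safe in this toy for exactly that reason; whether Lojban's grammar offers any such guarantee is the open question raised in the quoted text.

```python
def split_sentences(text, start=0):
    """Yield (start, end) spans of '.'-terminated sentences from `start`."""
    pos = start
    while pos < len(text):
        dot = text.find(".", pos)
        end = len(text) if dot == -1 else dot + 1
        yield (pos, end)
        pos = end


def incremental_reparse(old_spans, new_text, edit_start, removed, inserted):
    """Re-parse after an edit, stopping once the new parse re-joins the old.

    The edit replaced `removed` characters at `edit_start` in the old text
    with `inserted` characters. Returns (spans, sentences_reparsed).
    """
    delta = inserted - removed
    edit_end_old = edit_start + removed
    # Spans that end at or before the edit are untouched.
    kept = [s for s in old_spans if s[1] <= edit_start]
    resume = kept[-1][1] if kept else 0
    old_starts = {s[0]: i for i, s in enumerate(old_spans)}
    fresh = []
    for span in split_sentences(new_text, resume):
        fresh.append(span)
        old_pos = span[1] - delta  # same boundary in old coordinates
        if old_pos >= edit_end_old and old_pos in old_starts:
            # Heuristic stop: reuse the old tail, shifted by the edit size.
            tail = [(a + delta, b + delta)
                    for a, b in old_spans[old_starts[old_pos]:]]
            return kept + fresh + tail, len(fresh)
    return kept + fresh, len(fresh)
```

For example, inserting "ca " before "cadzu" in "mi klama. do cadzu. ti blanu. ta xunre." re-parses only the one sentence that contains the edit and reuses the spans of the two sentences after it.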

What I was thinking is that the parser would run as a separate
thread, and the parse tree in the main thread would contain in it a
marshall object at the current parse location.  Occasionally this
marshall object would receive more of the parse tree and a new
marshall object, replacing the old marshall object with the new bit
of parse tree and the place it was still working.

I wasn't thinking of the "all-or-nothing" property of PEG parsers,
as this idea does require the PEG parser to check back in with the
caller from time to time.
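
A minimal Python sketch of the separate-thread idea above, under stated assumptions: the parser is a toy that splits "."-terminated sentences, it is able to hand back each finished piece (which a sealed PEG library would not do), and the names `Pending`, `parse_worker`, and `parse_in_background` are hypothetical. `Pending` plays the role of the marshall object: it sits in the main thread's tree at the place the parser is still working.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass
class Pending:
    """Placeholder marking where the background parser is still working."""
    pos: int


def parse_worker(text, out):
    """Parse sentence by sentence, handing each one over as soon as it is done."""
    pos = 0
    while pos < len(text):
        dot = text.find(".", pos)
        end = len(text) if dot == -1 else dot + 1
        out.put((text[pos:end].strip(), end))
        pos = end
    out.put(None)  # sentinel: parsing finished


def parse_in_background(text, on_update=None):
    out = queue.Queue()
    tree = [Pending(0)]
    worker = threading.Thread(target=parse_worker, args=(text, out), daemon=True)
    worker.start()
    while True:
        item = out.get()   # an editor's UI loop would poll with get_nowait()
        tree.pop()         # drop the old placeholder
        if item is None:
            break
        node, pos = item
        # Splice in the new bit of tree plus a new placeholder.
        tree.extend([node, Pending(pos)])
        if on_update:
            on_update(list(tree))
    worker.join()
    return tree
```

Passing a callback as `on_update` lets the caller see each intermediate tree, always ending in a `Pending` marker until the parse completes.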


> However, continuations *are* indeed the only way that I can now think
> of to implement practically such a text editor *without* requiring
> LIhU, etc. to be at the end of multi-utterance texts.
>

I don't personally elide LIhU, LUhU, TOI, &c myself.

I know that Eclipse can parse python (and presumably other languages)
in its editor.  I also know that this parser is not identical to the
way that python parses itself, as my coworkers have complained to
me that Eclipse flagged a perfectly valid python construct as
invalid when in fact python did something quite useful with the same
construct.

This doesn't stop people from using Eclipse to write python, and
serves as prior art on how people cope (and programs don't cope) with
this sort of thing.

In our case we add little tokens in the code that Eclipse recognizes
but python doesn't, so Eclipse will shut up and python will work.

I am curious to hear how this project goes for you.  What platform
are you targeting?  What language are you using?  What PEG parser
are you trying?  No one has tried doing anything like this with
Lojban before, which makes me quite excited.

-Alan
--
.i ko djuno fi le do sevzi

Does it have to use PEG? .camxes. uses a RATS! parser, maybe that is
less of a "black box" and would work better for this? I'm only
guessing, as I don't know thing 2 about parsers in general.

--
mu'o mi'e .aionys.

.i.a'o.e'e ko cmima le bende pe lo pilno be denpa bu .i doi.luk. mi patfu do zo'o
(Come to the Dot Side! Luke, I am your father. :D )

--
You received this message because you are subscribed to the Google Groups "= lojban" group.
To post to this group, send email to lojban@googlegroups.com.
To unsubscribe from this group, send email to lojban+unsubscribe@googlegrou= ps.com.
For more options, visit this group at http://groups.google.com/group/lojban= ?hl=3Den.
--0022152d7fed128d7b0492c320af--