From lojban+bncCLr6ktCfBBCE6-flBBoEl7ysPQ@googlegroups.com Sat Oct 16 11:57:25 2010
Date: Sat, 16 Oct 2010 12:57:01 -0600
From: ".alyn.post." <alyn.post@lodockikumazvati.org>
To: lojban@googlegroups.com
Subject: Re: [lojban] Re: Questions on isolating utterances before completely parsing
Message-ID: <20101016185701.GB10877@alice.local>
Mail-Followup-To: lojban@googlegroups.com
References: <385d6b2f-c484-494b-9241-6d7429ce0ec3@p20g2000prf.googlegroups.com> <20101014234221.GC2916@alice.local> <d555294f-7d33-4c29-89e8-9fdf5f745b1f@e22g2000prj.googlegroups.com>
Mime-Version: 1.0
In-Reply-To: <d555294f-7d33-4c29-89e8-9fdf5f745b1f@e22g2000prj.googlegroups.com>

On Sat, Oct 16, 2010 at 09:46:00AM -0700, symuyn wrote:
> In reply to Mr. Post, saving and using continuations is a very
> interesting idea, but unfortunately, I don't see how it would be
> practically usable when it comes to editing near the beginning--or even
> middle!--of the document. Hypothetically, if you have a long document,
> editing it even in the middle would take a long time to process for
> each re-parse.
>
> The two points that you give at the end to ameliorate continuations'
> problems are interesting but very difficult, as far as I can tell.
> Perhaps you can give some answers--
>
> Providing feedback during parsing of text downstream of the editing is
> impossible as far as I can tell--every PEG library I know--including the
> ones that I've written--is a sealed black box: once you plug something
> in, you must wait until it finishes getting the result out.
>

The PEG parser I'm using requires you to write a generator for token
input, which is the first place I'd try putting continuations:

  http://gazette.call-cc.org/issues/5.html

Meaning, I'd manage my continuations on the input side, rather than
the output side, because you're right--stuff pops out fully formed.
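
To make the input-side idea concrete, here is a minimal sketch in
Python (not the parser linked above).  The class TokenSource and its
methods are hypothetical names: the generator feeds tokens to a
parser, and because it re-reads its position on every step, the
caller can rewind it to a saved position.

```python
import re

_WORD = re.compile(r"\S+")  # a "token" here is just a run of non-whitespace


class TokenSource:
    """Hypothetical seekable token feed that a parser pulls tokens from."""

    def __init__(self, text):
        self.text = text
        self.pos = 0  # character offset just past the last token handed out

    def seek(self, pos):
        # Rewind (or skip ahead) to a previously saved input position.
        self.pos = pos

    def tokens(self):
        # Generator yielding (offset, word) pairs.  It consults self.pos on
        # every iteration, so a seek() takes effect on the next token.
        while True:
            match = _WORD.search(self.text, self.pos)
            if match is None:
                return
            self.pos = match.end()
            yield match.start(), match.group()


source = TokenSource("mi klama .i do klama")
feed = source.tokens()
first = next(feed)
saved = source.pos       # remember where we are
second = next(feed)
source.seek(saved)       # go back to the saved position
replayed = next(feed)    # the same token comes out again
```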

I suspect I'd have to hack at the parser some after this to make
continuations work as I expected, but I already expected I'd be
learning and fixing this particular parser anyway.

If you can save to and seek back to the input position for a
particular parse, save the state variables of the parser, and save
the syntax tree, it really should be possible.
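
A rough sketch of that bookkeeping, in Python and under heavy
assumptions: the "parser" is a toy that only splits a token list into
utterances at ".i", and Checkpoint, run, parse, and reparse are
hypothetical names.  Each checkpoint records the input position, a
copy of the parser state, and how much of the tree existed at that
point, so a re-parse can resume from the last checkpoint before the
edit.

```python
import copy
from collections import namedtuple

# Snapshot of everything needed to resume: where we were in the input,
# the parser's state variables, and how many tree nodes were finished.
Checkpoint = namedtuple("Checkpoint", "pos state tree_len")


def run(tokens, pos, state, tree, checkpoints):
    # Toy parser loop: collect words until '.i', which closes an utterance.
    while pos < len(tokens):
        tok = tokens[pos]
        pos += 1
        if tok == ".i":
            tree.append(state["words"])
            state = {"words": []}
            checkpoints.append(Checkpoint(pos, copy.deepcopy(state), len(tree)))
        else:
            state["words"].append(tok)
    tree.append(state["words"])  # the final, unterminated utterance
    return tree, checkpoints


def parse(tokens):
    return run(tokens, 0, {"words": []}, [], [])


def reparse(tokens, edit_index, old_tree, checkpoints):
    # Tokens before edit_index are unchanged, so any checkpoint taken at or
    # before that position is still valid.
    usable = [c for c in checkpoints if c.pos <= edit_index]
    if not usable:
        return parse(tokens)
    last = usable[-1]
    return run(tokens, last.pos, copy.deepcopy(last.state),
               old_tree[:last.tree_len], usable)


tokens = "mi klama .i do klama .i ti mlatu".split()
tree, checkpoints = parse(tokens)
tokens[7] = "gerku"  # edit near the end
new_tree, new_checkpoints = reparse(tokens, 7, tree, checkpoints)
```

In this toy the saved state is always empty at a checkpoint; a real
parser would have far more to snapshot, which is where the difficulty
lies.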

While I have extensive experience with parsers, I have very little
experience with PEG parsers, so I accept that something about these
parsers may make that difficult.  I wouldn't try to do something
like this with a recursive descent parser, because it saves state on
the stack.  I might be motivated enough to convert a recursive descent
parser into a continuation passing style parser, which would allow
me to save the stack-based representation of the parse on the heap
if I needed to create a continuation.
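
For illustration, a small continuation-passing-style parser might
look like the following Python sketch.  The grammar (single digits
joined by "+") and the function names are made up for the example.
Every step returns a closure describing the rest of the parse, so the
pending work lives on the heap as an ordinary object that can be kept
and resumed later, instead of on the call stack.

```python
def parse_digit(text, pos, k):
    # On success, return a thunk that passes (value, next_pos) to the
    # continuation k; nothing further happens until the thunk is called.
    if pos < len(text) and text[pos].isdigit():
        return lambda: k(int(text[pos]), pos + 1)
    raise SyntaxError(f"digit expected at offset {pos}")


def parse_sum(text, pos, k):
    # sum := digit ('+' digit)*
    def after_digit(total, pos):
        if pos < len(text) and text[pos] == "+":
            return parse_digit(text, pos + 1,
                               lambda value, p: after_digit(total + value, p))
        return lambda: k(total, pos)
    return parse_digit(text, pos, after_digit)


def final(value, pos):
    # Last continuation: returns a plain tuple, which ends the trampoline.
    return ("done", value, pos)


def run(thunk):
    # Trampoline: keep stepping until a non-callable result comes back.
    while callable(thunk):
        thunk = thunk()
    return thunk


result = run(parse_sum("1+2+3", 0, final))

# Take one step by hand and keep the pending continuation around.
saved = parse_sum("1+2+3", 0, final)()
resumed_once = run(saved)
resumed_again = run(saved)  # the saved continuation can be resumed twice
```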


> Comparing parse trees and stopping re-parsing when they're
> sufficiently similar is risky, if there is no way to guarantee that
> the syntax tree is exactly the same all the way to the end *without re-
> parsing the whole thing anyway*. As far as I can tell, just because a
> new parse tree starts to look similar to the original tree, the new
> parse tree is not necessarily identical till the end. (Or is that
> actually a property of the Lojban grammar? If it is, only then should
> early stopping by comparison be used.)
>

You're correct.  Sufficiently similar is a heuristic that won't work
for all cases.  I was suggesting that this trade-off was better than
the other suggested trade-offs for solving this problem.  As you're
the one writing this thing, you get to decide which trade-off you
want to deal with.  ;-)

What I was thinking is that the parser would run as a separate
thread, and the parse tree in the main thread would contain in it a
marshal object at the current parse location.  Occasionally this
marshal object would receive more of the parse tree and a new
marshal object, replacing the old marshal object with the new bit
of parse tree and a marker for the place where parsing was still
under way.
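
One way this hand-off could be sketched in Python, assuming a toy
parser that splits tokens into utterances at ".i".  Pending,
parser_thread, and absorb are hypothetical names; Pending plays the
role of the placeholder object sitting at the current parse location.

```python
import queue
import threading


class Pending:
    """Hypothetical placeholder marking how far the parser has got."""

    def __init__(self, pos):
        self.pos = pos


def parser_thread(tokens, out):
    # Toy parser: sends each finished utterance, plus a fresh placeholder
    # for the position it has reached, back to the main thread.
    current = []
    for i, tok in enumerate(tokens):
        if tok == ".i":
            out.put((current, Pending(i + 1)))
            current = []
        else:
            current.append(tok)
    out.put((current, None))  # None means there is nothing left to parse


def absorb(tree, out):
    # Main-thread side: replace the trailing placeholder with the new piece
    # of tree and, if parsing continues, a new placeholder.
    node, placeholder = out.get()
    tree.pop()
    tree.append(node)
    if placeholder is not None:
        tree.append(placeholder)
    return placeholder is not None


tokens = "mi klama .i do klama .i ti mlatu".split()
out = queue.Queue()
tree = [Pending(0)]  # nothing parsed yet, only the placeholder
worker = threading.Thread(target=parser_thread, args=(tokens, out))
worker.start()
while absorb(tree, out):
    pass  # an editor could redraw the partial tree here
worker.join()
```

This only works if the parser can hand back partial results, which is
exactly what a sealed, all-or-nothing parser does not do.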

I wasn't thinking of the "all-or-nothing" property of PEG parsers,
as this idea does require the PEG parser to check back in with the
caller from time to time.


> However, continuations *are* indeed the only way that I can now think
> of to implement practically such a text editor *without* requiring
> LIhU, etc. to be at the end of multi-utterance texts.
>

I don't elide LIhU, LUhU, TOI, &c. myself.

I know that Eclipse can parse python (and presumably other languages)
in its editor.  I also know that its parser does not match the way
python parses itself: my coworkers have complained to me that Eclipse
flagged a perfectly valid python construct as invalid when in fact
python did something quite useful with the same construct.

This doesn't stop people from using Eclipse to write python, and
serves as prior art on how people cope (and programs don't cope) with
this sort of thing.

In our case we add little tokens in the code that Eclipse recognizes
but python doesn't, so Eclipse will shut up and python will work.

I am curious to hear how this project goes for you.  What platform
are you targeting?  What language are you using?  What PEG parser
are you trying?  No one has tried doing anything like this with
Lojban before, which makes me quite excited.

-Alan
-- 
.i ko djuno fi le do sevzi
