MIME-Version: 1.0
In-Reply-To: <5416B55B.9030302@gmx.de>
References: <5415B8C0.4030003@gmx.de> <5416B55B.9030302@gmx.de>
Date: Mon, 15 Sep 2014 09:56:43 -0300
Subject: Re: [lojban] The White Knight (Through the Looking Glass)
From: Jorge Llambías
To: lojban@googlegroups.com
X-Original-Sender: jjllambias@gmail.com
Reply-To: lojban@googlegroups.com
Precedence: list
Mailing-list: list lojban@googlegroups.com; contact lojban+owners@googlegroups.com
X-Google-Group-Id: 1004133512417
Content-Type: multipart/alternative; boundary=089e011847dea99dc705031a2ae3

--089e011847dea99dc705031a2ae3
Content-Type: text/plain; charset=UTF-8

On Mon, Sep 15, 2014 at 6:46 AM, selpa'i wrote:
>
> Right. I think I first had {... go nai klaku gi co'e li'u}, but while that
> maintains grammaticality it doesn't really correspond to being interrupted
> mid-speech (or just stopping etc). Just like you I don't know what the best
> general solution would be; I feel like there should be a way to have an
> ungrammatical chunk inside grammatical text without making the entire text
> parsefail. I don't know how one would parse human speech otherwise, which
> is going to be full of incomplete sentences.

Right, human speech is obviously not parsed as a single chunk. The Lojban
parser is somewhat unnatural in that sense.

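To make the chunk-by-chunk idea concrete, here is a minimal sketch. It is not how any existing Lojban parser works. It assumes the text can be split on ".i" before parsing, and it uses a deliberately tiny toy grammar (sentence := sumti selbri [sumti]) with a four-word lexicon of each kind. The names ChunkResult, parse_sumti, parse_chunk and parse_text are made up for the illustration.

```python
from dataclasses import dataclass

# Toy lexicon: an assumption for illustration, a tiny subset of Lojban.
PRO_SUMTI = {"mi", "do", "ri", "ra"}
SELBRI = {"cmene", "jimpe", "sinxa", "klama"}


@dataclass
class ChunkResult:
    parsed: list    # longest prefix the toy grammar accepts
    leftover: list  # remaining tokens, kept but not interpreted

    @property
    def interrupted(self):
        return bool(self.leftover)


def parse_sumti(tokens, pos):
    """sumti := PRO_SUMTI | 'lo' SELBRI. Return the new position, or None."""
    if pos < len(tokens) and tokens[pos] in PRO_SUMTI:
        return pos + 1
    if pos + 1 < len(tokens) and tokens[pos] == "lo" and tokens[pos + 1] in SELBRI:
        return pos + 2
    return None


def parse_chunk(tokens):
    """sentence := sumti SELBRI [sumti]; anything after that is leftover."""
    pos = parse_sumti(tokens, 0)
    if pos is None or pos >= len(tokens) or tokens[pos] not in SELBRI:
        # Not even a minimal sentence: nothing is parsed.
        return ChunkResult(parsed=[], leftover=list(tokens))
    pos += 1
    after = parse_sumti(tokens, pos)
    if after is not None:
        pos = after
    return ChunkResult(parsed=tokens[:pos], leftover=tokens[pos:])


def parse_text(text):
    """Split on the sentence separator '.i' and parse each chunk on its own."""
    results = []
    chunk = []
    for token in text.split():
        if token == ".i":
            if chunk:
                results.append(parse_chunk(chunk))
            chunk = []
        else:
            chunk.append(token)
    if chunk:
        results.append(parse_chunk(chunk))
    return results


if __name__ == "__main__":
    # The first chunk breaks off after "lo"; the later chunks still parse.
    for result in parse_text("mi klama lo .i do jimpe .i ri cmene lo sinxa"):
        print(result.parsed, result.leftover, result.interrupted)
```

The point of the sketch is only that a failure inside one chunk stays local: the interrupted chunk keeps its grammatical prefix, and the chunks after it are parsed as usual. Splitting on ".i" in advance is the big simplification here, since in the real grammar ".i" is itself handled by the grammar.
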
> One thing that could help is to add a lot more productions to the fragment
> rule of the grammar. Another solution I pondered involved giving EOF some
> magic powers so that it can make incomplete sentences parse up to the
> failure part and just treat the remainder as some sort of meaningless
> left-over. What's important is that the grammatical part of such a sentence
> still gets parsed properly.
>

It wouldn't be EOF though, because we still want to keep parsing what comes
after the incomplete sentence. The PEG can be modified so as to allow
incomplete sentences, but it means adding a lot of rules.

> I'm not sure about the equally likely case where the mistake happens in the
> middle of a sentence. Perhaps an external statistical analyser would have
> to guess what was meant and make corrections accordingly...
>

I would say mistakes are different from interruptions, so they probably
require different treatments.

> lo'u-le'u doesn't satisfy me, because 1) it requires you to know in advance
> that a sentence or text will be ungrammatical (and it's an ugly give-away
> in a written story), and 2) because text in error quotes does not get
> parsed, so there is no way to extract meaning from what is said.

Right.

>> The second comment is about "ba'e" in:
>>
>> -.i ri cmene lo selsa'a ku xu
>> - na go'i .i do na jimpe .i ra ba'e cmene lo cmene
>>
>> Assuming the emphasis marks the rheme/comment as opposed to the
>> theme/topic, I would expect the "ba'e" on the second cmene. The first
>> cmene just repeats Alice's sentence, so it's not what the White Knight
>> is correcting. I understand the sentence structure is somewhat different
>> in Lojban than in the original, but the "ba'e" there just sounds off to
>> me.
>>
>
> I know exactly what you mean. When I read the Lojban I had the same
> feeling, so I went over to the English and found that it was "backwards" as
> well. {ba'e} on the second {cmene} definitely feels better, I just wasn't
> sure if I should make a "correction" to the original or if it fit the
> general weirdness in Alice.

I don't think the original has the same problem, because in the English you
have "the name" and "is called", and the White Knight's "the name" does
repeat Alice's "the name", and "is called" is the new information. The
problem with the Lojban is that there's two "cmene", and the one that is new
information comes first and in the same position as Alice's "cmene". If it
was a different word, say:

-.i ri cmene lo selsa'a ku xu
- na go'i .i do na jimpe .i ra ba'e sinxa lo cmene

or:

-.i ri cmene lo selsa'a ku xu
- na go'i .i do na jimpe .i ra lo cmene cu ba'e sinxa

then it would be easier to follow, because it would be more clear that "lo
cmene" is "lo cmene be lo selsa'a". (I'm not saying it would be a better
translation though.)

Also the way you have "xu" questioning "lo selsa'a" may add to the
garden-pathing. I think "vau xu" would correspond more closely to the
original.

mu'o mi'e xorxes

--
You received this message because you are subscribed to the Google Groups "lojban" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lojban+unsubscribe@googlegroups.com.
To post to this group, send email to lojban@googlegroups.com.
Visit this group at http://groups.google.com/group/lojban.
For more options, visit https://groups.google.com/d/optout.

--089e011847dea99dc705031a2ae3
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

--089e011847dea99dc705031a2ae3--