From: Gleki Arxokuna
Date: Thu, 12 Feb 2015 09:10:49 +0300
Subject: Re: [lojban] the myth of monoparsing
To: "lojban@googlegroups.com"
Reply-To: lojban@googlegroups.com
In-Reply-To: <50d5006f-f02b-4a28-9894-6608729585fc@googlegroups.com>
References: <20150204124517.GA1243@kuebelreiter.informatik.Uni-Osnabrueck.DE>
 <50d5006f-f02b-4a28-9894-6608729585fc@googlegroups.com>
Mailing-list: list lojban@googlegroups.com; contact lojban+owners@googlegroups.com
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=001a11c269587672bb050eddfcef

--001a11c269587672bb050eddfcef
Content-Type: text/plain; charset=UTF-8

2015-02-12 1:20 GMT+03:00 ianek:

> On Wednesday, February 11, 2015 at 1:50:49 PM UTC+1, la gleki wrote:
>> 2015-02-09 23:22 GMT+03:00 ianek:
>>> On Monday, February 9, 2015 at 11:54:41 AM UTC+1, la gleki wrote:
>>>> 2015-02-08 4:34 GMT+03:00 ianek:
>>>>> On Friday, February 6, 2015 at 8:13:30 AM UTC+1, la gleki wrote:
>>>>>> 2015-02-04 15:45 GMT+03:00 v4hn:
>>>>>>> On Tue, Feb 03, 2015 at 11:42:32AM +0300, Gleki Arxokuna wrote:
>>>>>>> > "Fred saw a plane flying over Zurich" can have several meanings
>>>>>>>
>>>>>>> Yes.
>>>>>>> However, for me, the issue here is that we (hopefully..) agree
>>>>>>> that there are different parse trees (which yield the different
>>>>>>> meanings).
>>>>>>
>>>>>> No, several trees arise after you interpret the sentence.
>>>>>
>>>>> But if you had an English parser, it would yield several trees without
>>>>> any interpreting.
>>>>
>>>> Sure! Because English parsers lack the ability to find something common
>>>> in all of the parse trees.
>>>
>>> No. It's because words in an English sentence can be parsed as different
>>> syntactic structures. That's what parsing means: determining structures
>>> formed by words. Not "finding something common".
>>
>> You yourself just showed several parses of the same sentence.
>> This is how usual English parsers are constructed.
>>
>> However, there is another option to monoparse this English sentence.
>>
>> You mix English language and one current theory of how to parse it.
>>
>>>>> Like this:
>>>>>
>>>>> "Fred saw a plane flying over Zurich"
>>>>> NAME VERB-PAST ARTICLE COUNTABLE-NOUN VERB-ING PREPOSITION NAME
>>>>>
>>>>> Some (much simplified) rules could be:
>>>>>
>>>>> Sentence ::= Noun-Phrase Verb Noun-Phrase
>>>>> Sentence ::= Noun-Phrase Verb Noun-Phrase Adverbial-Phrase
>>>>> Noun-Phrase ::= NAME | ARTICLE COUNTABLE-NOUN
>>>>>               | Noun-Phrase VERB-ING Prepositional-Clause
>>>>> Verb ::= VERB-PAST
>>>>> Adverbial-Phrase ::= VERB-ING Prepositional-Clause
>>>>> Prepositional-Clause ::= PREPOSITION Noun-Phrase
>>>>>
>>>>> This simple grammar yields two parse trees for that sentence:
>>>>>
>>>>> Sentence
>>>>> ----Noun-Phrase
>>>>> --------NAME
>>>>> ------------Fred
>>>>> ----Verb
>>>>> --------VERB-PAST
>>>>> ------------saw
>>>>> ----Noun-Phrase
>>>>> --------Noun-Phrase
>>>>> ------------ARTICLE
>>>>> ----------------a
>>>>> ------------COUNTABLE-NOUN
>>>>> ----------------plane
>>>>> --------VERB-ING
>>>>> ------------flying
>>>>> --------Prepositional-Clause
>>>>> ------------PREPOSITION
>>>>> ----------------over
>>>>> ------------Noun-Phrase
>>>>> ----------------NAME
>>>>> --------------------Zurich
>>>>>
>>>>> Sentence
>>>>> ----Noun-Phrase
>>>>> --------NAME
>>>>> ------------Fred
>>>>> ----Verb
>>>>> --------VERB-PAST
>>>>> ------------saw
>>>>> ----Noun-Phrase
>>>>> --------ARTICLE
>>>>> ------------a
>>>>> --------COUNTABLE-NOUN
>>>>> ------------plane
>>>>> ----Adverbial-Phrase
>>>>> --------VERB-ING
>>>>> ------------flying
>>>>> --------Prepositional-Clause
>>>>> ------------PREPOSITION
>>>>> ----------------over
>>>>> ------------Noun-Phrase
>>>>> ----------------NAME
>>>>> --------------------Zurich
>>>>>
>>>>> Formal grammars for natural languages do exist, although they're not
>>>>> perfect, but the problem with multiple grammatically sensible parses
>>>>> (often millions of trees and more) is much greater than the problem
>>>>> with nonsensical trees or correct sentences that don't parse at all.
>>>>>
>>>>> Lojban was carefully designed to avoid this problem. And it doesn't
>>>>> have anything to do with {xi PA}. The Lojban grammar specifies XI
>>>>> clauses unambiguously. Parse trees are unique. Monoparsing is not a
>>>>> myth. XI clauses may add semantic ambiguity on a different level than,
>>>>> say, simple {zo'e}, but it doesn't have anything to do with syntactic
>>>>> ambiguity.
>>>>
>>>> It specifies to which head a clause should attach. And since it's {mo'e
>>>> zo'e} it's vague to which head it attaches. If the parser you use
>>>> doesn't allow for that, the only thing that can be done is to provide
>>>> several possible trees.
>>>
>>> It's a feature of a language, not a parser. If English had a pronoun,
>>> say, 'lar', which would mean 'the subject or the object of the main
>>> sentence', you could say "Fred saw a plane as lar flew over Zurich",
>>> which would be ambiguous semantically, but not syntactically.
>>
>> Even in current English theory there are a lot of zero morphemes. What
>> I'm proposing is just another zero morpheme.
>>
>> This is what And agreed with me on.
>>
>>>>> {la fred pu viska lo vinji do'e lo se xi vei mo'e zo'e nei poi vofli
>>>>> ga'u la tsurix} has only one syntax tree, regardless of the number of
>>>>> possible semantic interpretations.
>>>>
>>>> If you applied {mo'e zo'e} to the English sentence, you would still get
>>>> only one syntax tree.
>>>
>>> You can't "apply" {mo'e zo'e} to the English sentence, because it's not
>>> there. Likewise you don't "apply" {mo'e zo'e} to the Lojban sentence.
>>> You just parse it, because it's there.
>>> In English you can have phrases like 'X of Y of Z' which could be parsed
>>> as '(X of Y) of Z' or 'X of (Y of Z)'. In Lojban it's not possible, but
>>> you can say "either (X of Y) of Z or X of (Y of Z)", which is not
>>> syntactically ambiguous. You can't apply "either... or" to the English
>>> sentence, because you can't parse words which aren't there.
>>
>> As I just said, English parsers use this "add words that aren't there"
>> all the time.
>
> I was searching, but I haven't found any English parser (but I know a
> Polish one). What parsers do you refer to?

Probably most, since this concept (of adding words and morphemes of zero
length) is present in most modern theories:
https://en.wikipedia.org/wiki/Zero_(linguistics)

>>>>> In English you can have sentences that are semantically ambiguous due
>>>>> to syntactic ambiguity. In Lojban you can have sentences with
>>>>> (roughly) the same semantic ambiguity as the English ones, but
>>>>> syntactically unambiguous.
>>>>>
>>>>>>> > {la fred pu viska lo vinji do'e lo se xi vei mo'e zo'e nei poi
>>>>>>> > vofli ga'u la tsurix}
>>>>>>>
>>>>>>> camxes only produces one parse tree for that.
>>>>>>
>>>>>> And for English you don't provide any parses at all.
>>>>>> Maybe someone should just parse the original English sentence as
>>>>>> camxes does for the Lojban one?
>>>>>> I won't be surprised if such a parser for English doesn't exist,
>>>>>> since those who write them might mix parsing with interpretation. The
>>>>>> latter would be replacing {mo'e zo'e} with some PA, which will
>>>>>> immediately lead to several syntactic trees.
>>>>>>
>>>>>> So I both disagree and agree with you on whether the English sentence
>>>>>> has several syntactic trees. If we stop using one term for two
>>>>>> operations, the contradiction disappears.
>>>>>>
>>>>>>> If you think it should produce more than one, raise a bug report.
>>>>>>
>>>>>> I'm not aware of any Lojban parsers that perform the interpretation
>>>>>> operation. In most cases you just need context and one
>>>>>> interpretation. But this is semantic analysis. Producing all possible
>>>>>> syntactic trees is a task that is needed less often.
>>>>>
>>>>> Camxes is intended to produce all possible syntactic trees, and
>>>>> there's only one of them for any valid sentence.
>>>>
>>>> You may invent a Lojban parser that won't be able to parse {mo'e zo'e}.
>>>> Then you will need workarounds to output several trees.
>>>
>>> XI clauses have an unambiguous syntax, so I don't see how I'd need
>>> workarounds and several trees. Of course, I could invent a Lojban parser
>>> that won't be able to parse anything, but what's the point? {mo'e zo'e}
>>> from the parser's view is just MOhE KOhA. If I can't parse it, then I
>>> have an incomplete parser.
>>
>> And this is what I state for English: its current parsers are incomplete
>> and further improvements will make polyparsed sentences monoparsed.
>>
>>> What you mean sounds rather like a semantic analyzer, which is extremely
>>> hard for any language, including Lojban.
>>>
>>> mu'o mi'e ianek
>>>
>>>>> mu'o mi'e ianek
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "lojban" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to lojban+un...@googlegroups.com.
>>>>> To post to this group, send email to loj...@googlegroups.com.
>>>>> Visit this group at http://groups.google.com/group/lojban.
>>>>> For more options, visit https://groups.google.com/d/optout.

--001a11c269587672bb050eddfcef
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable


--001a11c269587672bb050eddfcef--
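
The simplified English grammar quoted in this thread is small enough to enumerate exhaustively, which may help make the "several trees without any interpreting" point concrete. What follows is only a minimal sketch under stated assumptions: the part-of-speech tags are taken as given rather than computed, "Preposition-Clause" and "Prepositional-Clause" are treated as one symbol, and the names `GRAMMAR`, `parses`, `sequences` and `show` are hypothetical helpers invented for illustration. It is not how camxes or any real English parser works.

```python
from functools import lru_cache

# Toy grammar from the thread: nonterminal -> alternative right-hand sides.
# Any symbol that is not a key here is treated as a terminal tag.
GRAMMAR = {
    "Sentence": [
        ("Noun-Phrase", "Verb", "Noun-Phrase"),
        ("Noun-Phrase", "Verb", "Noun-Phrase", "Adverbial-Phrase"),
    ],
    "Noun-Phrase": [
        ("NAME",),
        ("ARTICLE", "COUNTABLE-NOUN"),
        ("Noun-Phrase", "VERB-ING", "Prepositional-Clause"),
    ],
    "Verb": [("VERB-PAST",)],
    "Adverbial-Phrase": [("VERB-ING", "Prepositional-Clause")],
    "Prepositional-Clause": [("PREPOSITION", "Noun-Phrase")],
}

WORDS = "Fred saw a plane flying over Zurich".split()
# Assumption: tagging is done by hand, exactly as in the quoted example.
TAGS = ("NAME", "VERB-PAST", "ARTICLE", "COUNTABLE-NOUN",
        "VERB-ING", "PREPOSITION", "NAME")


@lru_cache(maxsize=None)
def parses(symbol, start, end):
    """Return a tuple of every tree deriving TAGS[start:end] from symbol."""
    if symbol not in GRAMMAR:
        # Terminal tag: it must cover exactly one matching token.
        if end - start == 1 and TAGS[start] == symbol:
            return ((symbol, WORDS[start]),)
        return ()
    trees = []
    for rhs in GRAMMAR[symbol]:
        for children in sequences(rhs, start, end):
            trees.append((symbol,) + children)
    return tuple(trees)


def sequences(rhs, start, end):
    """Yield tuples of child trees for the symbols in rhs over [start, end)."""
    if not rhs:
        if start == end:
            yield ()
        return
    head, rest = rhs[0], rhs[1:]
    # Every symbol consumes at least one token, so the head always gets a
    # strictly shorter span; this keeps the left-recursive rule finite.
    for mid in range(start + 1, end - len(rest) + 1):
        for first in parses(head, start, mid):
            for tail in sequences(rest, mid, end):
                yield (first,) + tail


def show(tree, depth=0):
    """Render a tree as lines in the dashed-indent style used in the thread."""
    label, *children = tree
    lines = ["----" * depth + label]
    for child in children:
        if isinstance(child, str):
            lines.append("----" * (depth + 1) + child)
        else:
            lines.extend(show(child, depth + 1))
    return lines


if __name__ == "__main__":
    trees = parses("Sentence", 0, len(TAGS))
    print(len(trees), "parse trees")
    for tree in trees:
        print("\n".join(show(tree)), end="\n\n")
```

Under these assumptions the enumeration finds exactly two trees for the tagged sentence: one where "flying over Zurich" sits inside the object noun phrase and one where it is an adverbial phrase of the sentence. Nothing in the code inspects meaning; the two results come purely from the grammar rules, which is the sense of "syntactic ambiguity" used in the discussion.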