From: Gleki Arxokuna <gleki.is.my.name@gmail.com>
Date: Wed, 1 Apr 2015 13:02:43 +0300
Subject: Re: [bpfk] Improvements to fragments in ilmentufa parser
To: bpfk-list@googlegroups.com


2015-03-28 1:21 GMT+03:00 Jorge Llambías <jjllambias@gmail.com>:

> On Fri, Mar 27, 2015 at 10:34 AM, Gleki Arxokuna <gleki.is.my.name@gmail.com> wrote:
>> 2015-03-27 11:00 GMT+03:00 Gleki Arxokuna <gleki.is.my.name@gmail.com>:
>>
>>> What methods do you use, or want, to make the development process go faster?
>>> A web tool where you could paste a complete PEG file, compile it and test it online?

> I used this one when debugging the morphology recently: http://pegjs.org/online
>
> A presentation of the grammar without the JavaScript would make it much more readable for me.
>
> Also, a grammar without SA is much more readable than one with SA. I find the SA-rules extremely annoying.

>>>> bridi-tail-3 <- selbri? tail-terms / gek-sentence
>>>
>>> Hard for me to determine what is the cause but this breaks {mi zo'u mi mo}.
>>
>> Probably because it thinks that prenex is a selbri.
>
> Adding !ZOhU-clause at the end of tail-terms might fix that:
>
> tail-terms <- terms? VAU-clause? free* !ZOhU-clause
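To illustrate how the `!ZOhU-clause` lookahead could change the outcome, here is a toy sketch in JavaScript. It is only a model built on assumptions: the combinator helpers (`word`, `seq`, `choice`, `opt`, `many1`, `not`, `eof`), the tiny vocabulary, and the rule bodies inside `makeText` are made up for the example and are far simpler than the real ilmentufa grammar. It also assumes that `statement` tries `sentence` before `prenex statement`.

```javascript
// Toy PEG combinators over an array of words. A parser takes (words, pos)
// and returns the new position on success, or null on failure.
const word = (...ws) => (t, i) => (i < t.length && ws.includes(t[i]) ? i + 1 : null);
const seq = (...ps) => (t, i) => {
  for (const p of ps) {
    i = p(t, i);
    if (i === null) return null;
  }
  return i;
};
const choice = (...ps) => (t, i) => {
  for (const p of ps) {
    const r = p(t, i);
    if (r !== null) return r; // ordered choice: the first success wins
  }
  return null;
};
const opt = (p) => (t, i) => p(t, i) ?? i; // p?
const many1 = (p) => (t, i) => {           // p+
  let r = p(t, i);
  if (r === null) return null;
  while (r !== null && r > i) { i = r; r = p(t, i); }
  return i;
};
const not = (p) => (t, i) => (p(t, i) === null ? i : null); // !p, consumes nothing
const eof = (t, i) => (i === t.length ? i : null);

// Hypothetical, heavily simplified rules (NOT the real ilmentufa grammar).
function makeText(withGuard) {
  const sumti = word("mi", "do");
  const selbri = word("mo", "klama");
  const zohu = word("zo'u");
  const terms = many1(sumti);
  // tail-terms <- terms? !ZOhU   (the guard is optional so both variants can be compared)
  const tailTerms = withGuard ? seq(opt(terms), not(zohu)) : opt(terms);
  const bridiTail = seq(opt(selbri), tailTerms); // selbri made optional
  const sentence = seq(opt(terms), bridiTail);
  const prenex = seq(terms, zohu);
  // statement <- sentence / prenex statement   (recursive, hence the wrapper)
  const statement = (t, i) => choice(sentence, seq(prenex, statement))(t, i);
  return seq(statement, eof);
}

const parses = (withGuard, s) => makeText(withGuard)(s.split(" "), 0) !== null;

console.log(parses(false, "mi zo'u mi mo")); // false
console.log(parses(true, "mi zo'u mi mo"));  // true
console.log(parses(true, "mi mo do"));       // true
```

In this model, without the guard `sentence` succeeds after consuming only {mi}, the ordered choice commits to it, and the leftover {zo'u mi mo} makes the whole text fail. With the guard, `sentence` fails in front of {zo'u}, so the prenex alternative gets its turn.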

It appears that these two rules:
bridi-tail-3 <- selbri? tail-terms / gek-sentence
tail-terms <- terms? VAU-clause? free* !ZOhU-clause

make an isolated {pa} unparseable. Probably because if selbri = "" and tail-terms = "" then bridi-tail-3 matches the empty string and it goes wild.

Can you check if this is true?
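One way the empty match could cause this, sketched in JavaScript as a toy model. Everything here is an assumption made for illustration: the combinator helpers, the tiny vocabulary, the rule bodies in `makeText`, and the idea that a fragment such as a bare number is only tried after `statement` fails. The `nonEmpty` wrapper is just one possible workaround for the sketch, not something proposed in this thread.

```javascript
// Toy PEG combinators over an array of words. A parser takes (words, pos)
// and returns the new position on success, or null on failure.
const word = (...ws) => (t, i) => (i < t.length && ws.includes(t[i]) ? i + 1 : null);
const seq = (...ps) => (t, i) => {
  for (const p of ps) {
    i = p(t, i);
    if (i === null) return null;
  }
  return i;
};
const choice = (...ps) => (t, i) => {
  for (const p of ps) {
    const r = p(t, i);
    if (r !== null) return r; // ordered choice: the first success wins, even an empty one
  }
  return null;
};
const opt = (p) => (t, i) => p(t, i) ?? i; // p?
const many1 = (p) => (t, i) => {           // p+
  let r = p(t, i);
  if (r === null) return null;
  while (r !== null && r > i) { i = r; r = p(t, i); }
  return i;
};
const not = (p) => (t, i) => (p(t, i) === null ? i : null); // !p, consumes nothing
const eof = (t, i) => (i === t.length ? i : null);
// Hypothetical helper: succeed only if p consumed at least one word.
const nonEmpty = (p) => (t, i) => {
  const r = p(t, i);
  return r !== null && r > i ? r : null;
};

// Hypothetical, heavily simplified rules (NOT the real ilmentufa grammar).
function makeText(requireProgress) {
  const sumti = word("mi", "do");
  const selbri = word("mo", "klama");
  const zohu = word("zo'u");
  const number = word("pa", "re", "ci");
  const terms = many1(sumti);
  const tailTerms = seq(opt(terms), not(zohu));  // terms? !ZOhU
  const bridiTail = seq(opt(selbri), tailTerms); // every part is optional
  const sentence = seq(opt(terms), bridiTail);
  const statement = requireProgress ? nonEmpty(sentence) : sentence;
  const fragment = number;
  // text <- (statement / fragment) EOF
  return seq(choice(statement, fragment), eof);
}

const parses = (requireProgress, s) => makeText(requireProgress)(s.split(" "), 0) !== null;

console.log(parses(false, "pa"));    // false
console.log(parses(true, "pa"));     // true
console.log(parses(false, "mi mo")); // true
```

In this model `statement` succeeds on {pa} without consuming anything, the ordered choice commits to that empty match, `fragment` is never tried, and the end-of-input check fails. Forcing `statement` to consume at least one word lets the parser fall through to `fragment`.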

>> What are the minimal requirements to restore a bridi if not from terms or from bridi_tail? Probably it can be restored from an isolated {i} or {ni'o}, but since this already works, other types of restoration should be discussed separately, since they don't touch anything here.
>
> Other than fragments, I think everything else in a text is either a sentence, a sentence connective, or the initial indicators, free modifiers, and the strange initial bare cmevla. I think only fragments require "restoration".
>
>> 2. We could also add this if "fragment" is removed from the grammar:
>>
>> tanru_unit_1 = tanru_unit_2 linkargs? / linkargs? tanru_unit_2 / GOhA_elidible linkargs
>>
>> This makes {i be mi} parse as (i [CU {COhE <be mi BEhO>} VAU])
>
> If you're going to do that, why put it in tanru-unit-1 and not in tanru-unit-2?
>
> If you allow (i [CU {COhE <be mi BEhO>} VAU]), why not (i [CU {na'e <COhE>} VAU]), or (i [CU {jai <COhE>} VAU]) for example?

>> However, selpa'i's examples don't work here.
>>
>> Should {noi mo} a) be restored into {noi mo cu co'e}
>
> I missed the part where "noi mo cu co'e" became grammatical.
>
>> implying {fa xi xo'e zo'e noi mo cu co'e}, or b) should it instead be considered a continuation of the previous clause, said by another speaker, as in selpa'i's example with {be ma}?
>
> I would have said to "zo'e noi mo cu co'e"

>> Both solutions seem reasonable. Maybe take option b) and treat a discourse split between several people as one sentence with special FUhE ... FUhO markers?
>>
>> mi viska lo pendo FUhE [B asks] be ma [FUhO]
>> mi viska lo pendo FUhE [B asks] noi mo [FUhO]
>>
>> A: - I see a friend.
>> B: - Of whom?
>>
>> A: - I see a friend.
>> B: - Who does what?
>>
>> This would reformulate fragments as parts of discourse so that we can remove them from the grammar. Of course, this would require somehow preparing existing texts by marking them with those FUhE ... FUhO so that we can parse them.
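The preparation step mentioned above might look roughly like the following JavaScript sketch. It is purely hypothetical: `mergeTurns`, the `{ speaker, text }` shape of a turn, and the literal strings `FUhE` and `FUhO` (standing in for whatever marker words would actually be chosen) are assumptions made for the example.

```javascript
// Hypothetical preprocessing step: join the turns of a dialogue into one text,
// wrapping every turn after the first in FUhE ... FUhO markers.
// "FUhE" and "FUhO" are placeholders for the actual marker words.
function mergeTurns(turns) {
  return turns
    .map((turn, idx) => (idx === 0 ? turn.text : `FUhE ${turn.text} FUhO`))
    .join(" ");
}

const dialogue = [
  { speaker: "A", text: "mi viska lo pendo" },
  { speaker: "B", text: "be ma" },
];

console.log(mergeTurns(dialogue)); // mi viska lo pendo FUhE be ma FUhO
```

The merged string is what would then be handed to the parser as a single text. Whether a turn by the same speaker should also be wrapped is left open here.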

> It depends on how you define "text". Is a dialogue one text, or a succession of texts? The usual take is that it's a succession of texts, since otherwise a lot of Lojban dialogues that seem to parse would not parse. For example the IRC logs.
>
>> I also allowed relative clauses in sumti without their heads. If fragments are removed from the grammar then similar things can be useful:
>>
>> sumti_4 = expr:(sumti_5 / relative_clauses / gek sumti gik sumti_4) {return _node("sumti_4", expr);}
>
> This could be dangerous, as it makes "ta prenu poi do sisku" grammatical, but not with the expected meaning. Also things like {da poi prenu ku'o noi melbi}.
>
>> This results in {fa noi pendo mi cu melbi} (in fact it may even make {be} useless except when used stylistically).
>
> I think it's safer to require a "lo" for bare relative clauses: "lo noi pendo cu melbi" (this was also discussed as a good alternative to "poi'i" in many cases).
>
>>>> At some point there was talk of making the selbri of a sumti-tail elidable as well, so that "lo ku" would be a valid sumti.
>>
>> I almost never use {ku} in this sense (LE-terminator). Besides, some people think that {cu} should mark the beginning of a bridi tail. In this case I don't understand how to treat {lo cu broda}. Should it be {lo COhE KU cu broda} or {lo cu broda KU CU COhE}?
>
> Since a bridi-tail is not part of a sumti-tail, it can only be the first.
>
> mu'o mi'e xorxes

> --
> You received this message because you are subscribed to the Google Groups "BPFK" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to bpfk-list+unsubscribe@googlegroups.com.
> To post to this group, send email to bpfk-list@googlegroups.com.
> Visit this group at http://groups.google.com/group/bpfk-list.
> For more options, visit https://groups.google.com/d/optout.
