Subject: Re: [bpfk] Improvements to fragments in ilmentufa parser
From: Jorge Llambías <jjllambias@gmail.com>
To: bpfk-list@googlegroups.com
Date: Fri, 27 Mar 2015 19:21:17 -0300

On Fri, Mar 27, 2015 at 10:34 AM, Gleki Arxokuna <gleki.is.my.name@gmail.com> wrote:
> 2015-03-27 11:00 GMT+03:00 Gleki Arxokuna <gleki.is.my.name@gmail.com>:
>
>> What methods do you use or want to make the development process happen faster?
>> A web tool that would let you insert a complete PEG file, compile it, and test it online?

I used this one when debugging the morphology recently: http://pegjs.org/online

A presentation of the grammar without the JavaScript would make it much more readable for me.

Also, a grammar without SA is much more readable than one with SA. I find the SA-rules extremely annoying.

>>> bridi-tail-3 <- selbri? tail-terms / gek-sentence
>>
>> Hard for me to determine what is the cause but this breaks {mi zo'u mi mo}.
>
> Probably because it thinks that prenex is a selbri.

Adding !ZOhU-clause at the end of tail-terms might fix that:

 tail-terms <- terms? VAU-clause? free* !ZOhU-clause
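A minimal sketch of what the `!ZOhU-clause` predicate does, using toy token-level matchers rather than the real ilmentufa grammar. The `SUMTI` word list, the helper names (`word`, `optional`, `not`, `sequence`) and the omission of `free*` are all assumptions made for illustration; the point is only that a negative lookahead rejects a match without consuming input.

```javascript
// Toy PEG-style matchers over an array of word tokens.
// Each matcher takes (tokens, pos) and returns the new position, or -1 on failure.
const SUMTI = new Set(["mi", "do", "ti"]); // hypothetical stand-in for real sumti

// terms <- sumti+
function terms(tokens, pos) {
  let p = pos;
  while (p < tokens.length && SUMTI.has(tokens[p])) p++;
  return p > pos ? p : -1;
}

// word(w): match exactly one token equal to w
const word = (w) => (tokens, pos) => (tokens[pos] === w ? pos + 1 : -1);

// optional(m): PEG `m?`, never fails
const optional = (m) => (tokens, pos) => {
  const next = m(tokens, pos);
  return next === -1 ? pos : next;
};

// not(m): PEG `!m`, succeeds without consuming input only if m fails
const not = (m) => (tokens, pos) => (m(tokens, pos) === -1 ? pos : -1);

// sequence(...ms): match each matcher in turn
const sequence = (...ms) => (tokens, pos) => {
  let p = pos;
  for (const m of ms) {
    p = m(tokens, p);
    if (p === -1) return -1;
  }
  return p;
};

// tail-terms without and with the proposed lookahead (free* omitted in this toy)
const tailTermsOld = sequence(optional(terms), optional(word("vau")));
const tailTermsNew = sequence(optional(terms), optional(word("vau")), not(word("zo'u")));

const prenexSentence = ["mi", "zo'u", "mi", "mo"];
console.log(tailTermsOld(prenexSentence, 0)); // 1: "mi" is consumed as tail-terms
console.log(tailTermsNew(prenexSentence, 0)); // -1: rejected because zo'u follows
console.log(tailTermsNew(["mi", "vau"], 0)); // 2: unaffected when no zo'u follows
```

With the lookahead in place, the terms of a prenex can no longer be taken as the tail-terms of a selbri-less bridi-tail, so the parser is free to try the prenex alternative instead.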

> What are the minimal requirements to restore a bridi if not from terms or from bridi_tail? Probably it can be restored from isolated {i} or {ni'o}, but since this already works, other types of restoration should be discussed separately since they don't touch anything here.

Other than fragments, I think everything else in a text is either a sentence, a sentence connective, or the initial indicators, free modifiers, and the strange initial bare cmevla. I think only fragments require "restoration".

> 2. We could also add this if "fragment" is removed from the grammar:
>
> tanru_unit_1 = tanru_unit_2 linkargs? / linkargs? tanru_unit_2 / GOhA_elidible linkargs
>
> This makes {i be mi} parse as (i [CU {COhE <be mi BEhO>} VAU])

If you're going to do that, why put it in tanru-unit-1 and not in tanru-unit-2?

If you allow (i [CU {COhE <be mi BEhO>} VAU]), why not (i [CU {na'e <COhE>} VAU]), or (i [CU {jai <COhE>} VAU]), for example?

> However, selpa'i's examples don't work here.
>
> Should {noi mo} a). be restored into {noi mo cu co'e}

I missed the part where "noi mo cu co'e" became grammatical.

> implying {fa xi xo'e zo'e noi mo cu co'e} or b). should it instead be considered a continuation of the previous clause said by another speaker, like with selpa'i's example with {be ma}?

I would have said it restores to "zo'e noi mo cu co'e".

> Both solutions seem reasonable. Maybe take option b). and treat a discourse split between several people as one sentence with special FUhE .. FUhO markers?
>
> mi viska lo pendo FUhE [B asks] be ma [FUhO]
> mi viska lo pendo FUhE [B asks] noi mo [FUhO]
>
> A: - I see a friend.
> B: - Of whom?
>
> A: - I see a friend.
> B: - Who does what?
>
> This would reformulate fragments as parts of discourse so that we can remove them from the grammar. Of course, this would require somehow preparing existing texts by marking them with those FUhE ... FUhO so that we can parse them.

It depends on how you define "text". Is a dialogue one text, or a succession of texts? The usual take is that it's a succession of texts, since otherwise a lot of Lojban dialogues that seem to parse would not parse. For example, the IRC logs.

> I also allowed relative clauses in sumti without their heads. If fragments are removed from the grammar then similar things can be useful:
>
> sumti_4 = expr:(sumti_5 / relative_clauses / gek sumti gik sumti_4) {return _node("sumti_4", expr);}

This could be dangerous, as it makes "ta prenu poi do sisku" grammatical, but not with the expected meaning. Also things like "da poi prenu ku'o noi melbi".

> This results in {fa noi pendo mi cu melbi} (in fact it may even make {be} useless except when used stylistically).

I think it's safer to require a "lo" for bare relative clauses: "lo noi pendo cu melbi" (this was also discussed as a good alternative to "poi'i" in many cases).
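The difference between the two options can be sketched with a toy recognizer. Everything here is an assumption for illustration: the three-rule grammar in the comment, the tiny `KOHA` and `BRIVLA` word lists, and the `makeParser` helper are hypothetical and far smaller than the real grammar. The sketch only reports whether a string is accepted, not what it means.

```javascript
// Toy grammar (hypothetical, much smaller than the real one):
//   sentence  <- sumti* "cu"? selbri sumti*
//   sumti     <- KOhA relClause? / bareRel
//   relClause <- ("poi" / "noi") sentence
// bareRel is `relClause` in the permissive variant
// and `"lo" relClause` in the strict variant.
const KOHA = new Set(["ta", "do", "da", "mi"]);
const BRIVLA = new Set(["prenu", "sisku", "pendo", "melbi"]);

function makeParser(requireLo) {
  // Every rule returns the position after the match, or -1 on failure.
  function relClause(t, p) {
    if (t[p] !== "poi" && t[p] !== "noi") return -1;
    return sentence(t, p + 1);
  }
  function sumti(t, p) {
    if (KOHA.has(t[p])) {
      const r = relClause(t, p + 1); // optional attached relative clause
      return r === -1 ? p + 1 : r;
    }
    if (requireLo) return t[p] === "lo" ? relClause(t, p + 1) : -1;
    return relClause(t, p); // headless relative clause counts as a sumti
  }
  function many(rule, t, p) {
    for (let n = rule(t, p); n !== -1; n = rule(t, p)) p = n;
    return p;
  }
  function sentence(t, p) {
    p = many(sumti, t, p);
    if (t[p] === "cu") p++;
    if (!BRIVLA.has(t[p])) return -1;
    return many(sumti, t, p + 1);
  }
  // A text is accepted only if one sentence covers every token.
  return (text) => {
    const t = text.split(" ");
    return sentence(t, 0) === t.length;
  };
}

const permissive = makeParser(false);
const strict = makeParser(true);

console.log(permissive("ta prenu poi do sisku")); // true
console.log(strict("ta prenu poi do sisku")); // false
console.log(strict("lo noi pendo cu melbi")); // true
console.log(strict("ta poi do sisku cu prenu")); // true
```

In the permissive variant, "poi do sisku" is accepted as a separate trailing sumti after the selbri, which is the kind of silent reinterpretation the warning is about. The strict variant rejects that string while still accepting the explicit "lo noi pendo cu melbi" and ordinary attached relative clauses.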

>>> At some point there was talk of making the selbri of a sumti-tail elidable as well, so that "lo ku" would be a valid sumti.

> I almost never use {ku} in this sense (LE-terminator). Besides, some people think that {cu} should mark the beginning of a bridi tail. In this case I don't understand how to treat {lo cu broda}. Should it be {lo COhE KU cu broda} or {lo cu broda KU CU COhE}?

Since a bridi-tail is not part of a sumti-tail, it can only be the first.
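A small sketch of why only the first reading is available, assuming a toy grammar in which a `lo` sumti may contain a selbri but never `cu`. The `SELBRI_WORDS` list and the `restore` function are hypothetical; upper-case words in the output mark the elided material that gets restored.

```javascript
// Hypothetical toy lexicon.
const SELBRI_WORDS = new Set(["broda", "brode"]);

// Toy rules:
//   sumti    <- "lo" selbri? "ku"?   (absent selbri restored as COhE, absent ku as KU)
//   sentence <- sumti "cu"? selbri
function restore(text) {
  const t = text.split(" ");
  const out = [];
  let p = 0;
  if (t[p] !== "lo") return null;
  out.push(t[p++]);
  // The sumti-tail may only contain a selbri; "cu" can never be part of it,
  // so an immediately following "cu" forces the elided selbri and terminator here.
  out.push(SELBRI_WORDS.has(t[p]) ? t[p++] : "COhE");
  out.push(t[p] === "ku" ? t[p++] : "KU");
  if (t[p] === "cu") out.push(t[p++]);
  if (!SELBRI_WORDS.has(t[p])) return null;
  out.push(t[p++]);
  return p === t.length ? out.join(" ") : null;
}

console.log(restore("lo cu broda")); // lo COhE KU cu broda
console.log(restore("lo broda cu brode")); // lo broda KU cu brode
```

Because `cu` belongs to the sentence rule and not to the sumti rule, the restored words can only land before it.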

mu'o mi'e xorxes
