From: And Rosta
To: lojban@googlegroups.com
Date: Wed, 4 Feb 2015 15:45:00 +0000
Subject: Re: [lojban] Re: [Llg-members] nu ningau so'u se jbovlaste / updating a few jbovlaste entries

Jorge Llambías, On 21/01/2015 16:54:
> On Wed, Jan 21, 2015 at 12:08 PM, And Rosta wrote:
> > Jorge Llambías, On 21/01/2015 12:33:
> > > On Tue, Jan 20, 2015 at 8:35 PM, And Rosta wrote:
> > > > Step (3') yields something like Tersmu output, probably augmented by some purely syntactic (i.e. without logical import) structure. I think that can and should be done without reference to the formal grammars.
> > >
> > > But Tersmu output is basically FOPL, which has its own formal grammar (on which Lojban's formal grammar is based). I still don't see what problems formal grammars create.
> >
> > (3') must certainly involve a grammar, and I can't think of any sense in which a grammar could meaningfully be called 'informal', so I'm happy to call that grammar 'formal'.
> > But it differs from the CS (or at least the Lojban) notion primarily in not having phonological objects as any of its nodes and secondarily in not necessarily being simply a labelled bracketing of a string.
>
> I don't understand your primary objection because the syntactic tree generated by the Lojban formal grammars doesn't rely on its terminal nodes being phonological objects. The terminal nodes of the syntax part of the grammar are the selma'o. It just happens that these can be mapped in a trivial way to the output of the morphology, but that's not important. One could implement a completely different morphology and mount the same Lojban syntax on that. The only requirement for the syntax is that each syntactic word be a member of one of the selma'o.

My primary objection is not so much the phonologicality of the terminal nodes as their nonsyntacticality: if they were syntactic then they would contain logical structure, and ellipsed elements.

> The secondary objection I accept, but that's why I had (4), to complement the labelled bracketing generated by (3). That's what Martin's Tersmu is meant to do, because as I understand it it doesn't start from scratch with just a string of syntactic words, it starts from the output of (3).

Well, I've already said that even tho the 'Formal Grammar' must be discarded, it can still be recycled into the actual grammar. Building the actual grammar simply by bolting together the Formal Grammar and Tersmu isn't going to resemble anything whose innards resemble human language, but at least it would be functionally equivalent to a human language syntax module.

> > To the extent that Lojban is a language, (3) doesn't really constitute any part of Lojban (despite the mistaken belief of many Lojbanists to the contrary). Also, to the extent that Lojban is a language, there exists an implicit version of (3'), albeit not necessarily one that is coherent or unambiguous.
> > So I would recommend removing the current Formal Grammars from the definition of Lojban, and replacing them by one -- an explicit (3') -- that more credibly represents actual human language (but is unambiguous etc.).
>
> The only problem with that is that we don't have anyone else besides yourself competent enough to give an explicit (3'). I wouldn't even know what (3') has to look like. We can only do what we know how to do.

Even if this is true, the goal of formulating an explicit (3') is surely one the community should have, even if unable to achieve it yet. But starting to tackle (3') is not so daunting:

Step 1: What is the least clunky way of getting unambiguously from phonological words to logical form -- from the phonological words of Lojban sentences to the logical forms of Lojban sentences (with the notion of Lojban sentence defined by usage or consensus)? Any loglanger could have a stab at tackling this.
Step 2: Identify any devices that are absent from natlangs.
Step 3: Redo Step 1, without using devices identified in Step 2.

Reflecting on this further, during the couple of weeks it's taken for me to find the time to finish this reply, I would suggest that *official*, *definitional* specification of the grammar consist only of a set of sentences defined as pairings of phonological and logical forms (ideally, consistent with the 'monoparsing' precept that to every phonological form there must correspond no more than one logical form). Then, any rule set that generates that set of pairings would be deemed to count as a valid grammar of Lojban, and then from among the valid grammars we could seek the one(s) that are closest to those internalized by human speakers.

> > > Ok, but in Lojban there's almost a one-to-one match between phonological and syntactic words.
> >
> > That remains to be seen, because there isn't yet an explicit real syntax for Lojban. However, it's perfectly possible that in Lojban, phonology--syntax mismatches are rare.
> The only mismatch I'm aware of is "ybu", which is treated as a syntactic word even though phonologically it would break down into the hesitation "y" and the phonological word "bu".

We currently don't have a clear idea of what syntactic words Lojban has, where by "syntactic word" I mean ingredients of logicosyntactic form, the form that encodes logical structure. Some phonological words seem to correspond to chunks of logical structure rather than single nodes, and there will be instances of nodes in logical structure that don't correspond to anything in phonology (-- the most obvious example is ellipsis, which Lojban sensibly makes heavy use of).

> > I'm not sure if choosing a simple Lojban example is going to reveal why you can't have beliefs about binary branching in natlangs.
>
> What I meant to say is that I can't see a syntax as an intrinsic feature of a natlang, as opposed to being just a model, which can be a better or worse fit, but it can never be the language.

Are you holding for natlangs the view that I propose above for Lojban, namely that a language is a set of sentences, i.e. form--meaning correspondences, and although in practice there must be some system for generating that set, it doesn't matter what the system is, so long as it generates the right set, and therefore in that sense the system is not intrinsic to language? If Yes, I don't agree, but I think the position is coherent enough that I won't try to dissuade you from it. If not, do explain again what you mean.

> So I can accept that binary branching syntaxes are more elegant, more perspicuous, etc, I just can't believe they are a feature of the language, just like the description of a house is not a feature of the house. Maybe that's just me not being a linguist.

But could a description of an architectural plan of a house be an architectural plan of a house? Could a comprehensive explicit description of a code be a code? Surely yes, and the same for language.

> > In my case I think the rules matter because (i) to understand the system you need to understand its internal mechanics, and (ii) a speaker knows a certain set of rules, and it's known-rules that are my object of study.
>
> Yes, but can't those rules, or rather a part of those rules, be presented as a CS type grammar? I understand that the Lojban formal grammars as they are are something of a monstrosity, but what if they were cleaned up and made more human compatible? You seem to be saying that the very idea of a PEG/YACC/BNF type grammar is counter to a proper grammar, not just the particular poor choices made for the Lojban grammar.

I don't know how suitable PEG/YACC/BNF are for natlangs. I must ruefully confess I know nothing about PEG, despite all the work you've done with it. AFAIK linguists in the last half century haven't found BNF necessary or sufficient for their rules, but my meagre knowledge doesn't extend to knowing the mathematical properties of BNF and other actually used formalisms, and the relationships between them.

In denouncing the suitability of PEG/YACC/BNF, I was really meaning to denounce treating phonological stuff (e.g. phonological words) as constituents of terminal nodes in syntactic structures. You said that terminal nodes are actually selmaho and (iirc?) that the 1--1 correspondence between phonological words and selmaho terminal nodes is not essential. So in that case my objection would not be to CS grammars per se but only to the idea that a CS grammar can model a whole grammar rather than just, say, the combinatorics of syntax. So I reserve judgement on PEG et al: if they can represent logicosyntactic structure in full, then they have my blessing.

--And.
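
The suggestion above, that the definitional specification be only a set of pairings of phonological and logical forms, and that any rule set generating that set count as a valid grammar, can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the names SPEC, is_monoparsing, is_valid_grammar and the two toy grammars are hypothetical, the strings are placeholders rather than real Lojban analyses, and a finite set stands in for what would in practice be an unbounded set of sentences.

```python
# Toy sketch only: placeholder strings, not real Lojban forms or logical forms.
# SPEC plays the role of the "definitional" set of sentences, each one a
# (phonological form, logical form) pairing.
SPEC = {
    ("form-a", "LF1"),
    ("form-b", "LF2"),
}


def is_monoparsing(pairs):
    """Return True if no phonological form is paired with more than one logical form."""
    seen = {}
    for phon, logical in pairs:
        # setdefault records the first logical form seen for this phonological form
        if seen.setdefault(phon, logical) != logical:
            return False
    return True


def is_valid_grammar(generate, spec):
    """A candidate grammar counts as valid iff it generates exactly the specified pairings."""
    return set(generate()) == set(spec)


def grammar_by_lookup():
    """Hypothetical grammar 1: simply lists the pairings."""
    yield from [("form-a", "LF1"), ("form-b", "LF2")]


def grammar_by_rule():
    """Hypothetical grammar 2: derives the same pairings from a rule."""
    for i, suffix in enumerate("ab", start=1):
        yield (f"form-{suffix}", f"LF{i}")


if __name__ == "__main__":
    print(is_monoparsing(SPEC))
    print(is_valid_grammar(grammar_by_lookup, SPEC))
    print(is_valid_grammar(grammar_by_rule, SPEC))
```

On this view both toy grammars are equally valid, since they generate the same set; choosing between them (e.g. by closeness to what speakers internalize) is a separate question from the definition itself.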