From: "Chris Capel"
To: lojban-list@lojban.org
Date: Mon, 16 Jun 2008 18:10:41 -0500
Subject: [lojban] Re: PEG grammar issues

On Mon, Jun 16, 2008 at 11:34 AM, Jorge Llambías wrote:
> On Sun, Jun 15, 2008 at 11:39 PM, Chris Capel wrote:
>>
>> First, the top-level production should fail if it can't parse the
>> whole string. Currently 'text' ends with an EOF?, which makes it never
>> fail.
>
> I think that was on purpose: parse as much as you can parse, and
> discard anything unparsable that follows.

Sure, but I think that both behaviors are needed in different contexts. But it doesn't really matter much--individual parsers will do what they will.

>> Second, selbri-3 should parse its child selbri-4 into left-associative
>> groups.
>
> The same applies to statement-1, bridi-tail-1 and sumti-2, right?

Don't think so, maybe, and maybe, except that in the last two cases, the parse tree actually shows them as right-associative, which would make it harder to fix. But I'm not terribly clear on the grammar (in the wider sense) here.

No to 'statement-1' because I don't think statement-2's really have associativity, so the correct parse tree would be flat, and the current parse tree is flat, so it's not broken. Same thing for bridi-tail-1 and sumti-2--do those really have associativity? Does it matter which order you interpret the giheks or jeks?
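
To make the grouping question concrete, here is a minimal Python sketch of the difference between a flat list of children and its left- or right-associative grouping. It is only an illustration: the function names and the nested-tuple representation are made up here and are not part of the PEG grammar or of any existing parser.

```python
from functools import reduce

# Hypothetical helpers: a flat parse result is modelled as a plain list,
# and a grouped result as nested 2-tuples. Both expect a non-empty list.

def group_left(items):
    """Fold [a, b, c] into ((a, b), c), i.e. left-associative grouping."""
    return reduce(lambda acc, item: (acc, item), items)

def group_right(items):
    """Fold [a, b, c] into (a, (b, c)), i.e. right-associative grouping."""
    return reduce(lambda acc, item: (item, acc), reversed(items))

# Stand-ins for three sibling selbri-4 nodes under one selbri-3.
flat = ["selbri4_a", "selbri4_b", "selbri4_c"]

print(group_left(flat))   # left-grouped, as suggested for selbri-3
print(group_right(flat))  # right-grouped, as the current tree shows for some rules
```

If the order of interpretation doesn't matter for a given rule, the flat list is all a consumer needs, and either fold can be applied after parsing.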

>> Third, tenses that probably ought to be parsed as part of the bridi
>> are currently being parsed as head terms, because of the term-1
>> production:
>> I'm not exactly sure how this one needs to be
>> fixed, but what about this:
>>
>> term-1 <- sumti / term-2 / termset / NA-clause KU-clause free*
>>
>> term-2 <- !gek (tag (sumti / KU-clause free*) / FA-clause free*
>> (sumti / KU-clause? free*) )
>
> That makes it impossible to omit {ku} in other positions as well.
> For example, {mi ka'e pu klama} would fail.
>
> How about "!gek !selbri" instead of just "!gek" in the original rule?

Sounds good to me.

>> Fourth, 'term-sa' only appears to match one term sa under some
>> conditions. For instance, it doesn't match this:
>>
>> mi ba klama lo sa lo sa do
>>
>> which one might imagine could be said by someone with a stutter.
>> Here's one possible fix:
>>
>> term-sa <- term-start (!term-1 (sa-word / SA-clause !term-1) )*
>> SA-clause &term-1
>
> SA ought to be ditched or completely reformulated, IMHO.

Well, my suggestion's there for posterity's sake.

Chris Capel
--
"What is it like to be a bat? What is it like to bat a bee? What is it like to be a bee being batted? What is it like to be a batted bee?"
-- The Mind's I (Hofstadter, Dennett)