From: Zefram <zefram@fysh.org>
To: lojban-list@lojban.org
Date: Thu, 13 May 2004 22:38:04 +0100
Subject: [lojban] erasure words

Looking at the "sa" debate, it appears that people have come up with
more than one useful set of semantics for it:

* erase up to and including the previous instance of the following word
* erase up to and including the previous word of the same selma'o as
  the following word
* erase until the next word can legally follow

To which I'd like to add another possibility along the same lines:

* erase up to and including the previous word that is in the same
  category as the following word, using broader categories than
  selma'o, so that "le broda sa la broda" preprocesses to "la broda"

And I came to the conclusion that we've got more useful erase operators
than we have words assigned to them. Perhaps some of the expanded cmavo
space should be earmarked for erase operators. Btw, this earmarking is
a protocol engineering technique, and I highly recommend it.
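
For concreteness, here is a minimal Python sketch of the word-matching
semantics listed above. It is an illustration under stated assumptions,
not an agreed definition of "sa": the function name apply_sa, the toy
SELMAHO and BROAD tables, and the choice to raise an error when nothing
earlier matches are all invented for this example. The "erase until the
next word can legally follow" option needs a real grammar and is not
attempted.

```python
# Toy word lists: illustrative only, nowhere near a full Lojban lexicon.
SELMAHO = {"le": "LE", "lo": "LE", "la": "LA",
           "broda": "BRIVLA", "brode": "BRIVLA"}
# Hypothetical broader categories that merge related selma'o.
BROAD = {"LE": "descriptor", "LA": "descriptor", "BRIVLA": "brivla"}


def same_word(word):
    """Match only another instance of the very same word."""
    return word


def selmaho(word):
    """Match any word of the same selma'o (toy table lookup)."""
    return SELMAHO[word]


def broad(word):
    """Match any word in the same broader category (assumed grouping)."""
    return BROAD[SELMAHO[word]]


def apply_sa(words, category):
    """Resolve each "sa" in words, comparing words by category(word)."""
    out = []
    i = 0
    while i < len(words):
        if words[i] == "sa":
            if i + 1 >= len(words):
                raise ValueError("'sa' with no following word")
            target = category(words[i + 1])
            # Drop words from the end of the output until a word of the
            # target category has itself been dropped.
            while out:
                if category(out.pop()) == target:
                    break
            else:
                # Assumption: an unmatched "sa" is reported, not ignored.
                raise ValueError("nothing earlier matches the word after 'sa'")
            i += 1  # skip "sa"; the following word is kept on the next pass
        else:
            out.append(words[i])
            i += 1
    return out


print(apply_sa("le broda sa le brode".split(), same_word))
print(apply_sa("le broda sa lo brode".split(), selmaho))
print(apply_sa("le broda sa la broda".split(), broad))
```

With the broad categories, "le broda sa la broda" comes out as
"la broda". Under the selma'o rule the same text finds no earlier word
of selma'o LA, so this sketch raises an error; what should really
happen in that case is left open here.
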
If a Lojban parser sees a cmavo that it doesn't know, being able to
tell at least whether it is an erase operator would be *very* helpful.
Encountering an unrecognised/unimplemented erase operator throws the
whole text into question, and should cause immediate complaint, whereas
an unrecognised non-erase cmavo (even of unknown selma'o) is more
recoverable.

I also think part of the "sa" debate is happening because people are
trying to define it in a very low-level way, operating on words without
regard for grammar. Such low-level operators are indeed useful, but
they're not sufficient for a good preprocessor. See C's preprocessor
for a demonstration of this problem, and Lisp's macros as a
counterpoint.

I'd like to have some higher-level erase operators that parse what has
gone before and act on that. These would be used to correct
higher-level errors: because they require grammatical text they
couldn't fix grammatical errors, but they would be useful when the
wrong grammatical text has been said. Operators to think about:

* erase the sumti currently in progress or just completed
* erase the bridi currently in progress or just completed
* erase back to and including the opening delimiter matched by the
  closing delimiter that follows the erase word

This is just what seems useful to me based on a couple of weeks'
experience; I'd like to see the opinions of more experienced
Lojbanists.

-zefram