From phma@webjockey.net Wed Jan 08 16:03:13 2003
Organization: dis
To: lojban-list@lojban.org (lojban-list@lojban.org)
Subject: [lojban] Re: zoizoi
Date: Wed, 8 Jan 2003 19:02:29 -0500
User-Agent: KMail/1.5
References: <200301081430.JAA05470@mail2.reutershealth.com>
In-Reply-To: <200301081430.JAA05470@mail2.reutershealth.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Disposition: inline
Message-Id: <200301081902.30179.phma@webjockey.net>
From: Pierre Abbat <phma@webjockey.net>
Reply-To: phma@webjockey.net
X-Yahoo-Group-Post: member; u=92712300

On Wednesday 08 January 2003 09:19, John Cowan wrote:
> Pierre Abbat scripsit:
> > How should the word breaking program handle such strings as:
> > /zoizoi.borZOI.zoi/
>
> This one is a valid zoi-quote, although I consider it poor Lojban style
> due to the embedded "ZOI". OTOH, I quite like the use of "zoi" as a
> delimiter word, and mention it in CLL.
>
> > /zoi.FLAluKAVbu.blableblibloblu.FLAluKAVbu./
>
> This is an error. After "zoi" the delimiter is "FLAlu", but then there is
> no pause, which is required by zoi-quote syntax. The pause between "zoi"
> and "FLAlu" is ignored.

The way I'm going to do it (after I have it lexing all words other than {zoi}, 
{fa'o}, and a few other specials) is as follows:
1. Mark the piece after {zoi} as a delimiter.
2. Search the pieces after the delimiter for one whose beginning matches
the delimiter, ignoring capitalization and commas.
3. Mark all pieces between the delimiters as zoi-quoted stuff.
4. Break the piece containing the ending delimiter after the delimiter, if 
there is any more to it, and mark the ending piece as a delimiter.
5. Make sure that the delimiter is a single word.
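
A rough sketch of these five steps in Python, offered only as an illustration and not as the actual lexer. It assumes the input has already been cut into "pieces" (plain strings) by the earlier passes. The kind labels ("zoi", "delim", "quoted", "other") and the names fold, split_after and mark_zoi are made up for the example, and is_single_word is a hypothetical stand-in for whatever word check step 5 ends up using. The sketch runs that check as soon as the delimiter is known rather than last.

```python
from typing import Callable, List, Tuple

Marked = Tuple[str, str]  # (kind, text); the kind labels are hypothetical


def fold(piece: str) -> str:
    """Lowercase and drop commas: the comparison rule from step 2."""
    return piece.lower().replace(",", "")


def split_after(piece: str, n: int) -> Tuple[str, str]:
    """Split a piece after its first n non-comma characters (step 4)."""
    seen = 0
    for pos, ch in enumerate(piece):
        if seen == n:
            return piece[:pos], piece[pos:]
        if ch != ",":
            seen += 1
    return piece, ""


def mark_zoi(pieces: List[str],
             is_single_word: Callable[[str], bool] = lambda w: True
             ) -> List[Marked]:
    """Label every piece; is_single_word is a placeholder for step 5."""
    work = list(pieces)  # copy, because step 4 may rewrite a piece
    out: List[Marked] = []
    i = 0
    while i < len(work):
        if fold(work[i]) != "zoi":
            out.append(("other", work[i]))
            i += 1
            continue
        out.append(("zoi", work[i]))
        if i + 1 >= len(work):
            raise ValueError("zoi with nothing after it")
        delim = work[i + 1]                       # step 1
        if not is_single_word(delim):             # step 5, checked early here
            raise ValueError(f"delimiter {delim!r} is not a single word")
        out.append(("delim", delim))
        key = fold(delim)
        # Step 2: first later piece whose folded form starts with the delimiter.
        end = next((j for j in range(i + 2, len(work))
                    if fold(work[j]).startswith(key)), None)
        if end is None:
            raise ValueError(f"no closing delimiter for {delim!r}")
        out.extend(("quoted", p) for p in work[i + 2:end])   # step 3
        head, rest = split_after(work[end], len(key))        # step 4
        out.append(("delim", head))
        if rest:
            work[end] = rest  # leftover text is scanned as an ordinary piece
            i = end
        else:
            i = end + 1
    return out


print(mark_zoi(["zoi", "zoi", "borZOI", "zoi"]))
# [('zoi', 'zoi'), ('delim', 'zoi'), ('quoted', 'borZOI'), ('delim', 'zoi')]
```

The default is_single_word accepts everything, so a real check has to be passed in before step 5 means anything. split_after counts non-comma characters so that a closing piece with commas inside the delimiter is still cut in the right place. It assumes ASCII text, where lowercasing never changes a string's length.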

I am currently starting the brivla-end-breaking routine, which is simpler than 
the brivla-start-breaking routine. The second example, or actually a slight
variant of it, tells me that the cmavo preceding a brivla have to be broken
off before the brivla is broken from what follows it. Consider
/zoiFLAluKAVbu.blableblibloblu.FLAluKAVbu./. If the part after the brivla is
broken off before or at the same time as the part before it, the input will
already be /zoi/ /FLAlu/ /KAVbu/ /blableblibloblu/ /FLAluKAVbu/ by the time
the {zoi} is detected, and the lexer will erroneously call
{kavbu blableblibloblu} the quote.
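
To make the ordering problem concrete, here is a small Python sketch of just the delimiter search, run on two piece lists. The first list is the one given above. The second is only a guess at what the pieces would look like if {zoi} is split off and detected before the brivla lump is broken up. The names fold and naive_zoi_quote are invented for the example.

```python
from typing import List, Tuple


def fold(piece: str) -> str:
    """Lowercase and drop commas before comparing pieces."""
    return piece.lower().replace(",", "")


def naive_zoi_quote(pieces: List[str]) -> Tuple[str, List[str]]:
    """Return (delimiter, quoted pieces) for a list that starts with zoi."""
    if len(pieces) < 2 or fold(pieces[0]) != "zoi":
        raise ValueError("expected zoi followed by a delimiter")
    delim = fold(pieces[1])
    for j in range(2, len(pieces)):
        if fold(pieces[j]).startswith(delim):
            return pieces[1], pieces[2:j]
    raise ValueError("no closing delimiter")


# Brivla already broken from what follows it when zoi is detected.
too_early = ["zoi", "FLAlu", "KAVbu", "blableblibloblu", "FLAluKAVbu"]
# Assumed intermediate state: only the preceding cmavo has been broken off.
cmavo_first = ["zoi", "FLAluKAVbu", "blableblibloblu", "FLAluKAVbu"]

print(naive_zoi_quote(too_early))    # ('FLAlu', ['KAVbu', 'blableblibloblu'])
print(naive_zoi_quote(cmavo_first))  # ('FLAluKAVbu', ['blableblibloblu'])
```

In the second case the delimiter piece is still the whole lump, so a single-word check on the delimiter has a chance to reject it instead of silently quoting the wrong text.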

phma




