From phma@webjockey.net Mon Jan 06 06:49:22 2003
Organization: dis
To: "'lojban-list@lojban.org'" <lojban-list@lojban.org>
Subject: [lojban] Bug in word break algorithm
Date: Mon, 6 Jan 2003 09:48:03 -0500
User-Agent: KMail/1.5
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
Message-Id: <200301060948.03611.phma@webjockey.net>
From: Pierre Abbat <phma@webjockey.net>

3] If the piece we have left starts with a vowel, find the first
consonant. If the first consonant is part of a consonant cluster
(only CC-form this time), and this consonant cluster is NOT a valid
initial cluster (with each adjacent pair of consonants being a valid
initial pair), then we can resolve the entire piece as a le'avla
(e.g. /antipAsto/); otherwise (if the first consonant is NOT part of
a consonant cluster, or the consonant cluster IS a valid initial
cluster), break off before the first consonant as a cmavo (e.g.
/a'ofArlu/ becomes /a'o/ = cmavo + /fArlu/ = unresolved; or,
/aismAcu/ becomes /ai/ = cmavo + /smAcu/ = unresolved).

This gives the wrong answer if the part after the vowel is a slinku'i. For
example, /esKRIma/ is broken into /e/ = cmavo + /sKRIma/ = unresolved, because
/skr/ is a valid initial cluster. How can I recognize a slinku'i by the
front-middle method or something similar?
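
To make the problem concrete, here is a rough Python sketch of step 3 as quoted
above. It is only an illustration, not the reference implementation: the names
step3 and is_valid_initial_cluster are made up for this example, capital
letters are treated purely as stress marks, the apostrophe is not counted as a
consonant, and the case of a piece with no consonant at all (which the quoted
step does not cover) is assumed to be a plain cmavo.

```python
# The 48 permissible initial consonant pairs of Lojban.
VALID_INITIAL_PAIRS = set("""
bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr
jb jd jg jm jv kl kr ml mr pl pr
sf sk sl sm sn sp sr st tc tr ts vl vr xl xr
zb zd zg zm zv
""".split())

CONSONANTS = set("bcdfgjklmnprstvxz")


def is_valid_initial_cluster(cluster):
    """True if every adjacent pair of consonants is a valid initial pair."""
    return all(cluster[i:i + 2] in VALID_INITIAL_PAIRS
               for i in range(len(cluster) - 1))


def step3(piece):
    """Apply the quoted step 3 to a piece that starts with a vowel.

    Returns a list of (label, text) tuples. Hypothetical helper.
    """
    low = piece.lower()  # capitals only mark stress
    start = next((i for i, ch in enumerate(low) if ch in CONSONANTS), None)
    if start is None:
        # Assumption: no consonant at all, so treat the piece as a cmavo.
        return [("cmavo", piece)]
    end = start
    while end < len(low) and low[end] in CONSONANTS:
        end += 1
    cluster = low[start:end]
    if len(cluster) >= 2 and not is_valid_initial_cluster(cluster):
        # Invalid initial cluster: the whole piece is a le'avla.
        return [("le'avla", piece)]
    # Single consonant or valid initial cluster: break off a cmavo.
    return [("cmavo", piece[:start]), ("unresolved", piece[start:])]


for word in ("antipAsto", "a'ofArlu", "aismAcu", "esKRIma"):
    print(word, step3(word))
```

The first three words come out as in the quoted examples. The last one is split
into /e/ + /sKRIma/, which is the case I am asking about; the sketch has no
slinku'i check of any kind.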

phma




