From seidensticker@msn.com Sun Mar 18 10:03:14 2001
Return-Path: <seidensticker@msn.com>
X-Sender: seidensticker@msn.com
X-Apparently-To: lojban@yahoogroups.com
Received: (EGP: mail-7_0_4); 18 Mar 2001 18:03:13 -0000
Received: (qmail 2502 invoked from network); 18 Mar 2001 18:03:12 -0000
Received: from unknown (10.1.10.26) by l10.egroups.com with QMQP; 18 Mar 2001 18:03:12 -0000
Received: from unknown (HELO mq.egroups.com) (10.1.1.36) by mta1 with SMTP; 18 Mar 2001 18:03:12 -0000
X-eGroups-Return: seidensticker@msn.com
Received: from [10.1.10.117] by mq.egroups.com with NNFMP; 18 Mar 2001 18:03:12 -0000
Date: Sun, 18 Mar 2001 18:03:08 -0000
To: lojban@yahoogroups.com
Subject: Breaking up compound cmavo
Message-ID: <992t8s+mls7@eGroups.com>
User-Agent: eGroups-EW/0.82
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Length: 541
X-Mailer: eGroups Message Poster
X-Originating-IP: 206.129.86.130
From: seidensticker@msn.com

I'm trying to figure out how to split compound cmavo, that is, cmavo 
that have been written together as one word. For example, consider 
"co'omi'e". My approach was to compare the word against a sorted cmavo 
list, growing the candidate token one character at a time until I 
found an exact match. The problem is that the first exact match 
is "co", which leaves "'omi'e", and nothing in the list matches 
that remainder. 
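
A minimal sketch of the approach described above, to make the failure 
concrete. The four-entry CMAVO list and both function names are 
hypothetical, chosen only for illustration; a real cmavo list is much 
larger. The second function tries the longest matching prefix instead 
of the shortest. That is only one possible alternative and is not 
claimed to be the standard rule for Lojban.

```python
# Hypothetical, tiny cmavo list for illustration only.
CMAVO = sorted(["co", "co'o", "mi", "mi'e"])


def shortest_prefix_split(word, cmavo=CMAVO):
    """Grow the token one character at a time; stop at the first exact match.

    Returns (tokens, leftover). A non-empty leftover means the split got stuck.
    """
    tokens = []
    rest = word
    while rest:
        for end in range(1, len(rest) + 1):
            if rest[:end] in cmavo:
                tokens.append(rest[:end])
                rest = rest[end:]
                break
        else:
            # No prefix of the remainder is in the list.
            return tokens, rest
    return tokens, ""


def longest_prefix_split(word, cmavo=CMAVO):
    """Assumed alternative: always take the longest prefix found in the list."""
    tokens = []
    rest = word
    while rest:
        for end in range(len(rest), 0, -1):
            if rest[:end] in cmavo:
                tokens.append(rest[:end])
                rest = rest[end:]
                break
        else:
            raise ValueError(f"no cmavo prefix in {rest!r}")
    return tokens


print(shortest_prefix_split("co'omi'e"))  # (['co'], "'omi'e")
print(longest_prefix_split("co'omi'e"))   # ["co'o", "mi'e"]
```

With this toy list, the shortest-match version stops at "co" and is 
left with "'omi'e", while the longest-match version yields "co'o" 
and "mi'e". 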

I'm assuming that there's a simple algorithm to parse these. Can 
someone point me to it?

Thanks.
Bob


