From nessus@free.fr Fri Dec 06 00:58:31 2002
Message-ID: <005f01c29d05$7b3e6840$34c70950@tanj>
To: <lojban-list@lojban.org>
References: <02120414202304.01986@neofelis> <5.1.0.14.0.20021205200740.00ac9740@pop.east.cox.net>
Subject: [lojban] Re: cmegadri valfendi preti
Date: Fri, 6 Dec 2002 09:56:44 +0100
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2800.1106
X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2800.1106
X-archive-position: 3127
X-ecartis-version: Ecartis v1.0.0
Sender: lojban-list-bounce@lojban.org
Errors-to: lojban-list-bounce@lojban.org
X-original-sender: nessus@free.fr
Precedence: bulk
X-list: lojban-list
From: "Lionel Vidal" <nessus@free.fr>
Reply-To: nessus@free.fr
X-Yahoo-Group-Post: member; u=47678341
X-Yahoo-Profile: cmacinf

Nora LeChevalier:
> I used a backward algorithm because a forward algorithm is susceptible to
> garden-pathing. For example,
> 'miKLAmaleZARcifuleKARcegi'eBEVrileDAKlis'
> is a name, but you don't know it until the final letter.

I am not sure I fully understand the expression 'garden-pathing', but
I do think your example illustrates my point :-)
Suppose you hear it in conditions good enough to identify each sound
and stress clearly: even then, you cannot simply wait for the final 's'
before starting to parse, memorizing everything exactly along the way
and greeting the final 's' with great relief because it means you won't
have to go through it all again from the beginning!

What I meant by forward parsing is a single forward pass that carries
along the set of remaining possibilities, which IMO is very close to the
process humans follow to parse any language. Humans are not very good at
backtracking over meaningless sounds, but are much better at doing so in
a limited way over meaningful entities.
In your example, some steps in the evolution of the parsing set could be
something like the following. (* stands for anything. The rule is that as
soon as the set is a singleton, the parse up to that point is done, but
you must still wait until the whole chunk has been parsed to set the
validity flag.)

{mi} : (mi *) or *
{miKLAmaleZAR} : (mi klama (le * or *)) or cmene
{miKLAmaleZARcifu} : (mi klama le zarci (fu * or *)) or cmene
and so on until the final 's', where only the cmene option is left.
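
To make the idea concrete, here is a minimal Python sketch of one forward
pass that carries the set of remaining possibilities. It is only an
illustration under several assumptions, not a worked-out algorithm: the
five-word lexicon, the function name forward_parse, and the trace format
are all hypothetical; stress is ignored (the chunk is simply lower-cased);
'Vla' is read as "a vowel followed by 'la'"; and a chunk can only be a
cmene if its last letter is a consonant.

```python
VOWELS = set("aeiouy")
# Hypothetical toy lexicon, just enough to cover the example chunk.
TOY_LEXICON = frozenset({"mi", "klama", "le", "zarci", "fu"})


def forward_parse(chunk, lexicon=TOY_LEXICON):
    """Scan the chunk once, left to right, keeping every live hypothesis.

    A word hypothesis is (completed_words, pending_letters).
    The cmene hypothesis is a single flag: "the whole chunk is one name".
    Returns (readings, trace).
    """
    stream = chunk.lower()          # stress marks are ignored in this sketch
    word_hyps = {((), "")}
    cmene_alive = True
    trace = []
    for i, ch in enumerate(stream):
        grown_hyps = set()
        for done, pending in word_hyps:
            grown = pending + ch
            # Keep the hypothesis while it can still grow into a known word.
            if any(word.startswith(grown) for word in lexicon):
                grown_hyps.add((done, grown))
            # Also branch on "a word ends here".
            if grown in lexicon:
                grown_hyps.add((done + (grown,), ""))
        word_hyps = grown_hyps
        # Assumed reading of the 'Vla' rule: vowel + "la" rules out a cmene.
        if i >= 2 and stream[i - 1:i + 1] == "la" and stream[i - 2] in VOWELS:
            cmene_alive = False
        trace.append((stream[:i + 1], len(word_hyps), cmene_alive))
    # Validity is only decided once the whole chunk has been consumed.
    readings = sorted(" ".join(done) for done, pending in word_hyps if not pending)
    last = stream[-1:]
    if cmene_alive and last.isalpha() and last not in VOWELS:
        readings.append("cmene")
    return readings, trace


if __name__ == "__main__":
    for text in ("miKLAmaleZARcifu", "miKLAmaleZARcifus"):
        readings, _ = forward_parse(text)
        print(text, "->", readings)
```

With this toy lexicon, the chunk without the final 's' keeps only the
word-by-word reading, while adding the 's' kills every word hypothesis and
leaves the cmene option alone. The sketch does no error reporting at all;
that is the tricky part.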

Note that if I had met 'Vla' along the parse, the cmene option
would have gone away, and I would have cut the chunk at that point
and probably flagged an error like 'missing pause before cmene' after
the already parsed part.
I admit I have not yet completely worked out this algorithm (and one
part of it, namely giving semantics to the parse errors, is actually
rather tricky), but I think it could be easier to use and, more
importantly, easier to prove correct than the current one.

-- Lionel
