From: "Lionel Vidal" <nessus@free.fr>
Reply-to: lojban-list@lojban.org
Subject: [lojban] Re: cmegadri valfendi preti
Date: Fri, 6 Dec 2002 09:56:44 +0100
Message-ID: <005f01c29d05$7b3e6840$34c70950@tanj>
References: <02120414202304.01986@neofelis> <5.1.0.14.0.20021205200740.00ac9740@pop.east.cox.net>
X-list: lojban-list

Nora LeChevalier:
> I used a backward algorithm because a forward algorithm is susceptible to
> garden-pathing. For example,
> 'miKLAmaleZARcifuleKARcegi'eBEVrileDAKlis'
> is a name, but you don't know it until the final letter.

I am not sure I really understand the expression 'garden-pathing', but I do think your example illustrates my point :-) Suppose you hear it in conditions good enough to identify each sound and stress clearly: even then you can't just wait for the final 's' to start parsing, memorizing everything exactly along the way and greeting with great relief the final 's', which means you won't have to go through it all again from the beginning!

What I meant by forward parsing is a single forward pass that carries along the set of remaining possibilities, which IMO is very close to the process humans follow to parse any language. Humans are not very good at backtracking over meaningless sounds, but are much better at doing so in a limited way over meaningful entities.

In your example, some steps in the evolution of the parsing set could be something like the following (* stands for anything, and the rule is that as soon as the set is a singleton, the parse to that point is done, but you must still wait until the end of the whole chunk to set the validity flag):

{mi} : (mi *) or *
{miKLAmaleZAR} : (mi klama (le * or *)) or cmene
{miKLAmaleZARcifu} : (mi klama le zarci (fu * or *)) or cmene

and so on until the final 's', where only the cmene option is left.

Note that if I had met 'Vla' along the parse, the cmene option would have gone away, and I would have cut the chunk at that point and probably spotted an error like 'missing pause before cmene' after the already parsed part.

I admit I have not yet completely worked out this algorithm (and one part of it, namely giving a semantics to the parse errors, is actually rather tricky), but I think it could be easier to use and, more importantly, easier to prove correct than the current one.

-- 
Lionel
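
As a rough illustration of the one-pass idea described in the message (not the algorithm itself, which the message says is not yet fully worked out), here is a minimal Python sketch. It tracks only two coarse hypotheses for a pause-delimited chunk, "cmene" and "words", instead of the full set of partial parses. The function name `forward_parse`, the hypothesis labels, and the reading of 'Vla' as "a vowel followed by la" are all assumptions made for this example.

```python
VOWELS = set("aeiouy")


def forward_parse(chunk: str) -> set:
    """Single forward pass over one chunk, returning the surviving hypotheses.

    Hypothetical toy model: 'cmene' means the whole chunk is a name,
    'words' means it is a string of ordinary Lojban words.
    """
    text = chunk.lower()  # stress (capitals) is ignored in this sketch
    alive = {"words", "cmene"}

    for i in range(2, len(text)):
        # Assumed reading of the 'Vla' rule: "la" right after a vowel
        # cannot occur inside a name, so the cmene hypothesis dies here.
        if text[i - 1:i + 1] == "la" and text[i - 2] in VOWELS:
            alive.discard("cmene")

    # Only at the end of the chunk can the final letter decide:
    # a cmene ends in a consonant, any other word ends in a vowel.
    if text and text[-1] in VOWELS:
        alive.discard("cmene")
    else:
        alive.discard("words")
    return alive


if __name__ == "__main__":
    print(forward_parse("miKLAmaleZARcifuleKARcegi'eBEVrileDAKlis"))
    print(forward_parse("miKLAmaleZARci"))
    print(forward_parse("mialas"))
```

An empty result means neither hypothesis survived; in the terms of the message, that is where an error such as 'missing pause before cmene' would be reported. A fuller version would keep the partial parses themselves in the set rather than two labels.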