From: Jorge Llambías <jjllambias@gmail.com>
To: lojban-list@lojban.org
Date: Sat, 23 Jan 2010 11:14:05 -0300
Subject: Re: [lojban] camxes's reaction to some fu'ivla

On Sat, Jan 23, 2010 at 12:42 AM, Pierre Abbat wrote:
> I installed camxes (the compiled jar) and ran it on some words. I gave it all
> the months of the Roman calendar (ianvari, frebuari, martio, prilio, madjio,
> djunio, djulio, avgusto, septembero, oktobero, novmbero, decmbero; kuintili
> (=djulio), sektili (=avgusto), mercedonio (noi setca ke'a lo frebuari)), and
> it accepted them all.
> Vlatai rejects zo martio bi'o zo djulio.

Right. camxes accepts the CS syllable onset (where S stands for the
semivowels i, u), although there has never been an official decision
one way or the other about this. Why does vlatai reject them? I thought
vlatai was generally more permissive than camxes about these things.

> "soiombo" (a Mongolian symbol) and "peuence" (the Chilean people of the
> monkey-puzzle tree) it rejects.

It should read them as "so iombo", "pe uence". Basically, what camxes
does is treat the semivowels i/u as onsets and thus any word that starts
with them doesn't require a ".".

> These would be valid if the rule were "The
> second consonant must be next to another consonant when y'y and ybu are
> ignored", but with the current rule, "There must be a consonant cluster in
> the first five letters, ignoring y'y and ybu", they're not. However, it
> considers "soiombo" to break into "so iombo". This I think is incorrect. A
> word beginning with a vowel must be preceded by a pause, else you can't
> tell "so iombo" from "soi ombo". Both programs accept "ricrpeuence".

"soi.ombo" requires a glottal stop. "soiombo" is unambiguously "so iombo".

The rule about "the first five letters" is a bit nonsensical from a
phonological point of view. There is no sensible reason to reject
"ba'auski" if ".a'auski" is accepted. The camxes rule basically is: if
it consists of Lojban non-y syllables with penultimate stress, it ends
in an open syllable, and the initial part doesn't fall off as a cmavo or
slinku'i consonant, then it's a valid fu'ivla. (Also, the first syllable
can't be a consonantal syllable.)

> camxes rejects "aierne". Its attempt at a parse is simply "a" (".a .ierne" is
> not the start of anything grammatical, as "a" is expecting a sumti to
> follow), even though ".ai .erne" is grammatical.

It's grammatical for example in "zo .a ierne", or "lo'u .a ierne le'u".
In any case, the lexer doesn't care about syntax, it just breaks a
stream of phonemes into words.
Whether those words end up forming a grammatical utterance or not is not
a concern for the lexer.

> camxes accepts "amliau" and not "mliau";

Yes, because "am" and "liau" are valid syllables. camxes doesn't like
CCS as syllable onset.

> jbofi'e accepts "mliau" as a word and
> splits "amliau" into a misparse. Whichever one is valid means "meow".

(Although the phonological issues that still have to be officially
settled are relatively few and marginal, the BPFK should make a decision
about them at some point. It probably doesn't look very good for Lojban
that we still haven't made up our minds about them.)

mu'o mi'e xorxes
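
The two cluster-position rules quoted in the message can be compared with
a small Python sketch. It checks only where the first consonant cluster
falls. It is not camxes, vlatai, or jbofi'e, and it says nothing about
syllable structure, stress, or slinku'i. The function names are made up
for illustration, and dropping the pause "." together with ' and y is an
assumption about how the letters are counted.

```python
# Lojban consonant letters; everything else left after filtering is a vowel.
CONSONANTS = set("bcdfgjklmnprstvxz")


def letters(word):
    """Lowercase the word and drop '.', "'" and 'y' (assumed to be ignored)."""
    return [ch for ch in word.lower() if ch not in ".'y"]


def cluster_in_first_five(word):
    """Rule as quoted: a consonant cluster within the first five letters."""
    head = letters(word)[:5]
    return any(a in CONSONANTS and b in CONSONANTS
               for a, b in zip(head, head[1:]))


def second_consonant_clustered(word):
    """Alternative as quoted: the second consonant is next to another consonant."""
    ls = letters(word)
    positions = [i for i, ch in enumerate(ls) if ch in CONSONANTS]
    if len(positions) < 2:
        return False
    i = positions[1]  # index of the second consonant
    before = i > 0 and ls[i - 1] in CONSONANTS
    after = i + 1 < len(ls) and ls[i + 1] in CONSONANTS
    return before or after


if __name__ == "__main__":
    for w in ["martio", "soiombo", "peuence", "ba'auski", ".a'auski"]:
        print(w, cluster_in_first_five(w), second_consonant_clustered(w))
```

On these five words the first-five-letters rule passes "martio" and
".a'auski" but fails "soiombo", "peuence" and "ba'auski", while the
second-consonant rule passes all five.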