Date: Tue, 25 Aug 2009 01:07:48 +0000
From: Minimiscience
To: lojban-beginners@lojban.org
Subject: [lojban-beginners] Re: Distinguishing Type 3 and 4 fu'ivla

de'i li 24 pi'e 08 pi'e 2009 la'o fy. H. Felton .fy. cusku zoi skamyxatra.
> I'm still working on my learning program to help me with memorization,
> but I'm also thinking about a program that could _morphologically_
> identify "words" in Lojban text.  If I see "lenubrivla", the program
> could identify it as multiple words and split it into "le nu
> brivla" -- _this_ program would not attempt to identify meaning.
> Given what you just said, it seems that the categories will have to be:
> "multiple word" (needs to be split), "cmavo", "gismu", "lujvo", "Type 4
> fu'ivla", "possible Type 3 fu'ivla".
.skamyxatra

Why would {fu'ivla} be split into two separate types?  The division
between types 3 and 4 is a convention, and discriminating between them is
only relevant for semantic purposes, not morphological ones (i.e.,
identifying a type 3 {fu'ivla} lets you determine the general concept it
describes).  Also, your {vlalei} list omits {cmene}/{cmevla}.

> This also means that the program will identify "becrbai" as a lujvo (assuming
> I haven't made a mistake in generating a lujvo from those two rafsi)

You did make a mistake; that 'r' should be a 'y', since "cb" is not a
permissible consonant pair and so needs a 'y' hyphen between the two
{rafsi}.  'R' is only allowed as a hyphen in {lujvo} when the first
{rafsi} is CVV and either the {tanru} contains more than two words or the
second {rafsi} is not CCV.

mu'omi'e .kamymecraijun.

-- 
do ta'e pilno le va valsi .i mi na jinvi lodu'u ri valsi da poi do jinvi lodu'u ri valsi ke'a
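
For anyone who wants to experiment with the hyphen rule described above, here
is a minimal Python sketch.  It is not taken from any existing Lojban tool;
the names (join_rafsi, permissible_pair, shape) are made up for illustration.
It assumes the rafsi have already been chosen and are passed in order, and it
only covers the r/n hyphen after an initial CVV rafsi, the y hyphen after a
four-letter rafsi, and the y hyphen between an impermissible consonant pair.
It deliberately ignores the "tosmabru" test and the special triples (ndj,
ndz, ntc, nts), so treat it as a rough approximation.

```python
VOWELS = set("aeiou")
VOICED = set("bdgjvz")
UNVOICED = set("ptkfcsx")
SIBILANTS = set("cjsz")
FORBIDDEN_PAIRS = {"cx", "kx", "xc", "xk", "mz"}


def permissible_pair(a, b):
    """Return True if consonants a, b may stand next to each other."""
    if a == b:
        return False
    # Mixed voicing is forbidden (l, m, n, r are in neither set, so exempt).
    if (a in VOICED and b in UNVOICED) or (a in UNVOICED and b in VOICED):
        return False
    if a in SIBILANTS and b in SIBILANTS:
        return False
    return (a + b) not in FORBIDDEN_PAIRS


def shape(rafsi):
    """Consonant/vowel pattern, ignoring apostrophes (fu'i -> CVV)."""
    return "".join("V" if ch in VOWELS else "C" for ch in rafsi if ch != "'")


def join_rafsi(rafsi_list):
    """Join rafsi into a lujvo, inserting r/n/y hyphens (simplified)."""
    out = rafsi_list[0]
    for i in range(1, len(rafsi_list)):
        prev, nxt = rafsi_list[i - 1], rafsi_list[i]
        if (i == 1 and shape(prev) == "CVV"
                and (len(rafsi_list) > 2 or shape(nxt) != "CCV")):
            # r-hyphen after an initial CVV rafsi; n-hyphen before 'r'.
            out += "n" if nxt[0] == "r" else "r"
        elif shape(prev) in ("CVCC", "CCVC"):
            # A four-letter rafsi is always followed by 'y'.
            out += "y"
        elif (prev[-1] not in VOWELS and nxt[0] not in VOWELS
                and not permissible_pair(prev[-1], nxt[0])):
            out += "y"
        out += nxt
    return out


print(join_rafsi(["bec", "bai"]))
print(join_rafsi(["fu'i", "vla"]))
```

With the two rafsi from the quoted message, join_rafsi(["bec", "bai"]) puts a
'y' between 'c' and 'b', while join_rafsi(["fu'i", "vla"]) adds no hyphen
because the second rafsi is CCV.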