Date: Wed, 20 Apr 2011 09:22:15 -0600
From: ".alyn.post."
To: lojban-beginners@googlegroups.com
Subject: Re: [lojban-beginners] vlastezba: First beta version released!

I'm not getting the result you report:

$ echo "coirodo"|java -jar vlastezba.jar /dev/fd/0
Read file [/dev/fd/0], got [0] unique words.

This is also happening if I write the file and try it:

$ cat test.txt
coirodo
$ java -jar vlastezba.jar test.txt
Read file [test.txt], got [0] unique words.

Here is my java version:

$ java -version
java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07-334-10M3326)
Java HotSpot(TM) Client VM (build 19.1-b02-334, mixed mode)

-Alan

On Wed, Apr 20, 2011 at 04:51:51PM +0200, Johan Pretorius wrote:
> Hi Alan,
>
> That would indeed be an interesting experiment, I'd be quite keen to see
> the results myself.
>
> Right now, if you just call
>
> java -jar vlastezba.jar test.txt
>
> with some Lojban text (legal or otherwise) in test.txt, it will return (on
> stdout), one valsi per line.
> So "coirodo" would result in:
> coi
> ro
> do
> (you can make it go look up the definitions by passing a second parameter,
> but it will just add junk to the output that I don't think you'd want)
>
> Right now it doesn't check grammar at all, so you can throw any random
> collection of words at it (I don't intend for it to ever do this, there
> are tools out there that are far better at this than I could ever hope to
> make it).
>
> It also won't give you a classification of valsi - it doesn't "know" when
> it's dealing with a cmavo (or indeed what class), or a gismu, or a lujvo.
> This I DO intend to fix.
>
> I want to add other output formats anyway, so if you want me to do
> something specific to make your comparison easier, let me know. Now would
> be a good time, as I'm going away on holiday for a week, and wanted to
> spend at least a little bit of time on vlastezba.
>
> In fact, if you are comfortable with Java, feel free to make it do what
> you need, the source code is on [1]sourceforge.net
> ([2]http://sourceforge.net/projects/vlastezba/), and is GPL'ed :-)
>
> mu'o mi'e iu'an
>
> On Wed, Apr 20, 2011 at 4:29 PM, .alyn.post.
> <[3]alyn.post@lodockikumazvati.org> wrote:
>
> Do you have an external representation for your valsi parsing
> result? If I hand you the string "coirodo" is there a print
> form of that along the lines of ("coi" "ro" "do")?
>
> I would be interested seeing the result from processing a large
> data set of words and phrases and comparing that to jbogenturfa'i.
> In order to do this I'd need some output format from your program
> that I could parse.
>
> jbogenturfa'i uses the morphology PEG grammar that xorxes developed,
> so it contains code which I think is similar (and should be
> identical in result) to what you are doing:
>
> $ echo "coirodo"|jbogenturfahi --rafske
> ((cmavo (COI "coi")) (cmavo (PA "ro")) (cmavo (KOhA "do")))
>
> I'd be curious to know whether they are in fact producing identical
> results.
>
> -Alan
> On Wed, Apr 20, 2011 at 11:02:28AM +0200, Johan Pretorius wrote:
> > Hi all
> >
> > You can download it from here:
> >
> > [1][4]http://sourceforge.net/projects/vlastezba/files/vlastezba.jar/download
> >
> > I have completed the cmavo cluster breakout code, and tested it as far as
> > I was able.
> >
> > It should be easy enough to run if you have Java 1.6 installed, just go
> > java -jar vlastezba.jar and it will print out usage instructions.
> >
> > Please download it and test to pieces! I'd love all your feedback.
> >
> > Note that it doesn't get very smart at this stage - for instance, it won't
> > know what to do if you feed it a string of lojban that doesn't have any
> > spaces in. The only clever bit is that it's able to break apart cmavo
> > clusters if they don't have any spaces.
> >
> > Regards,
> > Johan
> >
> > --
> > Johan Pretorius
> > Cell: 0829268327
> > [2][5]pretoriusjf@gmail.com
> >
> > --
> > You received this message because you are subscribed to the Google Groups
> > "Lojban Beginners" group.
> > To post to this group, send email to
> > [6]lojban-beginners@googlegroups.com.
> > To unsubscribe from this group, send email to
> > [7]lojban-beginners+unsubscribe@googlegroups.com.
> > For more options, visit this group at
> > [8]http://groups.google.com/group/lojban-beginners?hl=en.
> >
> > References
> >
> > Visible links
> > 1. [9]http://sourceforge.net/projects/vlastezba/files/vlastezba.jar/download
> > 2. mailto:[10]pretoriusjf@gmail.com
>
> --
> .i ma'a lo bradi ku penmi gi'e du
>
> --
> Johan Pretorius
> Cell: 0829268327
> [14]pretoriusjf@gmail.com
>
> References
>
> Visible links
> 1. http://sourceforge.net/
> 2. http://sourceforge.net/projects/vlastezba/
> 3. mailto:alyn.post@lodockikumazvati.org
> 4. http://sourceforge.net/projects/vlastezba/files/vlastezba.jar/download
> 5. mailto:pretoriusjf@gmail.com
> 6. mailto:lojban-beginners@googlegroups.com
> 7. mailto:lojban-beginners%2Bunsubscribe@googlegroups.com
> 8. http://groups.google.com/group/lojban-beginners?hl=en
> 9. http://sourceforge.net/projects/vlastezba/files/vlastezba.jar/download
> 10. mailto:pretoriusjf@gmail.com
> 11. mailto:lojban-beginners@googlegroups.com
> 12. mailto:lojban-beginners%2Bunsubscribe@googlegroups.com
> 13. http://groups.google.com/group/lojban-beginners?hl=en
> 14. mailto:pretoriusjf@gmail.com

--
.i ma'a lo bradi ku penmi gi'e du
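
The comparison discussed in the quoted thread could be sketched roughly as follows. This is only an illustration in Python, not part of either tool: the function names (words_from_rafske, words_from_lines, same_split) are hypothetical, the s-expression line is the jbogenturfahi --rafske output quoted above, and the one-word-per-line text is what vlastezba is described as printing (the test run at the top of this message did not reproduce it). It assumes any status line such as "Read file [...]" has already been removed from the vlastezba output.

```python
import re


def words_from_rafske(sexpr):
    """Return the double-quoted word strings found in an s-expression line."""
    return re.findall(r'"([^"]*)"', sexpr)


def words_from_lines(text):
    """Return the words from one-word-per-line output, skipping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def same_split(rafske_output, per_line_output):
    """True when both outputs contain the same words in the same order."""
    return words_from_rafske(rafske_output) == words_from_lines(per_line_output)


if __name__ == "__main__":
    # Sample data copied from the thread; the per-line form is the
    # described (not observed) vlastezba output for "coirodo".
    rafske = '((cmavo (COI "coi")) (cmavo (PA "ro")) (cmavo (KOhA "do")))'
    per_line = "coi\nro\ndo\n"
    print(same_split(rafske, per_line))
```

This only compares the word boundaries; the selma'o labels (COI, PA, KOhA) in the s-expression are ignored, since vlastezba is said not to classify valsi yet.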