Date: Wed, 20 Apr 2011 09:12:14 -0600
From: ".alyn.post."
To: lojban-beginners@googlegroups.com
Subject: Re: [lojban-beginners] vlastezba: First beta version released!
Message-ID: <20110420151214.GC49678@alice.local>
References: <20110420142911.GB49678@alice.local>

I can more or less work with what it does now, so that is sufficient for experimentation.
I routinely write code like |if(var=="foo")| when I mean |if(var.equals("foo"))|; my Java isn't what it could be.

I'm able to parse XML for tree-structured data, which is probably the easiest choice for interoperability:

XML:
coi ro do coi ro do

If this makes you cringe, then how about:

csv:
klesi,valsi
COI,coi
PA,ro
KOhA,do

Which unfortunately doesn't include the input string; I don't see a simple way to do that that is normal (as in normal form).

-Alan

On Wed, Apr 20, 2011 at 04:51:51PM +0200, Johan Pretorius wrote:
> Hi Alan,
>
> That would indeed be an interesting experiment, I'd be quite keen to see the results myself.
>
> Right now, if you just call
>
> java -jar vlastezba.jar test.txt
>
> with some Lojban text (legal or otherwise) in test.txt, it will return (on stdout), one valsi per line. So "coirodo" would result in:
> coi
> ro
> do
> (you can make it go look up the definitions by passing a second parameter, but it will just add junk to the output that I don't think you'd want)
>
> Right now it doesn't check grammar at all, so you can throw any random collection of words at it (I don't intend for it to ever do this, there are tools out there that are far better at this than I could ever hope to make it).
>
> It also won't give you a classification of valsi - it doesn't "know" when it's dealing with a cmavo (or indeed what class), or a gismu, or a lujvo. This I DO intend to fix.
>
> I want to add other output formats anyway, so if you want me to do something specific to make your comparison easier, let me know. Now would be a good time, as I'm going away on holiday for a week, and wanted to spend at least a little bit of time on vlastezba.
>
> In fact, if you are comfortable with Java, feel free to make it do what you need, the source code is on [1]sourceforge.net ([2]http://sourceforge.net/projects/vlastezba/), and is GPL'ed :-)
>
> mu'o mi'e iu'an
>
> On Wed, Apr 20, 2011 at 4:29 PM, .alyn.post.
> <[3]alyn.post@lodockikumazvati.org> wrote:
>
> Do you have an external representation for your valsi parsing result? If I hand you the string "coirodo" is there a print form of that along the lines of ("coi" "ro" "do")?
>
> I would be interested in seeing the result from processing a large data set of words and phrases and comparing that to jbogenturfa'i. In order to do this I'd need some output format from your program that I could parse.
>
> jbogenturfa'i uses the morphology PEG grammar that xorxes developed, so it contains code which I think is similar (and should be identical in result) to what you are doing:
>
> $ echo "coirodo"|jbogenturfahi --rafske
> ((cmavo (COI "coi")) (cmavo (PA "ro")) (cmavo (KOhA "do")))
>
> I'd be curious to know whether they are in fact producing identical results.
>
> -Alan
> On Wed, Apr 20, 2011 at 11:02:28AM +0200, Johan Pretorius wrote:
> > Hi all
> >
> > You can download it from here:
> > [1][4]http://sourceforge.net/projects/vlastezba/files/vlastezba.jar/download
> >
> > I have completed the cmavo cluster breakout code, and tested it as far as I was able.
> >
> > It should be easy enough to run if you have Java 1.6 installed, just go java -jar vlastezba.jar and it will print out usage instructions.
> >
> > Please download it and test to pieces! I'd love all your feedback.
> >
> > Note that it doesn't get very smart at this stage - for instance, it won't know what to do if you feed it a string of lojban that doesn't have any spaces in. The only clever bit is that it's able to break apart cmavo clusters if they don't have any spaces.
> >
> > Regards,
> > Johan
> >
> > --
> > Johan Pretorius
> > Cell: 0829268327
> > [2][5]pretoriusjf@gmail.com
> >
> > --
> > You received this message because you are subscribed to the Google Groups "Lojban Beginners" group.
> > To post to this group, send email to [6]lojban-beginners@googlegroups.com.
> > To unsubscribe from this group, send email to [7]lojban-beginners+unsubscribe@googlegroups.com.
> > For more options, visit this group at [8]http://groups.google.com/group/lojban-beginners?hl=en.
> >
> > References
> >
> > Visible links
> > 1. [9]http://sourceforge.net/projects/vlastezba/files/vlastezba.jar/download
> > 2. mailto:[10]pretoriusjf@gmail.com
>
> --
> .i ma'a lo bradi ku penmi gi'e du
> --
> You received this message because you are subscribed to the Google Groups "Lojban Beginners" group.
> To post to this group, send email to [11]lojban-beginners@googlegroups.com.
> To unsubscribe from this group, send email to [12]lojban-beginners+unsubscribe@googlegroups.com.
> For more options, visit this group at [13]http://groups.google.com/group/lojban-beginners?hl=en.
>
> --
> Johan Pretorius
> Cell: 0829268327
> [14]pretoriusjf@gmail.com
>
> References
>
> Visible links
> 1. http://sourceforge.net/
> 2. http://sourceforge.net/projects/vlastezba/
> 3. mailto:alyn.post@lodockikumazvati.org
> 4. http://sourceforge.net/projects/vlastezba/files/vlastezba.jar/download
> 5. mailto:pretoriusjf@gmail.com
> 6. mailto:lojban-beginners@googlegroups.com
> 7. mailto:lojban-beginners%2Bunsubscribe@googlegroups.com
> 8. http://groups.google.com/group/lojban-beginners?hl=en
> 9. http://sourceforge.net/projects/vlastezba/files/vlastezba.jar/download
> 10. mailto:pretoriusjf@gmail.com
> 11. mailto:lojban-beginners@googlegroups.com
> 12. mailto:lojban-beginners%2Bunsubscribe@googlegroups.com
> 13. http://groups.google.com/group/lojban-beginners?hl=en
> 14. mailto:pretoriusjf@gmail.com

--
.i ma'a lo bradi ku penmi gi'e du
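
For the comparison discussed in this thread, here is a minimal sketch in Java of how the outputs might be lined up. It assumes vlastezba's stdout is one valsi per line, that the jbogenturfa'i --rafske result quotes each valsi in double quotes as in the "coirodo" example, and that the csv form has a klesi,valsi header row. The class ValsiCompare and its method names are hypothetical; they are not part of either tool. The word lists are compared with equals(), not ==, which is the mistake mentioned above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of vlastezba or jbogenturfa'i.
class ValsiCompare {

    // Assumed vlastezba output: one valsi per line on stdout.
    static List<String> fromLines(String output) {
        List<String> words = new ArrayList<>();
        for (String line : output.split("\\r?\\n")) {
            String word = line.trim();
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    // Assumed --rafske output: collect every double-quoted token, in order.
    static List<String> fromRafske(String sexpr) {
        List<String> words = new ArrayList<>();
        Matcher m = Pattern.compile("\"([^\"]*)\"").matcher(sexpr);
        while (m.find()) {
            words.add(m.group(1));
        }
        return words;
    }

    // Assumed csv output: a klesi,valsi header row, then one row per valsi.
    static List<String> fromCsv(String csv) {
        List<String> words = new ArrayList<>();
        String[] rows = csv.split("\\r?\\n");
        for (int i = 1; i < rows.length; i++) { // row 0 is the header
            String[] cols = rows[i].split(",");
            if (cols.length == 2) {
                words.add(cols[1].trim());
            }
        }
        return words;
    }

    // List.equals compares element by element using String.equals, not ==.
    static boolean sameSegmentation(String lines, String sexpr) {
        return fromLines(lines).equals(fromRafske(sexpr));
    }

    public static void main(String[] args) {
        String vlastezba = "coi\nro\ndo\n";
        String rafske =
            "((cmavo (COI \"coi\")) (cmavo (PA \"ro\")) (cmavo (KOhA \"do\")))";
        System.out.println(sameSegmentation(vlastezba, rafske)); // prints true
    }
}
```

This only checks that the two tools split the input into the same words. Comparing the klesi as well would need the classification that vlastezba does not yet produce.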