Received: from mail-fx0-f61.google.com ([209.85.161.61])
    by chain.digitalkingdom.org with esmtp (Exim 4.72)
    (envelope-from )
    id 1QCYlC-0003LQ-4B; Wed, 20 Apr 2011 07:52:13 -0700
Received: by fxm14 with SMTP id 14sf831268fxm.16
    for ; Wed, 20 Apr 2011 07:51:59 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
    d=googlegroups.com; s=beta;
    h=domainkey-signature:x-beenthere:received-spf:mime-version
     :in-reply-to:references:date:message-id:subject:from:to
     :x-original-sender:x-original-authentication-results:reply-to
     :precedence:mailing-list:list-id:x-google-group-id:list-post
     :list-help:list-archive:sender:list-subscribe:list-unsubscribe
     :content-type;
    bh=/v6Jdx/DuBi2RRvPVLRv43r8tvJwIxdZxCgF0tblPnw=;
    b=Ex69a+k+hOlHL9u4Yux7zdZEHhpYB3lva3g3weByMypsZF6KViZgIgzlGoryjXBfwv
     xFpeb3Eeo6u4r+mNNqeACkihb2qiSavfK6q7+93ez4KIOLIMrX8I8pQ1RRKHhhvQQfbB
     RgW+0QjItrOqXZyAz4uAzIm1SLJ7P+ZRcJZ7w=
DomainKey-Signature: a=rsa-sha1; c=nofws;
    d=googlegroups.com; s=beta;
    h=x-beenthere:received-spf:mime-version:in-reply-to:references:date
     :message-id:subject:from:to:x-original-sender
     :x-original-authentication-results:reply-to:precedence:mailing-list
     :list-id:x-google-group-id:list-post:list-help:list-archive:sender
     :list-subscribe:list-unsubscribe:content-type;
    b=dyt9kHfCnHMMtZtWklCKd9HExeUFl7URryaqQ0nIgfPzqtkw29rfkdhY68bxXiNBA9
     oftbDR3coPozWxvsIFPz9hqXh7R5Bm97Cw6Kj53uFznFVHIqLMQA/BYrhGmqHo94MVCA
     kZG6UY717wVPyiKZJItK6tY+kkSCyAYKeHgOs=
Received: by 10.223.127.194 with SMTP id h2mr532132fas.9.1303311114876;
    Wed, 20 Apr 2011 07:51:54 -0700 (PDT)
X-BeenThere: lojban-beginners@googlegroups.com
Received: by 10.223.51.195 with SMTP id e3ls575594fag.1.gmail;
    Wed, 20 Apr 2011 07:51:53 -0700 (PDT)
Received: by 10.223.64.138 with SMTP id e10mr111096fai.8.1303311113753;
    Wed, 20 Apr 2011 07:51:53 -0700 (PDT)
Received: by 10.223.64.138 with SMTP id e10mr111095fai.8.1303311113716;
    Wed, 20 Apr 2011 07:51:53 -0700 (PDT)
Received: from mail-fx0-f52.google.com (mail-fx0-f52.google.com [209.85.161.52])
    by gmr-mx.google.com with ESMTPS id 20si103466fav.0.2011.04.20.07.51.53
    (version=TLSv1/SSLv3 cipher=OTHER);
    Wed, 20 Apr 2011 07:51:53 -0700 (PDT)
Received-SPF: pass (google.com: domain of pretoriusjf@gmail.com designates 209.85.161.52 as permitted sender) client-ip=209.85.161.52;
Received: by fxm6 with SMTP id 6so450729fxm.25
    for ; Wed, 20 Apr 2011 07:51:53 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.223.85.195 with SMTP id p3mr2171989fal.0.1303311111503;
    Wed, 20 Apr 2011 07:51:51 -0700 (PDT)
Received: by 10.223.87.1 with HTTP; Wed, 20 Apr 2011 07:51:51 -0700 (PDT)
In-Reply-To: <20110420142911.GB49678@alice.local>
References: <20110420142911.GB49678@alice.local>
Date: Wed, 20 Apr 2011 16:51:51 +0200
Message-ID:
Subject: Re: [lojban-beginners] vlastezba: First beta version released!
From: Johan Pretorius
To: lojban-beginners@googlegroups.com
X-Original-Sender: pretoriusjf@gmail.com
X-Original-Authentication-Results: gmr-mx.google.com; spf=pass
    (google.com: domain of pretoriusjf@gmail.com designates 209.85.161.52 as permitted sender)
    smtp.mail=pretoriusjf@gmail.com; dkim=pass (test mode) header.i=@gmail.com
Reply-To: lojban-beginners@googlegroups.com
Precedence: list
Mailing-list: list lojban-beginners@googlegroups.com;
    contact lojban-beginners+owners@googlegroups.com
List-ID:
X-Google-Group-Id: 300742228892
List-Post: ,
List-Help: ,
List-Archive:
Sender: lojban-beginners@googlegroups.com
List-Subscribe: ,
List-Unsubscribe: ,
Content-Type: multipart/alternative; boundary=20cf3054a697d5459b04a15ac3d9
Content-Length: 11639

--20cf3054a697d5459b04a15ac3d9
Content-Type: text/plain; charset=ISO-8859-1

Hi Alan,

That would indeed be an interesting experiment, I'd be quite keen to see the results myself.

Right now, if you just call
   java -jar vlastezba.jar test.txt

with some Lojban text (legal or otherwise) in test.txt, it will return (on stdout), one valsi per line.
So "coirodo" would result in:
   coi
   ro
   do
(you can make it go look up the definitions by passing a second parameter, but it will just add junk to the output that I don't think you'd want)

Right now it doesn't check grammar at all, so you can throw any random collection of words at it (I don't intend for it to ever check grammar; there are tools out there that are far better at this than I could ever hope to make it).

It also won't give you a classification of valsi - it doesn't "know" when it's dealing with a cmavo (or indeed what class), or a gismu, or a lujvo.  This I DO intend to fix.

I want to add other output formats anyway, so if you want me to do something specific to make your comparison easier, let me know.  Now would be a good time, as I'm going away on holiday for a week, and wanted to spend at least a little bit of time on vlastezba.

In fact, if you are comfortable with Java, feel free to make it do what you need; the source code is on sourceforge.net (http://sourceforge.net/projects/vlastezba/), and is GPL'ed :-)

mu'o mi'e iu'an

On Wed, Apr 20, 2011 at 4:29 PM, .alyn.post. <alyn.post@lodockikumazvati.org> wrote:
> Do you have an external representation for your valsi parsing
> result?  If I hand you the string "coirodo" is there a print
> form of that along the lines of ("coi" "ro" "do")?
>
> I would be interested in seeing the result from processing a large
> data set of words and phrases and comparing that to jbogenturfa'i.
> In order to do this I'd need some output format from your program
> that I could parse.
>
> jbogenturfa'i uses the morphology PEG grammar that xorxes developed,
> so it contains code which I think is similar (and should be
> identical in result) to what you are doing:
>
>  $ echo "coirodo"|jbogenturfahi --rafske
>  ((cmavo (COI "coi")) (cmavo (PA "ro")) (cmavo (KOhA "do")))
>
> I'd be curious to know whether they are in fact producing identical
> results.
>
> -Alan
>
> On Wed, Apr 20, 2011 at 11:02:28AM +0200, Johan Pretorius wrote:
> >    Hi all
> >
> >    You can download it from here:
> >    [1] http://sourceforge.net/projects/vlastezba/files/vlastezba.jar/download
> >
> >    I have completed the cmavo cluster breakout code, and tested it as far as
> >    I was able.
> >
> >    It should be easy enough to run if you have Java 1.6 installed, just go
> >    java -jar vlastezba.jar and it will print out usage instructions.
> >
> >    Please download it and test it to pieces! I'd love all your feedback.
> >
> >    Note that it doesn't get very smart at this stage - for instance, it won't
> >    know what to do if you feed it a string of lojban that doesn't have any
> >    spaces in. The only clever bit is that it's able to break apart cmavo
> >    clusters if they don't have any spaces.
> >
> >    Regards,
> >    Johan
> >
> >    --
> >    Johan Pretorius
> >    Cell: 0829268327
> >    [2]pretoriusjf@gmail.com
> >
> >    --
> >    You received this message because you are subscribed to the Google Groups
> >    "Lojban Beginners" group.
> >    To post to this group, send email to lojban-beginners@googlegroups.com.
> >    To unsubscribe from this group, send email to
> >    lojban-beginners+unsubscribe@googlegroups.com.
> >    For more options, visit this group at
> >    http://groups.google.com/group/lojban-beginners?hl=en.
> >
> > References
> >
> >    Visible links
> >    1. http://sourceforge.net/projects/vlastezba/files/vlastezba.jar/download
> >    2. mailto:pretoriusjf@gmail.com
>
> --
> .i ma'a lo bradi ku penmi gi'e du
>
> --
> You received this message because you are subscribed to the Google Groups
> "Lojban Beginners" group.
> To post to this group, send email to lojban-beginners@googlegroups.com.
> To unsubscribe from this group, send email to
> lojban-beginners+unsubscribe@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/lojban-beginners?hl=en.
>
>

--
Johan Pretorius
Cell: 0829268327
pretoriusjf@gmail.com

--
You received this message because you are subscribed to the Google Groups "Lojban Beginners" group.
To post to this group, send email to lojban-beginners@googlegroups.com.
To unsubscribe from this group, send email to lojban-beginners+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/lojban-beginners?hl=en.

--20cf3054a697d5459b04a15ac3d9--
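
The comparison discussed in this thread could be scripted roughly as follows. This is only a sketch in Python, not code from vlastezba or jbogenturfa'i. The function names are made up for illustration. The first input is assumed to be the one-valsi-per-line output described in the message, and the second is assumed to look like the --rafske sample quoted from Alan, with every double-quoted string standing for one whole word. That holds for the cmavo sample in the thread; whether it holds for other word types is not confirmed here.

```python
import re


def valsi_from_lines(text):
    """Read one-valsi-per-line output into a list, skipping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def valsi_from_sexp(text):
    """Collect the double-quoted strings of a parenthesised parse, in order.

    Assumption: each quoted string is one whole word, as in the
    sample quoted in the thread.
    """
    return re.findall(r'"([^"]*)"', text)


def to_list_form(valsi):
    """Render words in the ("coi" "ro" "do") form asked about in the thread."""
    return "(" + " ".join('"%s"' % v for v in valsi) + ")"


def same_segmentation(lines_output, sexp_output):
    """True when both outputs contain the same words in the same order."""
    return valsi_from_lines(lines_output) == valsi_from_sexp(sexp_output)


if __name__ == "__main__":
    # Sample data copied from the thread, not captured from real runs.
    lines_sample = "coi\nro\ndo\n"
    sexp_sample = '((cmavo (COI "coi")) (cmavo (PA "ro")) (cmavo (KOhA "do")))'
    print(to_list_form(valsi_from_lines(lines_sample)))
    print(same_segmentation(lines_sample, sexp_sample))
```

With the sample strings from the thread this prints ("coi" "ro" "do") and then True. Comparing plain word lists keeps the check independent of the word classes (COI, PA, KOhA) that only one of the two tools currently reports.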