From: Johan Pretorius
To: lojban-beginners@googlegroups.com
Date: Wed, 20 Apr 2011 17:24:31 +0200
Subject: Re: [lojban-beginners] vlastezba: First beta version released!
In-Reply-To: <20110420151214.GC49678@alice.local>
References: <20110420142911.GB49678@alice.local> <20110420151214.GC49678@alice.local>
Reply-To: lojban-beginners@googlegroups.com
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=0023545309c8a40f5304a15b38fe

--0023545309c8a40f5304a15b38fe
Content-Type: text/plain; charset=ISO-8859-1

Okay, the licensing is fixed now.

Alan, the fact that you know that's a problem puts you in 1% of the
population :-)

Anyway, I'm not diametrically opposed to XML just for the sake of being
opposed... it's worth looking at, especially, as you say, for
interoperability.

Do you think it's necessary to include the input string? I foresee
vlastezba being used for large bodies of text; at least, that's how I
intend to use it myself: I feed it the Terry the Tiger story and let it
build me something I can print out, which means my sucky vocabulary does
not stop me reading the story, albeit slowly.

Maybe it's a good idea to make that configurable.

-Johan

On Wed, Apr 20, 2011 at 5:12 PM, .alyn.post. wrote:
> I can more or less work with what it does now, so that is
> sufficient experimentation.
>
> I routinely write code like |if(var=="foo")| when I mean
> |if(var.equals("foo"))|; my Java isn't what it could be.
>
> I'm able to parse XML for tree-structured data, which is probably
> the easiest choice for interoperability:
>
> XML:
>
>  <pruce>
>    <selruhe>coi ro do</selruhe>
>    <teryruhe>
>      <cmavo selmaho="COI">coi</cmavo>
>      <cmavo selmaho="PA">ro</cmavo>
>      <cmavo selmaho="KOhA">do</cmavo>
>    </teryruhe>
>  </pruce>
>
> If this makes you cringe, then how about:
>
> csv:
>
>  klesi,valsi
>  COI,coi
>  PA,ro
>  KOhA,do
>
> Which unfortunately doesn't include the input string; I don't see a
> simple way to do that that is normal (as in normal form).
>
> -Alan
>
> On Wed, Apr 20, 2011 at 04:51:51PM +0200, Johan Pretorius wrote:
> > Hi Alan,
> >
> > That would indeed be an interesting experiment; I'd be quite keen to
> > see the results myself.
> >
> > Right now, if you just call
> >
> > java -jar vlastezba.jar test.txt
> >
> > with some Lojban text (legal or otherwise) in test.txt, it will return
> > (on stdout) one valsi per line. So "coirodo" would result in:
> > coi
> > ro
> > do
> > (you can make it look up the definitions by passing a second
> > parameter, but it will just add junk to the output that I don't think
> > you'd want)
> >
> > Right now it doesn't check grammar at all, so you can throw any random
> > collection of words at it (I don't intend for it ever to check
> > grammar; there are tools out there that are far better at this than I
> > could ever hope to make it).
> >
> > It also won't give you a classification of valsi - it doesn't "know"
> > when it's dealing with a cmavo (or indeed what class), or a gismu, or
> > a lujvo. This I DO intend to fix.
> >
> > I want to add other output formats anyway, so if you want me to do
> > something specific to make your comparison easier, let me know. Now
> > would be a good time, as I'm going away on holiday for a week, and
> > wanted to spend at least a little bit of time on vlastezba.
> >
> > In fact, if you are comfortable with Java, feel free to make it do
> > what you need; the source code is on sourceforge.net
> > (http://sourceforge.net/projects/vlastezba/), and is GPL'ed :-)
> >
> > mu'o mi'e iu'an
> >
> > On Wed, Apr 20, 2011 at 4:29 PM, .alyn.post.
> > <alyn.post@lodockikumazvati.org> wrote:
> > > Do you have an external representation for your valsi parsing
> > > result? If I hand you the string "coirodo" is there a print
> > > form of that along the lines of ("coi" "ro" "do")?
> > >
> > > I would be interested in seeing the result from processing a large
> > > data set of words and phrases and comparing that to jbogenturfa'i.
> > > In order to do this I'd need some output format from your program
> > > that I could parse.
> > >
> > > jbogenturfa'i uses the morphology PEG grammar that xorxes developed,
> > > so it contains code which I think is similar (and should be
> > > identical in result) to what you are doing:
> > >
> > > $ echo "coirodo"|jbogenturfahi --rafske
> > > ((cmavo (COI "coi")) (cmavo (PA "ro")) (cmavo (KOhA "do")))
> > >
> > > I'd be curious to know whether they are in fact producing identical
> > > results.
> > >
> > > -Alan
> > >
> > > On Wed, Apr 20, 2011 at 11:02:28AM +0200, Johan Pretorius wrote:
> > > > Hi all
> > > >
> > > > You can download it from here:
> > > > http://sourceforge.net/projects/vlastezba/files/vlastezba.jar/download
> > > >
> > > > I have completed the cmavo cluster breakout code, and tested it as
> > > > far as I was able.
> > > >
> > > > It should be easy enough to run if you have Java 1.6 installed:
> > > > just run java -jar vlastezba.jar and it will print out usage
> > > > instructions.
> > > >
> > > > Please download it and test it to pieces! I'd love all your
> > > > feedback.
> > > >
> > > > Note that it doesn't get very smart at this stage - for instance,
> > > > it won't know what to do if you feed it a string of Lojban that
> > > > doesn't have any spaces in. The only clever bit is that it's able
> > > > to break apart cmavo clusters if they don't have any spaces.
> > > >
> > > > Regards,
> > > > Johan
>
> --
> .i ma'a lo bradi ku penmi gi'e du

--
Johan Pretorius
Cell: 0829268327
pretoriusjf@gmail.com

--
You received this message because you are subscribed to the Google Groups "Lojban Beginners" group.
To post to this group, send email to lojban-beginners@googlegroups.com.
To unsubscribe from this group, send email to lojban-beginners+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/lojban-beginners?hl=en.
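
For anyone following the thread, here is a minimal sketch of how the two output formats Alan proposes (XML and CSV) might be written out from Java, with the input string made optional as suggested above. Everything named here (ValsiOutputSketch, Cmavo, toXml, toCsv, the includeInput flag) is hypothetical and not taken from the vlastezba source, and it only covers cmavo, since that is all Alan's example shows.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: these class and method names are NOT part of vlastezba.
final class ValsiOutputSketch {

    /** One classified cmavo: its selma'o (e.g. "COI") and its text (e.g. "coi"). */
    static final class Cmavo {
        final String selmaho;
        final String text;

        Cmavo(String selmaho, String text) {
            this.selmaho = selmaho;
            this.text = text;
        }
    }

    /** Escapes the characters that are special in XML text and double-quoted attributes. */
    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Quotes a CSV field only when it contains a comma, a quote, or a line break. */
    static String csvField(String s) {
        if (s.contains(",") || s.contains("\"") || s.contains("\n") || s.contains("\r")) {
            return "\"" + s.replace("\"", "\"\"") + "\"";
        }
        return s;
    }

    /** Builds the XML form; the input string is emitted only when includeInput is true. */
    static String toXml(String input, List<Cmavo> words, boolean includeInput) {
        StringBuilder sb = new StringBuilder("<pruce>\n");
        if (includeInput) {
            sb.append("  <selruhe>").append(escapeXml(input)).append("</selruhe>\n");
        }
        sb.append("  <teryruhe>\n");
        for (Cmavo c : words) {
            sb.append("    <cmavo selmaho=\"").append(escapeXml(c.selmaho)).append("\">")
              .append(escapeXml(c.text)).append("</cmavo>\n");
        }
        sb.append("  </teryruhe>\n</pruce>\n");
        return sb.toString();
    }

    /** Builds the CSV form: a header row, then one klesi,valsi row per word. */
    static String toCsv(List<Cmavo> words) {
        StringBuilder sb = new StringBuilder("klesi,valsi\n");
        for (Cmavo c : words) {
            sb.append(csvField(c.selmaho)).append(',').append(csvField(c.text)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Cmavo> words = new ArrayList<Cmavo>();
        words.add(new Cmavo("COI", "coi"));
        words.add(new Cmavo("PA", "ro"));
        words.add(new Cmavo("KOhA", "do"));
        System.out.print(toXml("coi ro do", words, true));
        System.out.print(toCsv(words));
    }
}
```

Fields are compared and built with ordinary String methods throughout; as Alan notes, string contents in Java are compared with equals() and not ==. The CSV form still cannot carry the input string in a normalised way, which is why only the XML builder takes the includeInput flag.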
--0023545309c8a40f5304a15b38fe--