RE: [lojban] brevity metrics
I realize this was only a 2 minute jobbie, but remember that you
should compare translations of the text into lang X with translations
into lang Y. Comparing with the original is not a fair test.
--And.
> -----Original Message-----
> From: Robin Lee Powell [mailto:rlpowell@digitalkingdom.org]
> Sent: 14 April 2002 05:35
> To: lojban
> Subject: Re: [lojban] brevity metrics
>
>
> On Thu, Apr 11, 2002 at 02:31:45PM +0100, And Rosta wrote:
> > Robin Turner:
> > >A lot of extra words are itty-bitty cmavo which don't add much to
> > >the real length (conversely, translating English into Turkish results
> > >in fewer words, but some of them can be very long!). Another point
> > >is that Lojban _can_ make distinctions explicit, and we tend to make
> > >it do so because we can, but it doesn't need to do so - sometimes
> > >Lojban can be amazingly terse.
> >
> > A good way of measuring brevity is to compare translations, e.g. by
> > comparing the Lojban translation of _Alice_ with translations into
> > other languages, measuring by bytes or pages. If anyone can be
> > bothered to do this, I'm sure lots of us would be interested in the
> > results.
>
> Heh.
>
> Behold, the power of unix:
>
> rlpowell@chain> grep '^ *[.a-z]' alice-??.texinfo | wc -w
> 31064
> rlpowell@chain> grep '@c .*[a-z]' alice-??.texinfo | wc -w
> 29227
>
> That took me about 2 minutes. Woot.
>
> Except it's slightly wrong. Again:
>
> rlpowell@chain> grep '^ *[.a-z]' alice-??.texinfo | sed 's/^[^;]*://' | wc -w
> 30880
> rlpowell@chain> grep '@c .*[a-z]' alice-??.texinfo | sed 's/.*:@c//' | wc -w
> 26505
>
> The first one is Lojban; the second is English. Is Alice actually
> *finished*!?
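
The counts quoted above can be wrapped in two small shell functions so the same measurement is repeatable on other file sets. This is only a sketch under assumptions read off the grep patterns in the thread: Lojban text sits on lines that begin, after optional spaces, with a period or a lowercase letter, and the English sits in `@c` comment lines that start in column one. The function names `count_lojban` and `count_english` are made up for illustration. `grep -h` suppresses the `filename:` prefix, so no sed step is needed to strip it.

```shell
#!/bin/sh
# Sketch only: the file layout is assumed from the patterns in the thread,
# and the function names are hypothetical.

# count_lojban FILE...
# Count words on lines that start (after optional spaces) with '.' or a-z.
count_lojban() {
    # -h keeps grep from prefixing each match with "filename:"
    grep -h '^ *[.a-z]' "$@" | wc -w | tr -d ' '
}

# count_english FILE...
# Count words on @c comment lines, with the leading "@c " marker removed.
count_english() {
    grep -h '^@c .*[a-z]' "$@" | sed 's/^@c //' | wc -w | tr -d ' '
}
```

Called as `count_lojban alice-??.texinfo` and `count_english alice-??.texinfo`, this should give numbers comparable to the second pair above, though not necessarily identical, since the patterns differ slightly. For the translation-to-translation comparison suggested at the top of this message, the English count would be replaced by a count over a translation of _Alice_ into some other language.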