Re: [lojban] brevity metrics
On Thu, Apr 11, 2002 at 02:31:45PM +0100, And Rosta wrote:
> Robin Turner:
> >A lot of extra words are itty-bitty cmavo which don't add much to
> >the real length (conversely, translating English into Turkish results
> >in fewer words, but some of them can be very long!). Another point
> >is that Lojban _can_ make distinctions explicit, and we tend to make
> >it do so because we can, but it doesn't need to do so - sometimes
> >Lojban can be amazingly terse.
>
> A good way of measuring brevity is to compare translations, e.g. by
> comparing the Lojban translation of _Alice_ with translations into
> other languages, measuring by bytes or pages. If anyone can be
> bothered to do this, I'm sure lots of us would be interested in the
> results.
Heh.
Behold, the power of unix:
rlpowell@chain> grep '^ *[.a-z]' alice-??.texinfo | wc -w
31064
rlpowell@chain> grep '@c .*[a-z]' alice-??.texinfo | wc -w
29227
That took me about 2 minutes. Woot.
Except it's slightly wrong. Again:
rlpowell@chain> grep '^ *[.a-z]' alice-??.texinfo | sed 's/^[^;]*://' | wc -w
30880
rlpowell@chain> grep '@c .*[a-z]' alice-??.texinfo | sed 's/.*:@c//' | wc -w
26505
The first one is Lojban; the second is English. Is Alice actually
*finished*!?
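
For anyone who wants to repeat the count, here is a minimal sketch of the same idea wrapped in two shell functions. It assumes, as the commands above do, that the Lojban text sits on lines starting with a lowercase letter or a period and that the English original is kept in Texinfo "@c" comment lines. The function names lojban_lines and english_lines are made up for illustration. When grep is given several files it prefixes each line with the file name, which is what the sed stage above strips; the -h option (available in GNU and BSD grep) suppresses the prefix instead.

```shell
# Print the Lojban text lines: optional leading spaces, then a
# lowercase letter or a period. Texinfo commands start with "@",
# so they are skipped. -h drops the "filename:" prefix.
lojban_lines() {
    grep -h '^ *[.a-z]' "$@"
}

# Print the English text, assumed to live in "@c ..." comment lines.
# The sed stage removes everything up to and including "@c ".
english_lines() {
    grep -h '@c .*[a-z]' "$@" | sed 's/.*@c //'
}

# Usage, with the file names from the message above:
#   lojban_lines  alice-??.texinfo | wc -w    # words
#   english_lines alice-??.texinfo | wc -w
#   lojban_lines  alice-??.texinfo | wc -c    # bytes
#   english_lines alice-??.texinfo | wc -c
```

Swapping wc -w for wc -c gives the byte counts And asked about. Taking the word counts above at face value, 30880 / 26505 comes to roughly 1.17, so the Lojban has about 17% more words than the English, with the caveat that the translation may not be complete.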
-Robin
--
http://www.digitalkingdom.org/~rlpowell/ BTW, I'm male, honest.
le datni cu djica le nu zifre .iku'i .oi le so'e datni cu to'e te pilno
je xlali -- RLP http://www.lojban.org/