From arosta@uclan.ac.uk Sun Apr 14 11:31:14 2002
Return-Path: <a.rosta@ntlworld.com>
X-Sender: a.rosta@ntlworld.com
X-Apparently-To: lojban@yahoogroups.com
Received: (EGP: mail-8_0_3_1); 14 Apr 2002 18:31:14 -0000
Received: (qmail 33251 invoked from network); 14 Apr 2002 18:31:14 -0000
Received: from unknown (66.218.66.217)
  by m6.grp.scd.yahoo.com with QMQP; 14 Apr 2002 18:31:14 -0000
Received: from unknown (HELO mta02-svc.ntlworld.com) (62.253.162.42)
  by mta2.grp.scd.yahoo.com with SMTP; 14 Apr 2002 18:31:14 -0000
Received: from oemcomputer ([62.253.88.94]) by mta02-svc.ntlworld.com
  (InterMail vM.4.01.03.27 201-229-121-127-20010626) with SMTP
  id <20020414183111.FLTU286.mta02-svc.ntlworld.com@oemcomputer>
  for <lojban@yahoogroups.com>; Sun, 14 Apr 2002 19:31:11 +0100
To: "lojban" <lojban@yahoogroups.com>
Subject: RE: [lojban] brevity metrics
Date: Sun, 14 Apr 2002 19:31:33 +0100
Message-ID: <LPBBJKMNINKHACNDIIGMAEACFNAA.a.rosta@ntlworld.com>
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
Importance: Normal
In-Reply-To: <20020414043446.GD19164@digitalkingdom.org>
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2615.200
X-eGroups-From: "And Rosta" <a.rosta@ntlworld.com>
From: "And Rosta" <arosta@uclan.ac.uk>
X-Yahoo-Group-Post: member; u=810630
X-Yahoo-Profile: andjamin

I realize this was only a 2-minute jobbie, but remember that you
should compare translations of the text into lang X with translations
into lang Y. Comparing with the original is not a fair test.
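
A rough sketch of what a like-for-like comparison might look like, assuming
each translation has already been extracted to its own plain-text file. The
function names measure and compare are made up for illustration and the file
names are placeholders, not files that are known to exist.

```shell
#!/bin/sh
# Sketch: compare translations of the same text by word and byte count.
# Assumes each translation is already a plain-text file of its own.

# Print "<words> <bytes>" for one file, with wc's padding stripped.
measure() {
    words=$(wc -w < "$1" | tr -d '[:space:]')
    bytes=$(wc -c < "$1" | tr -d '[:space:]')
    printf '%s %s\n' "$words" "$bytes"
}

# Print one line per file: name, word count, byte count.
compare() {
    for f in "$@"; do
        printf '%s %s\n' "$f" "$(measure "$f")"
    done
}
```

Something like "compare alice-lojban.txt alice-french.txt" (hypothetical file
names) would then put two translations side by side. Bytes are reported next to
words because a word count alone favours languages with few, long words.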

--And.

> -----Original Message-----
> From: Robin Lee Powell [mailto:rlpowell@digitalkingdom.org]
> Sent: 14 April 2002 05:35
> To: lojban
> Subject: Re: [lojban] brevity metrics
>
>
> On Thu, Apr 11, 2002 at 02:31:45PM +0100, And Rosta wrote:
> > Robin Turner:
> > >A lot of extra words are itty-bitty cmavo which don't add much to
> > >the real length (conversely, translating English into Turkish results
> > >in fewer words, but some of them can be very long!). Another point
> > >is that Lojban _can_ make distinctions explicit, and we tend to make
> > >it do so because we can, but it doesn't need to do so - sometimes
> > >Lojban can be amazingly terse.
> >
> > A good way of measuring brevity is to compare translations, e.g. by
> > comparing the Lojban translation of _Alice_ with translations into
> > other languages, measuring by bytes or pages. If anyone can be
> > bothered to do this, I'm sure lots of us would be interested in the
> > results.
>
> Heh.
>
> Behold, the power of unix:
>
> rlpowell@chain> grep '^ *[.a-z]' alice-??.texinfo | wc -w
> 31064
> rlpowell@chain> grep '@c .*[a-z]' alice-??.texinfo | wc -w
> 29227
>
> That took me about 2 minutes. Woot.
>
> Except it's slightly wrong. Again:
>
> rlpowell@chain> grep '^ *[.a-z]' alice-??.texinfo | sed 's/^[^;]*://' | wc -w
> 30880
> rlpowell@chain> grep '@c .*[a-z]' alice-??.texinfo | sed 's/.*:@c//' | wc -w
> 26505
>
> The first one is Lojban; the second is English. Is Alice actually
> *finished*!?


