From: "And Rosta"
To: "lojban"
Subject: RE: [lojban] brevity metrics
Date: Sun, 14 Apr 2002 19:31:33 +0100
In-Reply-To: <20020414043446.GD19164@digitalkingdom.org>

I realize this was only a 2 minute jobbie, but remember that you
should compare translations of the text into lang X with translations
into lang Y. Comparing with the original is not a fair test.

--And.

> -----Original Message-----
> From: Robin Lee Powell [mailto:rlpowell@digitalkingdom.org]
> Sent: 14 April 2002 05:35
> To: lojban
> Subject: Re: [lojban] brevity metrics
>
>
> On Thu, Apr 11, 2002 at 02:31:45PM +0100, And Rosta wrote:
> > Robin Turner:
> > >A lot of extra words are itty-bitty cmavo which don't add much to
> > >the real length (conversely, translating English into Turkish results
> > >in fewer words, but some of them can be very long!).  Another point
> > >is that Lojban _can_ make distinctions explicit, and we tend to make
> > >it do so because we can, but it doesn't need to do so - sometimes
> > >Lojban can be amazingly terse.
> >
> > A good way of measuring brevity is to compare translations, e.g. by
> > comparing the Lojban translation of _Alice_ with translations into
> > other languages, measuring by bytes or pages.  If anyone can be
> > bothered to do this, I'm sure lots of us would be interested in the
> > results.
>
> Heh.
>
> Behold, the power of unix:
>
> rlpowell@chain> grep '^ *[.a-z]' alice-??.texinfo | wc -w
> 31064
> rlpowell@chain> grep '@c .*[a-z]' alice-??.texinfo | wc -w
> 29227
>
> That took me about 2 minutes.  Woot.
>
> Except it's slightly wrong.  Again:
>
> rlpowell@chain> grep '^ *[.a-z]' alice-??.texinfo | sed 's/^[^;]*://' | wc -w
> 30880
> rlpowell@chain> grep '@c .*[a-z]' alice-??.texinfo | sed 's/.*:@c//' | wc -w
> 26505
>
> The first one in lojban. the second is English.  Is Alice actually
> *finished*!?
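
A rough sketch of the translation-versus-translation comparison, in the same shell idiom as the commands quoted above. The function name `measure` and the file names in the usage comment are assumptions for illustration only, not files from the Alice distribution. It reports whitespace-separated words and bytes for one plain-text file and makes no attempt to strip texinfo markup, so the input is assumed to be already extracted text.

```sh
# measure FILE: print "WORDS BYTES" for FILE (hypothetical helper).
measure() {
    # wc -w counts whitespace-separated words; wc -c counts bytes.
    # The arithmetic expansion drops the padding some wc versions emit.
    words=$(( $(wc -w < "$1") ))
    bytes=$(( $(wc -c < "$1") ))
    echo "$words $bytes"
}

# Usage (file names are placeholders for two translations of one text):
#   for f in alice-jbo.txt alice-tr.txt; do
#       printf '%s: %s\n' "$f" "$(measure "$f")"
#   done
```

Running it over two translations of the same text, rather than over a translation and its original, gives the like-for-like comparison suggested above. Reporting bytes alongside words matters because, as noted in the thread, a language can use fewer but much longer words.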