Date: Wed, 7 Mar 2012 10:36:44 -0800 (PST)
Subject: [lojban] Re: How to export tatoeba in simple format
From: ianek
To: lojban

http://dl.dropbox.com/u/17805197/jbo-rus.csv

But it's probably not complete, for the reason I mentioned.

On 7 Mar, 19:32, ianek wrote:
> I've just found out that links.csv is not complete, i.e. it doesn't
> cover all the pairs. For example, we have a Lojban sentence "lo purci
> ka'e te djuno gi'e na ka'e se galfi .i lo balvi ka'e se galfi gi'e na
> ka'e te djuno" and a Polish sentence "Przeszłość może być tylko
> poznana, nie zmieniona. Przyszłość może być tylko zmieniona, nie
> poznana." and they're not linked to each other, but they both are
> linked to "The past can only be known, not changed. The future can
> only be changed, not known.". I wonder if there's a rule that such
> sentences always have a "common relative"; it would certainly make
> things easier. But I think that now using a database (maybe sqlite3)
> would be necessary.
>
> mu'o mi'e ianek
>
> On 7 Mar, 15:51, ianek wrote:
> > What platform? Is Linux ok?
> >
> > On 7 Mar, 11:44, gleki wrote:
> > > I'm interested. And actually in periodically doing it myself. Not by
> > > request.
> > > Because the database is live and is being updated by us.
> > >
> > > Of course I know about those three files.
> > >
> > > For now, I'd prefer such an export for several directions at once (a
> > > multilingual spreadsheet).
> > > I want all sentences for which we have Lojban translations.
> > > i.e.
> > > first column    lojban
> > > 2 column   english
> > > then i need
> > > japanese
> > > chinese
> > > russian
> > > arabic
> > > spanish
> > > polish
> > > french
> > > german
> > >
> > > I'll repeat once again. An automated script for doing so would be awesome.
> > >
> > > On Wednesday, March 7, 2012 2:47:17 AM UTC+4, ianek wrote:
> > > > I've created the list for you, but it was an ugly hack in bash. A
> > > > better way would be to create a database and import sentences.csv and
> > > > links.csv to it, and then write a very simple program instead of
> > > > hacking around with grep etc. But it would be more work of course. And
> > > > maybe not faster, considering that import would take time.
> > > >
> > > > Here you go: http://dl.dropbox.com/u/17805197/jbo-eng.csv
> > > > It's a tab-separated list, any spreadsheet program should read it.
> > > >
> > > > As a by-product, I am able to produce such a list for any other
> > > > language available in tatoeba instantly, if anyone's interested.
> > > >
> > > > mu'o mi'e ianek
> > > >
> > > > On 6 Mar, 22:17, ianek wrote:
> > > > > http://tatoeba.org/pol/download_tatoeba_example_sentences
> > > > > http://tatoe...
> > > > >
> > > > > There are actually three columns: id, language, sentence, but with
> > > > > some database-fu or script-fu or maybe even spreadsheet-fu you can get
> > > > > what you want. Or maybe I'll hack it together in a while.
> > > > >
> > > > > mu'o mi'e ianek
> > > > >
> > > > > On 6 Mar, 15:19, gleki wrote:
> > > > > > I wanna export tatoeba database into a simple spreadsheet with two
> > > > > > columns.
> > > > > > One for English and another one for Lojban
> > > > > >
> > > > > > Does anyone know how to do that ?
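
A minimal sketch of the database approach discussed in this thread, in Python with the standard sqlite3 module. It assumes sentences.csv rows are (id, language, sentence), as described above, and that links.csv rows are pairs of sentence ids (an assumption; the thread does not spell out that format). The table names, the function names build_db and pairs, the three-letter language codes, and the sample ids are all made up for illustration. Pairs are collected from direct links and also through one "common relative", which covers the Lojban/Polish case mentioned above.

```python
import sqlite3


def build_db(sentences, links):
    """Load (id, lang, text) rows and (id_a, id_b) link rows into an
    in-memory SQLite database. The schema is hypothetical."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE sentences (id INTEGER PRIMARY KEY, lang TEXT, text TEXT)")
    db.execute("CREATE TABLE links (a INTEGER, b INTEGER)")
    db.executemany("INSERT INTO sentences VALUES (?, ?, ?)", sentences)
    # Store every link in both directions so queries need not care
    # which side of the pair a sentence is on.
    both = set()
    for a, b in links:
        both.add((a, b))
        both.add((b, a))
    db.executemany("INSERT INTO links VALUES (?, ?)", sorted(both))
    db.execute("CREATE INDEX links_a ON links (a)")
    return db


def pairs(db, src, dst):
    """Return sorted (src_text, dst_text) pairs that are linked directly
    or through exactly one intermediate sentence (a "common relative")."""
    query = """
        SELECT s.text, d.text
        FROM sentences s
        JOIN links l ON l.a = s.id
        JOIN sentences d ON d.id = l.b
        WHERE s.lang = :src AND d.lang = :dst
        UNION
        SELECT s.text, d.text
        FROM sentences s
        JOIN links l1 ON l1.a = s.id
        JOIN links l2 ON l2.a = l1.b
        JOIN sentences d ON d.id = l2.b
        WHERE s.lang = :src AND d.lang = :dst
        ORDER BY 1, 2
    """
    return db.execute(query, {"src": src, "dst": dst}).fetchall()


if __name__ == "__main__":
    # Sample rows: first sentences of the example from this thread,
    # with made-up ids. Lojban and Polish are linked only via English.
    sample_sentences = [
        (1, "eng", "The past can only be known, not changed."),
        (2, "jbo", "lo purci ka'e te djuno gi'e na ka'e se galfi"),
        (3, "pol", "Przeszłość może być tylko poznana, nie zmieniona."),
    ]
    sample_links = [(1, 2), (1, 3)]
    db = build_db(sample_sentences, sample_links)
    for jbo, pol in pairs(db, "jbo", "pol"):
        print(jbo + "\t" + pol)
```

Each returned pair can be written out as one tab-separated line, as in the demo loop. The one-hop rule is only a guess at how unlinked pairs relate; a longer chain of links would need a recursive query.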
--
You received this message because you are subscribed to the Google Groups "lojban" group.
To post to this group, send email to lojban@googlegroups.com.
To unsubscribe from this group, send email to lojban+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/lojban?hl=en.