Date: Wed, 7 Mar 2012 10:32:58 -0800 (PST)
Message-ID:
 <24b50624-5057-46e1-90c1-3b0ba4e4f9e5@gr6g2000vbb.googlegroups.com>
Subject: [lojban] Re: How to export tatoeba in simple format
From: ianek
To: lojban

I've just found out that links.csv is not complete, i.e. it doesn't
cover all the pairs. For example, we have a Lojban sentence

"lo purci ka'e te djuno gi'e na ka'e se galfi .i lo balvi ka'e se
galfi gi'e na ka'e te djuno"

and a Polish sentence

"Przeszłość może być tylko poznana, nie zmieniona. Przyszłość może być
tylko zmieniona, nie poznana."

and they're not linked to each other, but both are linked to "The
past can only be known, not changed. The future can only be changed,
not known.".

I wonder if there's a rule that such sentences always have a "common
relative"; it would certainly make things easier. But I think that
using a database (maybe sqlite3) would now be necessary.

mu'o mi'e ianek

On 7 Mar, 15:51, ianek wrote:
> What platform? Is Linux ok?
>
> On 7 Mar, 11:44, gleki wrote:
> > I'm interested. And actually in periodically doing it myself. Not by
> > request.
> > Because the database is live and is being updated by us.
> >
> > Of course I know about those three files.
> >
> > For now, I'd prefer such export for several directions at once (a
> > multilingual spreadsheet).
> > I want all sentences for which we have lojban translations.
> > i.e.
> > first column    lojban
> > 2 column   english
> > then i need
> > japanese
> > chinese
> > russian
> > arabic
> > spanish
> > polish
> > french
> > german
> >
> > I'll repeat once again. An automated script for doing so would be awesome.
> >
> > On Wednesday, March 7, 2012 2:47:17 AM UTC+4, ianek wrote:
> > > I've created the list for you, but it was an ugly hack in bash. A
> > > better way would be to create a database and import sentences.csv and
> > > links.csv to it, and then write a very simple program instead of
> > > hacking around with grep etc. But it would be more work of course. And
> > > maybe not faster, considering that import would take time.
> > >
> > > Here you go: http://dl.dropbox.com/u/17805197/jbo-eng.csv
> > > It's a tab-separated list, any spreadsheet program should read it.
> > >
> > > As a by-product, I am able to produce such a list for any other
> > > language available in tatoeba instantly, if anyone's interested.
> > >
> > > mu'o mi'e ianek
> > >
> > > On 6 Mar, 22:17, ianek wrote:
> > > > http://tatoeba.org/pol/download_tatoeba_example_sentences
> > > > http://tatoe...
> > > >
> > > > There are actually three columns: id, language, sentence, but with
> > > > some database-fu or script-fu or maybe even spreadsheet-fu you can get
> > > > what you want. Or maybe I'll hack it together in a while.
> > > >
> > > > mu'o mi'e ianek
> > > >
> > > > On 6 Mar, 15:19, gleki wrote:
> > > > > I wanna export tatoeba database into a simple spreadsheet with two
> > > > > columns.
> > > > > One for English and another one for Lojban
> > > > >
> > > > > Does anyone know how to do that ?
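
For what it's worth, here is a minimal sketch of the database approach mentioned at the top of this message, using Python's standard sqlite3 module. It is only an illustration under assumptions, not a tested export tool: the two-column layout of links.csv (sentence id, translation id, stored in both directions), the language codes "jbo", "pol" and "eng", the sentence ids, the "coi" / "Cześć" sample pair and the function names build_db and translation_pairs are all hypothetical. The query returns pairs that are linked directly plus pairs that only share a "common relative".

```python
import sqlite3

# Sample rows in the (id, language, sentence) layout of sentences.csv.
# The ids and the "coi" / "Cześć" pair are made up for illustration.
SENTENCES = [
    (1, "eng", "The past can only be known, not changed. "
               "The future can only be changed, not known."),
    (2, "jbo", "lo purci ka'e te djuno gi'e na ka'e se galfi "
               ".i lo balvi ka'e se galfi gi'e na ka'e te djuno"),
    (3, "pol", "Przeszłość może być tylko poznana, nie zmieniona. "
               "Przyszłość może być tylko zmieniona, nie poznana."),
    (4, "jbo", "coi"),
    (5, "pol", "Cześć"),
]

# ASSUMED layout of links.csv: (sentence_id, translation_id), both directions.
# Sentences 2 and 3 are not linked to each other, only to sentence 1.
LINKS = [(1, 2), (2, 1), (1, 3), (3, 1), (4, 5), (5, 4)]


def build_db(sentences, links):
    """Load both tables into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sentences (id INTEGER PRIMARY KEY, lang TEXT, text TEXT)"
    )
    conn.execute(
        "CREATE TABLE links (sentence_id INTEGER, translation_id INTEGER)"
    )
    conn.executemany("INSERT INTO sentences VALUES (?, ?, ?)", sentences)
    conn.executemany("INSERT INTO links VALUES (?, ?)", links)
    return conn


def translation_pairs(conn, src, dst):
    """Return (src text, dst text) pairs linked directly or via one
    intermediate sentence (the "common relative")."""
    query = """
        SELECT s.text, d.text
        FROM sentences s
        JOIN links l ON l.sentence_id = s.id
        JOIN sentences d ON d.id = l.translation_id
        WHERE s.lang = :src AND d.lang = :dst
        UNION
        SELECT s.text, d.text
        FROM sentences s
        JOIN links l1 ON l1.sentence_id = s.id
        JOIN links l2 ON l2.sentence_id = l1.translation_id
        JOIN sentences d ON d.id = l2.translation_id
        WHERE s.lang = :src AND d.lang = :dst
        ORDER BY 1, 2
    """
    return conn.execute(query, {"src": src, "dst": dst}).fetchall()


if __name__ == "__main__":
    conn = build_db(SENTENCES, LINKS)
    # Tab-separated output, one pair per line.
    for jbo, pol in translation_pairs(conn, "jbo", "pol"):
        print(f"{jbo}\t{pol}")
```

With the real files, the two sample lists would be replaced by rows read from sentences.csv and links.csv (for example with the csv module), and the import time mentioned earlier in the thread would still apply. Whether one intermediate sentence is always enough is exactly the open question about the "common relative".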