From: Robert LeChevalier
Date: Sat, 11 Sep 2010 20:24:13 -0400
To: Lojban List
Subject: [lojban] Re: [bpfk] Technical, Help Request: What information *should* a Lojban dictionary system have?
Message-ID: <4C8C1DAD.1020709@lojban.org>
In-Reply-To: <20100911215035.GG13937@digitalkingdom.org>

Robin Lee Powell wrote:
> Much more interesting to me is the back-end data: What sorts of
> things *should* a Lojbanic dictionary store, ideally?
>
> What got this started is the realization that Lojban isn't English,
> and that, in particular, the brivla definitions seem anti-Lojbanic.
> When I see
>
> x1 gets/procures/acquires/obtains/accepts x2 from source x3
>
> that kind of looks to me like a verb; I see the big thing in the
> middle as being "the meaning" of "the verb".
>
> Lojban isn't like that: brivla are as much or more about the
> *places* than about the central meaning-concept.
>
> This led to me wondering what a definition format that really
> focused on the places would look like; I don't really have an answer
> yet, but this in turn led to a lot of other stuff.

A lot of thought, as well as research on dictionary writing, went into the format I used, which is what appears above. The decisive factor that led to it was brevity (which is a factor in most dictionaries that see print). If people are ONLY going to look at dictionaries on-line, then brevity may be less important, but if there is ever to be a printed (or printable) dictionary with a significant number of entries, something like the above becomes necessary.

Another thing to remember: while Lojban isn't English, until we get to Lojban-only dictionaries, all dictionaries are translation dictionaries, and there are very different rules because of how such dictionaries are used. There, the key to format is the target language, the one that people will be focused on when translating. One would therefore expect a different format for the English-to-Lojban side than for the Lojban-to-English side.

I tried for this in the quasi-automated dictionary format, which used keyword manipulation to turn the above into a series of entries like:

procurer: x1 of cpacu; (followed by some form of the definition above)
acquirer: x1 of cpacu; (followed by some form of the definition above)
obtainer: x1 of cpacu; ...
accepter: x1 of cpacu; ...
acquisition: x2 of cpacu ...
source of acquisition: x3 of cpacu ...
acquisition: nu event of cpacu ...
acquisition: pu'u process of cpacu ...
get: x1+x2 of cpacu ... (meaning that at least those two places must be filled in order to translate the English)
etc.

This turns each Lojban-to-English definition into a rich multitude of English-to-Lojban definitions. It can take a lot of work, even with something like Cowan's Perl script that did the keyword manipulation, and you have to think about which keywords are likely to be useful.

lojban.org should have the files I generated doing this work somewhere, so people can see what the result looked like. It wasn't necessarily all that pretty, but it met the functional need. And the automated processing made it possible to create a dictionary in less than several lifetimes.

I won't pretend to know how to apply these insights to online dictionaries, since the only kind I ever use are those that display entries looking like regular English entries (a formatting style that developed over a couple hundred years of dictionary writing).

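For anyone who wants to experiment, here is a minimal sketch of the kind of keyword expansion described earlier in this message, where one Lojban-to-English definition is turned into many English-to-Lojban entries. It is not Cowan's Perl script and does not reproduce its behavior; the function name `expand`, its parameters, and the layout of the keyword tables are all assumptions made for illustration. The keywords themselves still have to be chosen by hand.

```python
def expand(brivla, definition, place_keywords, abstractions=(), combined=()):
    """Build English-to-Lojban entries from one Lojban-to-English definition.

    All parameter names and shapes are hypothetical:
      place_keywords: {place number: [English keywords for that place]}
      abstractions:   [(abstractor, gloss, English keyword)]
      combined:       [((place numbers that must all be filled), keyword)]
    """
    entries = []

    # One entry per hand-chosen keyword for each individual place.
    for place in sorted(place_keywords):
        for word in place_keywords[place]:
            entries.append(f"{word}: x{place} of {brivla}; {definition}")

    # Entries for abstractions of the whole bridi (event, process, ...).
    for abstractor, gloss, word in abstractions:
        entries.append(f"{word}: {abstractor} {gloss} of {brivla}; {definition}")

    # Entries for English words that need several places filled at once.
    for places, word in combined:
        joined = "+".join(f"x{p}" for p in places)
        entries.append(f"{word}: {joined} of {brivla}; {definition}")

    return entries


DEFINITION = "x1 gets/procures/acquires/obtains/accepts x2 from source x3"

entries = expand(
    "cpacu",
    DEFINITION,
    place_keywords={
        1: ["procurer", "acquirer", "obtainer", "accepter"],
        2: ["acquisition"],
        3: ["source of acquisition"],
    },
    abstractions=[
        ("nu", "event", "acquisition"),
        ("pu'u", "process", "acquisition"),
    ],
    combined=[((1, 2), "get")],
)

for entry in entries:
    print(entry)
```

Sorting the printed entries by their English headword would give the English-to-Lojban side of a dictionary; the same keyword tables could in principle be kept alongside each brivla in whatever back-end stores the definitions.
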
lojbab