Date: Sat, 11 Sep 2010 14:50:35 -0700
From: Robin Lee Powell
To: lojban-list@lojban.org, bpfk@lojban.org, jbovlaste@lojban.org
Subject: [lojban] Technical, Help Request: What information *should* a Lojban dictionary system have?

(*Please* redirect all followups to the main list (I'd say the jbovlaste list, but that's a lot harder to get on, so...))

Some of us have had brief chats about what a re-done jbovlaste would look like. The UI part is pretty well understood, inasmuch as web UIs are decently consistent these days and, besides, people like http://vlasisku.lojban.org/, so that provides a good starting point.

Much more interesting to me is the back-end data: What sorts of things *should* a Lojbanic dictionary store, ideally?

What got this started is the realization that Lojban isn't English, and that, in particular, the brivla definitions seem anti-Lojbanic. When I see

    x1 gets/procures/acquires/obtains/accepts x2 from source x3

that kind of looks to me like a verb; I see the big thing in the middle as being "the meaning" of "the verb". Lojban isn't like that: brivla are at least as much about the *places* as about the central meaning-concept.

This led me to wonder what a definition format that really focused on the places would look like; I don't really have an answer yet, but this in turn led to a lot of other stuff.

In particular, it seemed to me that if you had the right kind of information about the places, you could generate the sort of definition I pasted above automatically from that.

Then we had the smart.fm thing, which made it obvious that not all definitions suit all situations; it was very important there to pare the definitions down to bare essentials. It was also a giant pain. So I got to thinking about what sort of data we'd have to have to generate different levels of detail in the definitions.

As part of that, I ended up extracting some data from jbofihe, some of the data it uses to generate English glosses, like this:

    [([klama1 (go-er(s)):] mi /I, me/) /[is, does]/ <<klama /go-ing/>> ([klama2 (destination(s)):] le /the/ zarci /trading place(s)/)]

Which is kind of ugly, but if you strip out anything that's not between /.../, you get:

    I, me [is, does] go-ing the trading place(s)

which is really rather good. Good enough that one of my girlfriends, who has never studied a word of Lojban, reads my blog posts that way.

So this left me thinking that I want dictionary software which could, given the right data, serve *all* of these purposes: formal dictionary definitions, casual definitions, and glossing (which implies very detailed information about the individual places).

I don't know exactly what this looks like, but I *think* we can get all that by just talking about the places themselves. The resulting formal definition might look a bit different; I'm not sure yet, which is why I'm posting this: I want help coming up with something awesome.
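
For anyone who wants to try the "strip out anything that's not between /.../" step, here is a minimal Python sketch. The function name `extract_gloss` is made up for illustration, and the sample string is an assumption about what the annotated output for "mi klama le zarci" looks like (in particular the `<<klama /go-ing/>>` selbri segment); it is not taken from jbofihe's documentation.

```python
import re

# Assumed sample of annotated output for "mi klama le zarci".
# The <<klama /go-ing/>> segment is a guess at how the selbri is printed.
SAMPLE = (
    "[([klama1 (go-er(s)):] mi /I, me/) /[is, does]/ <<klama /go-ing/>> "
    "([klama2 (destination(s)):] le /the/ zarci /trading place(s)/)]"
)


def extract_gloss(annotated: str) -> str:
    """Keep only the text found between pairs of slashes, joined by spaces."""
    # Each match is the content of one /.../ span; everything else is dropped.
    return " ".join(re.findall(r"/([^/]*)/", annotated))


if __name__ == "__main__":
    print(extract_gloss(SAMPLE))
```

This assumes gloss text never contains a literal slash; a real tool would probably want a structured output format rather than regex scraping.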

A reasonable starting point for discussion is what jbovlaste uses to generate its glosses, I think:

    # x1 gets/procures/acquires/obtains/accepts x2 from source x3 [previous possessor not implied]
    cpacu1:A;acquire
    cpacu2:P;acquired
    cpacu3:D;source* of acquisition
    cpacu3t:source

And here's what the letters mean:

    Letter │ Type      ││ Noun           │ Verb    │ Qualifier │ Tag
    ───────┼───────────┼┼────────────────┼─────────┼───────────┼───────────────
    A      │ Act       ││ X-er(s)        │ X-ing   │ X-ing     │ X-er(s)
    D      │ Discrete  ││ X(s)           │ being X │ X         │ X
    S      │ Substance ││ X              │ being X │ X         │ X
    P      │ Property  ││ X thing(s)     │ being X │ X         │ X thing(s)
    R      │ Rev. prop ││ thing(s) X     │ being X │ X         │ thing(s) X
    I      │ Idiomatic ││ thing(s) X-ing │ X-ing   │ X-ing     │ thing(s) X-ing
    E      │ Event     ││ X(s)           │ being X │ X         │ X

That actual format is .. not great :), but the information is fantastic.
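
To make the letter table concrete, here is a rough Python sketch that parses entries like `cpacu1:A;acquire` and fills in the Noun/Verb/Qualifier/Tag templates. All names (`Place`, `parse_places`, `render`) are hypothetical. Two things are assumptions rather than documented behaviour: that `*` in a keyword marks where the template's suffix (such as `(s)`) attaches, and that untyped entries like `cpacu3t:source` can be skipped for now.

```python
import re
from typing import List, NamedTuple

# Templates copied from the letter table: (noun, verb, qualifier, tag).
FORMS = {
    "A": ("X-er(s)", "X-ing", "X-ing", "X-er(s)"),
    "D": ("X(s)", "being X", "X", "X"),
    "S": ("X", "being X", "X", "X"),
    "P": ("X thing(s)", "being X", "X", "X thing(s)"),
    "R": ("thing(s) X", "being X", "X", "thing(s) X"),
    "I": ("thing(s) X-ing", "X-ing", "X-ing", "thing(s) X-ing"),
    "E": ("X(s)", "being X", "X", "X"),
}
COLUMNS = ("noun", "verb", "qualifier", "tag")

SAMPLE = """\
# x1 gets/procures/acquires/obtains/accepts x2 from source x3
cpacu1:A;acquire
cpacu2:P;acquired
cpacu3:D;source* of acquisition
cpacu3t:source
"""


class Place(NamedTuple):
    word: str      # the brivla, e.g. "cpacu"
    index: int     # place number, e.g. 1 for x1
    letter: str    # type letter from the table
    keyword: str   # English keyword, possibly containing "*"


ENTRY = re.compile(r"^([a-z']+)(\d+):([A-Z]);(.+)$")


def parse_places(text: str) -> List[Place]:
    """Parse typed place entries; comments and untyped entries are skipped."""
    places = []
    for raw in text.splitlines():
        m = ENTRY.match(raw.strip())
        if m:
            places.append(Place(m.group(1), int(m.group(2)), m.group(3), m.group(4)))
    return places


def render(place: Place, column: str) -> str:
    """Fill the template for the given column with the place's keyword."""
    template = FORMS[place.letter][COLUMNS.index(column)]
    prefix, suffix = template.split("X", 1)
    # Assumption: "*" marks where the template suffix is attached.
    head, _, tail = place.keyword.partition("*")
    return prefix + head + suffix + tail


if __name__ == "__main__":
    for p in parse_places(SAMPLE):
        print(f"{p.word}{p.index}: {render(p, 'noun')}")
```

Keeping the templates in a plain dictionary means new place types or output columns could be added as data rather than code, which seems to be the direction this discussion is pointing.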

How can we expand that so that we could, in theory, have enough information to serve all masters? What would the resulting dictionary definitions look like?

-Robin

--
http://singinst.org/ :  Our last, best hope for a fantastic future.
Lojban (http://www.lojban.org/): The language in which "this parrot is dead" is "ti poi spitaki cu morsi", but "this sentence is false" is "na nei".
My personal page: http://www.digitalkingdom.org/rlp/