Date: Sat, 11 Sep 2010 14:50:35 -0700
From: Robin Lee Powell
To: lojban-list@lojban.org, bpfk@lojban.org, jbovlaste@lojban.org
Subject: [bpfk] Technical, Help Request: What information *should* a Lojban dictionary system have?

(*Please* redirect all followups to the main list. I'd say the jbovlaste list, but that's a lot harder to get on, so...)

Some of us have had brief chats about what a re-done jbovlaste would look like. The UI part is pretty well understood, inasmuch as web UIs are decently consistent these days; besides, people like http://vlasisku.lojban.org/, so that provides a good starting point.

Much more interesting to me is the back-end data: what sorts of things *should* a Lojbanic dictionary store, ideally?

What got this started is the realization that Lojban isn't English, and that, in particular, the brivla definitions seem anti-Lojbanic. When I see

    x1 gets/procures/acquires/obtains/accepts x2 from source x3

that kind of looks to me like a verb; I see the big thing in the middle as being "the meaning" of "the verb". Lojban isn't like that: brivla are as much or more about the *places* than about the central meaning-concept.

This led me to wonder what a definition format that really focused on the places would look like; I don't really have an answer yet, but this in turn led to a lot of other stuff.

In particular, it seemed to me that if you had the right kind of information about the places, you could generate the sort of definition I pasted above automatically from that.

Then we had the smart.fm thing, which made it obvious that not all definitions suit all situations; it was very important there to pare the definitions down to bare essentials. It was also a giant pain. So I got to thinking about what sort of data we'd have to have to generate different levels of detail in the definitions.

As part of that, I ended up extracting some data from jbofihe, some of the data it uses to generate English glosses, like this:

    [([klama1 (go-er(s)):] mi /I, me/) /[is, does]/ <> ([klama2 (destination(s)):] le /the/ zarci /trading place(s)/)]

Which is kind of ugly, but if you strip out anything that's not between /.../, you get:

    I, me [is, does] go-ing the trading place(s)

which is really rather good. Good enough that one of my girlfriends, who has never studied a word of Lojban, reads my blog posts that way.

So this left me thinking that I want dictionary software which could, given the right data, serve *all* of these purposes: formal dictionary definitions, casual definitions, and glossing (which implies very detailed information about the individual places).

I don't know exactly what this looks like, but I *think* we can get all that by just talking about the places themselves. The resulting formal definition might look a bit different; I'm not sure yet, which is why I'm posting this: I want help coming up with something awesome.

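The "strip out anything that's not between /.../" step mentioned above can be sketched in a few lines of Python. This is only an illustration, not how jbofihe itself works; the function name `strip_to_gloss` is made up for this example. The sample string is copied from this message as archived, where the selbri segment survives only as `<>`, so the result here lacks the "go-ing" gloss.

```python
import re


def strip_to_gloss(annotated):
    """Keep only the text found between /.../ pairs, joined by spaces.

    Hypothetical helper: it assumes glosses never contain a '/' themselves,
    which would not hold for slash-separated definition text.
    """
    return " ".join(re.findall(r"/([^/]*)/", annotated))


# Annotated output as it appears in this message (selbri segment missing).
SAMPLE = (
    "[([klama1 (go-er(s)):] mi /I, me/) /[is, does]/ <> "
    "([klama2 (destination(s)):] le /the/ zarci /trading place(s)/)]"
)

print(strip_to_gloss(SAMPLE))
# prints: I, me [is, does] the trading place(s)
```

A regular expression is enough here because the slash-delimited spans do not nest; anything richer (per-place annotations, for instance) would need a real parser for the bracketed structure.
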
A reasonable starting point for discussion is what jbovlaste uses to generate its glosses, I think:

    # x1 gets/procures/acquires/obtains/accepts x2 from source x3 [previous possessor not implied]
    cpacu1:A;acquire
    cpacu2:P;acquired
    cpacu3:D;source* of acquisition
    cpacu3t:source

And here's what the letters mean:

    Letter │ Type        ││ Noun            │ Verb      │ Qualifier │ Tag
    ───────┼─────────────┼┼─────────────────┼───────────┼───────────┼─────────────────
    A      │ Act         ││ X-er(s)         │ X-ing     │ X-ing     │ X-er(s)
    D      │ Discrete    ││ X(s)            │ being X   │ X         │ X
    S      │ Substance   ││ X               │ being X   │ X         │ X
    P      │ Property    ││ X thing(s)      │ being X   │ X         │ X thing(s)
    R      │ Rev. prop   ││ thing(s) X      │ being X   │ X         │ thing(s) X
    I      │ Idiomatic   ││ thing(s) X-ing  │ X-ing     │ X-ing     │ thing(s) X-ing
    E      │ Event       ││ X(s)            │ being X   │ X         │ X

That actual format is .. not great :), but the information is fantastic.

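To make the letter table concrete, here is a rough Python sketch that parses entries like `cpacu1:A;acquire` and fills in the four templates for each place. It is an illustration under assumptions, not the real implementation: the names `TEMPLATES`, `parse_entry` and `render` are invented here, the `*` is assumed to mark the end of the part that gets inflected, and the `t`-suffixed line (`cpacu3t:source`) is parsed but not interpreted. Substitution is literal, so it produces "acquire-er(s)" with no attempt at English spelling rules.

```python
# Letter -> (noun, verb, qualifier, tag) templates, copied from the table above.
# The R tag is entered as "thing(s) X" to match its noun column.
TEMPLATES = {
    "A": ("X-er(s)", "X-ing", "X-ing", "X-er(s)"),
    "D": ("X(s)", "being X", "X", "X"),
    "S": ("X", "being X", "X", "X"),
    "P": ("X thing(s)", "being X", "X", "X thing(s)"),
    "R": ("thing(s) X", "being X", "X", "thing(s) X"),
    "I": ("thing(s) X-ing", "X-ing", "X-ing", "thing(s) X-ing"),
    "E": ("X(s)", "being X", "X", "X"),
}
FORMS = ("noun", "verb", "qualifier", "tag")


def parse_entry(line):
    """Split 'cpacu1:A;acquire' into ('cpacu1', 'A', 'acquire').

    Lines without a letter, such as 'cpacu3t:source', give None as the letter.
    """
    key, _, rest = line.partition(":")
    letter, sep, word = rest.partition(";")
    if not sep:
        return key, None, rest
    return key, letter, word


def render(letter, word):
    """Fill the X in each of the letter's templates with the word.

    ASSUMPTION: a '*' ends the inflected part, so 'source* of acquisition'
    inflects 'source' and then appends ' of acquisition' unchanged.
    """
    head, _, tail = word.partition("*")
    return {
        form: template.replace("X", head) + tail
        for form, template in zip(FORMS, TEMPLATES[letter])
    }


for entry in ["cpacu1:A;acquire", "cpacu2:P;acquired", "cpacu3:D;source* of acquisition"]:
    key, letter, word = parse_entry(entry)
    print(key, render(letter, word))
```

Keeping the templates as data rather than code is the point of the exercise: the same per-place records could then feed a formal definition, a pared-down one, or a gloss, depending on which forms are pulled out.
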
How can we expand that so that we could, in theory, have enough information to serve all masters? What would the resulting dictionary definitions look like?

-Robin

-- 
http://singinst.org/ : Our last, best hope for a fantastic future.
Lojban (http://www.lojban.org/): The language in which "this parrot is dead" is "ti poi spitaki cu morsi", but "this sentence is false" is "na nei".
My personal page: http://www.digitalkingdom.org/rlp/