From lojban+bncCJGY6cDlFhDz69DeBBoEiiIVBQ@googlegroups.com Sun Apr 25 05:31:15 2010
Received: from mail-ww0-f61.google.com ([74.125.82.61])
	by chain.digitalkingdom.org with esmtp (Exim 4.71)
	(envelope-from <lojban+bncCJGY6cDlFhDz69DeBBoEiiIVBQ@googlegroups.com>)
	id 1O60zJ-00044o-4l; Sun, 25 Apr 2010 05:31:14 -0700
Received: by wwb34 with SMTP id 34sf1047550wwb.16
        for <multiple recipients>; Sun, 25 Apr 2010 05:30:58 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=googlegroups.com; s=beta;
        h=domainkey-signature:received:x-beenthere:received:received:received
         :received:received-spf:received:received:mime-version:received
         :in-reply-to:references:from:date:message-id:subject:to
         :x-original-authentication-results:x-original-sender:reply-to
         :precedence:mailing-list:list-id:list-post:list-help:list-archive
         :sender:list-subscribe:list-unsubscribe:content-type;
        bh=WM/MMTi7jxFYqvtj0lpVnpmMwThAkSZ7yBOj2MbvAF4=;
        b=okZmu1Z5RONoIXhaI4qE2k9byTcQQq2rJxkn3JUQ07rnmEsl91b9ZDjfSMigurLgTj
         8mdgePasGfRdSckIu3tYCRTR0i7Uo9q3Lr3DmHPb0nHH5UHEcgZNfHZwD1DA0goeHvKL
         mD9hI129pjE+2L7r37WIx7DgqVi1O8WPhSNSs=
DomainKey-Signature: a=rsa-sha1; c=nofws;
        d=googlegroups.com; s=beta;
        h=x-beenthere:received-spf:mime-version:in-reply-to:references:from
         :date:message-id:subject:to:x-original-authentication-results
         :x-original-sender:reply-to:precedence:mailing-list:list-id
         :list-post:list-help:list-archive:sender:list-subscribe
         :list-unsubscribe:content-type;
        b=AIL7dlQ9ZrS1tvsARq0IbWjdqa3TFNhPkNImUaxTXrZTOG8Aen0/iSbfLEj6l9XYmV
         zOORUDFtHkMT+i2bCe9iegCus1y6Sj4UhYGJqQ9w7knK8pi2ivI8P6Kk6ekZDLTTP5Nx
         XiPs3WgNQjRUbSdfOxPm5CrsJ/VGkeX1xDdbM=
Received: by 10.223.16.87 with SMTP id n23mr756889faa.19.1272198643447;
        Sun, 25 Apr 2010 05:30:43 -0700 (PDT)
X-BeenThere: lojban@googlegroups.com
Received: by 10.204.35.68 with SMTP id o4ls13379956bkd.1.p; Sun, 25 Apr 2010 
	05:30:42 -0700 (PDT)
Received: by 10.204.48.214 with SMTP id s22mr124776bkf.6.1272198641217;
        Sun, 25 Apr 2010 05:30:41 -0700 (PDT)
Received: by 10.204.48.214 with SMTP id s22mr124775bkf.6.1272198641144;
        Sun, 25 Apr 2010 05:30:41 -0700 (PDT)
Received: from mail-pz0-f181.google.com (mail-pz0-f181.google.com [209.85.222.181])
        by gmr-mx.google.com with ESMTP id 11si375145bwz.2.2010.04.25.05.30.39;
        Sun, 25 Apr 2010 05:30:40 -0700 (PDT)
Received-SPF: pass (google.com: domain of get.oren@gmail.com designates 209.85.222.181 as permitted sender) client-ip=209.85.222.181;
Received: by mail-pz0-f181.google.com with SMTP id 11so5287533pzk.28
        for <lojban@googlegroups.com>; Sun, 25 Apr 2010 05:30:38 -0700 (PDT)
Received: by 10.142.6.33 with SMTP id 33mr1195918wff.135.1272198638159; Sun, 
	25 Apr 2010 05:30:38 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.143.5.17 with HTTP; Sun, 25 Apr 2010 05:23:08 -0700 (PDT)
In-Reply-To: <20100419003042.18292qbr1ugbmnqc@www.kattare.com>
References: <g2j27513e551004170752o36658667k596c24bd8402d21e@mail.gmail.com> 
	<20100419003042.18292qbr1ugbmnqc@www.kattare.com>
From: Oren <get.oren@gmail.com>
Date: Sun, 25 Apr 2010 20:23:08 +0800
Message-ID: <j2m27513e551004250523h64815437m3c66ab3a32111dde@mail.gmail.com>
Subject: Re: [lojban] Questions about jorne
To: lojban@googlegroups.com
X-Original-Authentication-Results: gmr-mx.google.com; spf=pass (google.com: 
	domain of get.oren@gmail.com designates 209.85.222.181 as permitted sender) 
	smtp.mail=get.oren@gmail.com; dkim=pass (test mode) header.i=@gmail.com
X-Original-Sender: get.oren@gmail.com
Reply-To: lojban@googlegroups.com
Precedence: list
Mailing-list: list lojban@googlegroups.com; contact lojban+owners@googlegroups.com
List-ID: <lojban.googlegroups.com>
List-Post: <http://groups.google.com/group/lojban/post?hl=en_US>, 
	<mailto:lojban@googlegroups.com>
List-Help: <http://groups.google.com/support/?hl=en_US>, <mailto:lojban+help@googlegroups.com>
List-Archive: <http://groups.google.com/group/lojban?hl=en_US>
Sender: lojban@googlegroups.com
List-Subscribe: <http://groups.google.com/group/lojban/subscribe?hl=en_US>, 
	<mailto:lojban+subscribe@googlegroups.com>
List-Unsubscribe: <http://groups.google.com/group/lojban/subscribe?hl=en_US>, 
	<mailto:lojban+unsubscribe@googlegroups.com>
Content-Type: multipart/alternative; boundary=00504502b172e9464f04850ed35a

--00504502b172e9464f04850ed35a
Content-Type: text/plain; charset=ISO-8859-1

coi uiban

Sorry for not getting back to you sooner!

[snippet A]

On Mon, Apr 19, 2010 at 12:30, Brian D. Eubanks <brian@buildsoftware.com> wrote:
>
> The project is currently focused on these tasks:
> 1. use Lojban text to add statements to knowledge bases, or ask questions
> about the content
> 2. describe data contained within knowledge bases, using Lojban text


[snippet B]


> SKOS is also an excellent way to implement a Semantic-Web-friendly
> thesaurus.


 [snippet C]

> ...using technologies such as in the links given (perhaps starting with a
> Wordnet implementation?) would bring great benefit to the Lojban community.



So the way I see it, there are really three separate tasks here: [A] lojban
knowledge authoring and extraction, [B] describing lojban using SKOS
and [C] describing Wordnet using lojban, all of which I think are brilliant
and awesome.

As for task [A], knowledge authoring and extraction, I think it has obvious
applications with the other two, but could stand alone: lojban is already
parseable (well, you know) so there's nothing stopping us from having an
online lojban submission form for entering parseable knowledge (that is, any
lojban text). In fact it's frighteningly similar to the recent page
http://lojban.org/cgi-bin/corpus, but optimally I think there should be
parser validation upon submission. Searching it for completions of queries
is step two. Of the applications mentioned, this one, I think, is the easiest
to implement: it's just a web-based parser [submission] and search
[extraction].

When I say knowledge extraction, I'm thinking something like this:

>> mi klama ma //input query
mi klama lo zarci
mi pu klama lo jarbu
... //output
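
To make the query-completion idea concrete, here is a minimal Python sketch. It is only an illustration under loud assumptions: sentences are split on whitespace instead of being run through a real parser, {ma} is treated as a plain blank to fill, and the tense words pu/ca/ba are simply ignored so that {mi pu klama lo jarbu} still answers {mi klama ma}. The function names and the tiny corpus are made up for the example.

```python
def tokenize(text):
    """Split Lojban text on whitespace (a real system would use a parser)."""
    return text.split()


def complete(query, corpus, tense_words=("pu", "ca", "ba")):
    """Return corpus sentences that fill the first `ma` slot of the query.

    Toy assumption: tense words are dropped on both sides before comparing,
    and everything else must match word for word around the slot.
    """
    q = [w for w in tokenize(query) if w not in tense_words]
    if "ma" not in q:
        return []
    slot = q.index("ma")
    prefix, suffix = q[:slot], q[slot + 1:]
    results = []
    for sentence in corpus:
        words = [w for w in tokenize(sentence) if w not in tense_words]
        # The sentence needs at least one word left over to fill the slot.
        if len(words) <= len(prefix) + len(suffix):
            continue
        starts_ok = words[:len(prefix)] == prefix
        ends_ok = not suffix or words[-len(suffix):] == suffix
        if starts_ok and ends_ok:
            results.append(sentence)
    return results


corpus = [
    "mi klama lo zarci",
    "mi pu klama lo jarbu",
    "do klama lo zarci",
]

for answer in complete("mi klama ma", corpus):
    print(answer)
```

A real implementation would match on the parse tree (so that word order and omitted places don't matter), but the submit-then-search shape would stay the same.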

Task [B], describing Lojban with SKOS, I think is slightly harder if done
properly, and very valuable, as it would extend the utility of [A].
Searching for { ma danlu } could give a list of all animals, and could also
search for all knowledge related to animals.

   - Start with the current TeX thesaurus from the lojban website.
   - Nice lojban uris. The way things work in Lojbanistan, it seems, is that
   the keys to the castle are available as soon as there is proof of concept.
   Once there's a queryable web interface, of the lojban corpus or some
   translated datastores, I'm sure Robin or whoever will let
   www.lojban.com/valsi#gismu be used for this. I have a domain name on a
   shared host if there's need for development space for proof of concept.
   - Immediately it's a perfect thesaurus format and so one
   trivial-to-implement but crucial-to-have feature is importing the latest
   jbovlaste xml dump for adding new gloss words as prefLabels etc.
   - References to other RDF schemas, not just wordnet [
   http://www.w3.org/TR/wordnet-rdf/]. For example, animals can be likewise
   translated to http://ontologi.es/biol/ns and linked to dbpedia pages,
   yadda yadda. This extends the domain of part A, as you put it,
   exponentially: open data would be searchable in a brief and robust form.
   This is what lojban dreams of when it says it may someday be used to talk to
   machines.

Now we could have input like:
>> lo cribe cu xabju ma //where do bears live
...//returns habitat information from all dbpedia entries referenced in
'cribe'

As for task [C], describing Wordnet synsets using lojban, I think this is
the hardest, not merely because of the sheer number of synsets, but because
of the mandate for precision-- synsets should have unique lojban terms.
WordNet's semantic distinctions are pretty fine-grained, and our mere seven
thousand or so current valsi are insufficient. For the part [B] task
(referring to other URIs that a single lojban word *could* describe), some
degree of polysemy/vagueness is fine, but for the task of describing wordnet,
I think the best implementation has a one-to-one correspondence between
synsets and lojban words.

This obviously entails a huge number of ad-hoc lujvo/tanru (ideally) or
fu'ivla (suboptimal) that need to be created for similar shades of related
synsets, or just technical or cultural words that have no lojban equivalent
or approximation. I envision this as feasible only in a wiki-like, folksonomy
web frontend where multiple people can help assign new lujvo to unassigned
synsets. A great byproduct of this would be critical examination of the
shortcomings of current gismu, and possibly accelerated specification of
vague/poorly-defined terms and newly coined lojban terms.

---intermission---

I have a little proposal to distinguish these three tasks as I have
described them ([A],[B], and [C]). While they are all part of the same
project, I think they each play functionally distinct roles; namely a human
portal, an extendable description, and the holy grail of lojban
dictionaries. My proposal is that these three tasks be called jorne, sejorne
and tejorne respectively. jorne is still the name of the whole idea (since
the other two work through or enhance it), and

1) the 'connect' {lo jorne} refers to the user front end and basic
input/query portion
2) the 'things-connected' {lo se jorne} refers to the linked
searchable/translatable ontologies/schema
3) the 'connection types' {lo te jorne} refers to the clear definition of
synsets found in wordnet using lojban

Or, in layman's terms, a web front-end, a thesaurus and a dictionary.

Additionally, while the first two may only be of particular interest to
lojbanists, I think the third may have even greater implications as an
extension to Wordnet since it can do something Wordnet does not attempt to
do: provide cardinality information. That is, let's say we want to use
Wordnet to annotate a text unambiguously: that's totally possible. But if we
wanted to then search that text (say, looking for query completions) there's
no way to explicitly mark which lexeme falls in which argument position of a
multi-argument phrase. Lojban, however, makes every bridi immediately
accessible as a series of one or more triples for each sumti. Of course,
this application is light-years off and might just be crazytalk. But that's
why I would keep description of wordnet and description of lojban as two
distinct objectives.
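
As a rough illustration of the "triples for each sumti" point, here is a small Python sketch. It assumes the bridi has already been parsed into a selbri plus an ordered list of sumti (x1, x2, ...); the bridi ids, the "selbri" and "xN" predicate names, and both function names are invented for the example and are not any existing schema.

```python
def bridi_to_triples(bridi_id, selbri, sumti):
    """Turn one parsed bridi into triples, one per filled place.

    `sumti` is ordered by place (x1 first); use None for an empty place.
    """
    triples = [(bridi_id, "selbri", selbri)]
    for place, value in enumerate(sumti, start=1):
        if value is not None:
            triples.append((bridi_id, f"x{place}", value))
    return triples


def fill_place(store, selbri, known, wanted):
    """Find values of place `wanted` in bridi that match the `known` places.

    `known` maps place numbers to required sumti, e.g. {1: "mi"}.
    """
    by_bridi = {}
    for s, p, o in store:
        by_bridi.setdefault(s, {})[p] = o
    answers = []
    for props in by_bridi.values():
        if props.get("selbri") != selbri:
            continue
        matches = all(props.get(f"x{k}") == v for k, v in known.items())
        if matches and f"x{wanted}" in props:
            answers.append(props[f"x{wanted}"])
    return answers


store = []
store += bridi_to_triples("b1", "klama", ["mi", "lo zarci"])
store += bridi_to_triples("b2", "klama", ["do", "lo zarci", "lo zdani"])
store += bridi_to_triples("b3", "xabju", ["lo mlatu", "lo zdani"])

# {mi klama ma}: which x2 goes with selbri klama and x1 = mi?
print(fill_place(store, "klama", {1: "mi"}, 2))
```

Because the place number is part of the predicate, the same store can answer {ma klama lo zarci} by asking for x1 with x2 known, which is exactly the argument-position information a plain synset annotation doesn't carry.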

co'o mi'e korbi

btw, are you located in the united states by any chance? you might be
interested in the North American Summer School in Logic, Language and
Information <http://www.indiana.edu/~nasslli/overview.html>.


>
> Quoting Oren <get.oren@gmail.com>:
>
>> The jorne page on sourceforge [http://jorne.sourceforge.net/] doesn't
>> mention OWL or appear to have any source code... is there a newer
>> specification or codebase that I'm missing? The PEG parser?
>>
>> As for the ideas proposed on the page, I still need to be sold. There seems
>> to be overlap with the W3C incubator projects for representing Wordnet in
>> RDF/OWL http://www.w3.org/TR/wordnet-rdf/, and quite frankly, lojban's
>> minimal and prescriptive vocabulary doesn't seem to offer much application
>> here.
>>
>> Two separate overlapping W3C incubator projects seem to be more appropriate
>> for semantic querying, Common Web Language (semantic representation)
>> http://www.w3.org/2005/Incubator/cwl/XGR-cwl-20080331/ and Emotional Markup
>> Language http://www.w3.org/2005/Incubator/emotion/XGR-emotionml-20081120/.
>>
>> Lojban, as a human language, can't offer what these robust proposals
>> describe-- that is, you can't really argue that lojban is any more
>> 'readable' than these languages, nor believe that it would be briefer or
>> more thorough; but it may be fun to try and define the entire lojban
>> vocabulary using these technologies. Or maybe that's what you meant all
>> along?
>>
>> co'o mi'e korbi
>>
>>
>> On Thu, Apr 15, 2010 at 19:49, Brian Eubanks <brian@buildsoftware.com>
>> wrote:
>>
>>> Hi Oren,
>>>
>>> We corresponded last year about the Jorne (Lojban RDF) project I am trying
>>> to get started.
>>>
>>> The idea of using a Wordnet type approach is excellent. In fact, I would
>>> love to see a LojWordNet in association with the Jorne OWL mapping.
>>>
>>> Are you still interested in working on an OWL mapping for Lojban gismu? If
>>> so, I would like you to join the Sourceforge Jorne project. The growing
>>> amount of linked data makes this a great time to do this.
>>>
>>> I am working with the PEG parser to import simple sentences into an RDF
>>> triple store with the hope of converting between SPARQL and Lojban queries.
>>> My Lojban is not even baby talk level yet, which is where I could use your
>>> help too. I've been a lurker in the Lojban space but haven't spent time to
>>> learn it.
>>>
>>> Regards,
>>> Brian Eubanks
>>>
>>> Sent from my iPhone
>>>
>>>
>>> On Apr 15, 2010, at 3:52 AM, Oren <get.oren@gmail.com> wrote:
>>>
>>>> I like the idea of categories (or... tags!), I think the wiki is the
>>>> place for it to happen, and I also think we shouldn't start from
>>>> scratch. The thesaurus on the wiki page already segregates all gismu
>>>> into hierarchical categories. We can make a page template that allows
>>>> people to add "lujvo requests" to a category. A sister project to
>>>> consider would be fleshing out that same ontology with the existing
>>>> specialized lujvo lists and the lujvo flat file.
>>>>
>>>> I would also think that English/natlang glosses for the categories
>>>> should be optional while lojban section titles be mandatory and
>>>> default, for clarity.
>>>>
>>>> Back to the original topic of finding a minimal wordlist for a
>>>> dictionary, I think the real forward-thinking approach would be to
>>>> find some sufficiently open project similar to EuroWordNet [a
>>>> multilingual WordNet], and then extracting a set number of unique
>>>> *syslinks* (word senses), so that when we sit down to define 'spring'
>>>> we don't have to remember jumping, metal coils and le printemps all by
>>>> our erring-human selves.
>>>>
>>>> We could either use an arbitrary limit and go by frequency, and/or go
>>>> for all syslinks that contain an arbitrary number of constituent
>>>> languages. For example, only bother with 50% of all word senses that
>>>> appear in three or more languages.
>>>>
>>>> co'o mi'e korbi
>>>>
>>>> On Thu, Apr 15, 2010 at 15:13, Lindar <lindarthebard@yahoo.com> wrote:
>>>>
>>>>> My absolutely fantastic idea that donri/kribacr started and never
>>>>> finished (or never even started, but definitely came up before I
>>>>> thought of it [but it's still my idea]) is/was/will be to have groups
>>>>> of people select topics, and then go through and come up with as many
>>>>> words related to that topic as possible. I got this idea one day as I
>>>>> was sticking masking tape to pretty much everything around my
>>>>> apartment and writing the Lojban word for it in sharpie. I came across
>>>>> the simple fact that jvs didn't have words for "pot", "kitchen",
>>>>> "frying pan", etc., so I came up with words for them, and I think at
>>>>> least "kitchen" (jupku'a) is up there. I tried this again with
>>>>> computer terminology and it completely failed as nobody could agree
>>>>> properly on things (like "window", on which I still harshly/
>>>>> obnoxiously/rudely/insultingly disagree with xorxes).
>>>>>
>>>>> Rather than having one person sit through some big gehorsenshitfesten
>>>>> (pardon my German) trying to pick out the most common concepts in the
>>>>> universe, why don't we use the wiki idea and create "conversational
>>>>> categories" under which we can place words (probably a lot of fu'ivla
>>>>> and lujvo) relevant to the topic. This will generate a much larger and
>>>>> relevant body of information, and it's a -much- less daunting task.
>>>>> For example, I am a recording engineer, so I would be likely to start
>>>>> a "recording technology" topic, and possibly contribute to the "music"
>>>>> topic as I would be more likely than anybody else to need/use words
>>>>> like "Hertz"/"kHz", "microphone", "nearfield monitors", "synthesizer",
>>>>> "MIDI", "mixing console", "bass", "treble", and I would probably be
>>>>> more qualified to determine what kind of terminology in Lojban is the
>>>>> most suitable. I'd also be fairly interested in the "kitchen and
>>>>> cooking" topic, and I think a great many a newbie would be very
>>>>> interested in the "household objects" topic, which would probably
>>>>> include a pointer to the "kitchen and cooking" topic and maybe even a
>>>>> "bathroom and hygiene" topic. This way people find what interests them
>>>>> and contribute to topics that they enjoy, which doesn't necessarily
>>>>> give an accurate picture of common usage based on an average through
>>>>> world cultures, but definitely gives a good sampling of words to use
>>>>> in conversation for the types of conversation that people learning
>>>>> Lojban would have. It works as a double edged sword (of handiness) in
>>>>> that we have people that are going to enjoy working because they're
>>>>> learning how to talk about things that interest them by contributing
>>>>> (which means things are more likely to get added, being that it's fun
>>>>> and not a chore) -AND- that we have quick 'topic reference'
>>>>> dictionaries so you can just leave the list open and peek through to
>>>>> make it easier to carry on conversations about what an arse your
>>>>> government leader is without having to poke through a list for ten
>>>>> minutes while the conversation has already passed because you wanted a
>>>>> word for "idiot" and jvs only had "stupid" as a gloss word for
>>>>> tolmencre. (Bad example, you get the picture.)
>>>>>
>>>>> Perhaps we can quickly brainstorm a few major topics just to have
>>>>> something up on a wiki?
>>>>>
>>>>> household items
>>>>> kitchen and cooking
>>>>> bathroom and hygiene
>>>>> sports and spectating
>>>>> automotive and driving
>>>>> computer ((hot topic, prone to arguments))
>>>>> music
>>>>> politics and law
>>>>> school and education
>>>>> work and the workplace
>>>>> friends and family
>>>>>
>>>>> The idea would be to have a big list of topics (and possibly
>>>>> subtopics), and on the pages of each we have brief glosses with Lojban
>>>>> words, with links to a page detailing the place structure, examples of
>>>>> usage, actual usage example if available, and potentially a relevant
>>>>> image (for those that learn by seeing and not reading).
>>>>>
>>>>> Perhaps under "household items" is "garage", and on the page for that
>>>>> it includes a little link for "see section: automotive and driving",
>>>>> and perhaps even "garage" is also located under "automotive and
>>>>> driving" or somesuch.
>>>>>
>>>>> Neatonifty idea, right?
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups
>>>>> "lojban" group.
>>>>> To post to this group, send email to lojban@googlegroups.com.
>>>>> To unsubscribe from this group, send email to
>>>>> lojban+unsubscribe@googlegroups.com.
>>>>> For more options, visit this group at
>>>>> http://groups.google.com/group/lojban?hl=en.

-- 
You received this message because you are subscribed to the Google Groups "lojban" group.
To post to this group, send email to lojban@googlegroups.com.
To unsubscribe from this group, send email to lojban+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/lojban?hl=en.


--00504502b172e9464f04850ed35a
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr"><div class=3D"gmail_quote">coi uiban</div><div class=3D"gm=
ail_quote"><br></div><div class=3D"gmail_quote">Sorry for not getting back =
to you sooner!=A0</div><div class=3D"gmail_quote"><br></div><div class=3D"g=
mail_quote">

[snippet A]</div><div class=3D"gmail_quote"><br></div><div class=3D"gmail_q=
uote">On Mon, Apr 19, 2010 at 12:30, Brian D. Eubanks <span dir=3D"ltr">&lt=
;<a href=3D"mailto:brian@buildsoftware.com">brian@buildsoftware.com</a>&gt;=
</span> wrote:<blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;=
border-left:1px #ccc solid;padding-left:1ex;">


The project is currently focused on these tasks:<br>1. use Lojban text to a=
dd statements to knowledge bases, or ask questions about the content=A0</bl=
ockquote><blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;borde=
r-left:1px #ccc solid;padding-left:1ex;">

2. describe data contained within knowledge bases, using Lojban text</block=
quote><div><br class=3D"Apple-interchange-newline">[snippet B]</div><div>=
=A0</div><blockquote class=3D"gmail_quote" style=3D"margin-top: 0px; margin=
-right: 0px; margin-bottom: 0px; margin-left: 0.8ex; border-left-width: 1px=
; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-=
left: 1ex; ">

SKOS is also an excellent way to implement a Semantic-Web-friendly thesauru=
s.</blockquote><div>=A0</div><div>=A0[snippet C]</div><div><br></div><block=
quote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1px #ccc=
 solid;padding-left:1ex;">

...using technologies such as in the links given (perhaps starting with a W=
ordnet implementation?) would bring great benefit to the Lojban community.=
=A0</blockquote><div><br></div><div><br></div><div>So the way I see it, the=
re are really three separate tasks here: [A] lojban knowledge authoring and=
 extraction,=A0[B]=A0describing lojban using SKOS and=A0[C]=A0describing Wo=
rdnet using lojban,=A0all of which I think are brilliant and awesome.</div>

<div><br></div><div>As for task [A], knowledge authoring and extraction, I =
think it has obvious applications with the other two, but could stand alone=
: lojban is already parseable (well, you know) so there&#39;s nothing stopp=
ing us from having an online lojban submission form for entering parseable =
knowledge (that is, any lojban text). In fact its frighteningly similar to =
the recent page <a href=3D"http://lojban.org/cgi-bin/corpus">http://lojban.=
org/cgi-bin/corpus</a>, but optimally I think there should be parser valida=
tion upon submission. Searching it for completions of queries is step two. =
Of the applications mentioned this one, i think, is the easiest to implemen=
t: it&#39;s just a web-based parser [submission] and search [extraction].</=
div>

<div><br></div><div>When I say knowledge extraction, I&#39;m thinking somet=
hing like this:</div><div><br></div><div>&gt;&gt; mi klama ma //input query=
</div><div>mi klama lo zarci</div><div>mi pu klama lo jarbu</div><div>
... //output</div>
<div><br></div><div>Task [B], describing Lojban with SKOS, I think is sligh=
tly harder if done properly, and very valuable, as it would extend the util=
ity of [A]. Searching for { ma danlu } could give a list of all animals, an=
d could also search for all knowledge related to animals.=A0</div>

<div><ul><li>Start with the current TeX thesaurus from the lojban website.<=
/li><li>Nice lojban uris. The way things work in Lojbanistan, it seems, is =
that the keys to the castle are available as soon as there is proof of conc=
ept. Once there&#39;s a queryable web interface, of the lojban corpus or so=
me translated datastores, I&#39;m sure Robin or whoever will let <a href=3D=
"http://www.lojban.com/valsi#gismu">www.lojban.com/valsi#gismu</a> be used =
for this.=A0I have a domain name on a shared host if there&#39;s need for d=
evelopment space for proof of concept.</li>

<li>Immediately it&#39;s a perfect thesaurus format and so one trivial-to-i=
mplement but crucial-to-have feature is importing the latest jbovlaste xml =
dump for adding new gloss words as prefLabels etc.</li><li>References to ot=
her RDF schemas, not just wordnet [<a href=3D"http://www.w3.org/TR/wordnet-=
rdf/">http://www.w3.org/TR/wordnet-rdf/</a>]. For example, animals can be l=
ikewise translated to=A0<a href=3D"http://ontologi.es/biol/ns">http://ontol=
ogi.es/biol/ns</a> and linked to dbpedia pages, yadda yadda. This extends t=
he domain of part A, as you put it, exponentially: open data would be searc=
hable in a brief and robust form. This is what lojban dreams of when it say=
s it may someday be used to talk to machines.</li>

</ul><div>Now we could have input like:</div><div>&gt;&gt; lo cribe cu xabj=
u ma //where do bears live</div><div>...//returns habitat information from =
all dbpedia entries referenced in &#39;cribe&#39;</div><div><br></div>
<div>
As for task [C], describing Wordnet synsets using lojban, I think this is t=
he hardest, not merely because of the sheer number of synsets, but because =
of the mandate for precision-- synsets should have unique lojban terms. Sem=
antic granularity on WordNet is pretty small, and our mere seven thousand o=
r so current valsi are insufficient. For the part [B] task (referring to ot=
her URIs that a single lojban word *could* describe), some degree of polyse=
my/vagueness is fine, but for the task of describing wordnet, I think the b=
est implementation has a one-to-one correspondence to lojban words.</div>

</div><div><br></div><div>This obviously entails a huge number of ad-hoc lu=
jvo/tanru (ideally) or fi&#39;uvla (suboptimal) that need to be created for=
 similar shades of related synsets, or just technical or cultural words tha=
t have no lojban equivalent or approximate. I envision this only as feasibl=
e in a wiki-like, folksonomy web frontend where multiple people can help as=
sign new lujvo to unassigned synsets. A great byproduct of this would be cr=
itical examination of the shortcomings of current gismu, and possibly accel=
erated specification of vague/poorly-defined terms and newly coined lojban =
terms.</div>

<div><br></div><div>---intermission---</div><div><br></div><div>I have a li=
ttle proposal to distinguish these three tasks as I have described them ([A=
],[B], and [C]). While they are all part of the same project, I think they =
each play functionally distinct roles; namely a human portal, an extendable=
 description, and the holy grail of lojban dictionaries. My proposal is tha=
t these three tasks be called jorne, sejorne and tejorne respectively. jorn=
e is still the name of the whole idea (since the other two work through or =
enhance it), and</div>

<div><br></div><div>1) the &#39;connect&#39; {lo jorne} refers to the user =
front end and basic input/query portion=A0</div><div>2) the &#39;things-con=
nected&#39; {lo se jorne} refers to the linked searchable/translatable onto=
logies/schema</div>

<div>3) the &#39;connection types&#39; {lo te jorne} refers to the clear de=
finition of synsets found in wordnet using lojban</div><div><br></div><div>=
Or, in laymans terms, a web front-end, a thesaurus and a dictionary.</div>

<div><br></div><div>Additionally, while the first two may only be of partic=
ular interest to lojbanists, I think the third may have even greater implic=
ations as an extension to Wordnet since it can do something Wordnet does no=
t attempt to do: provide cardinality information. That is, lets say we want=
 to use Wordnet to annotate a text unambiguously: that&#39;s totally possib=
le. But if we wanted to then search that text (say, looking for query compl=
etions) theres no way to explicitly mark what lexeme falls in what argument=
 number for a multi-argument phrase. Lojban, however, makes every bridi imm=
ediately accessible as a series of one or more triples for each sumti. Of c=
ourse, this application is light-years off and might just be crazytalk. But=
 that&#39;s why I would keep description of wordnet and description of lojb=
an as two distinct objectives.</div>

<div><br></div><div>co&#39;o mi&#39;e korbi</div><div><br></div><div>btw, a=
re you located in the united states by any chance? you might be interested =
in the=A0<span class=3D"Apple-style-span" style=3D"font-family: arial, sans=
-serif; font-size: 10.8px; border-collapse: collapse; "><a href=3D"http://w=
ww.indiana.edu/~nasslli/overview.html" target=3D"_blank" style=3D"color: rg=
b(87, 151, 176); ">North American Summer School in Logic, Language and Info=
rmation</a>.=A0</span></div>

<div><br></div><blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex=
;border-left:1px #ccc solid;padding-left:1ex;"><div class=3D"im"><br>
<br>
Quoting Oren &lt;<a href=3D"mailto:get.oren@gmail.com" target=3D"_blank">ge=
t.oren@gmail.com</a>&gt;:<br>
<br>
</div><blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-l=
eft:1px #ccc solid;padding-left:1ex"><div class=3D"im">
The jorne page on sourceforge [<a href=3D"http://jorne.sourceforge.net/" ta=
rget=3D"_blank">http://jorne.sourceforge.net/</a>] doesn&#39;t<br>
mention OWL or appear to have any source code... is there a newer<br>
specification or codebase that I&#39;m missing? The PEG parser?<br>
<br>
As for the ideas proposed on the page, I still need to be sold. There seems=
<br>
to be overlap with the W3C incubator projects for representing Wordnet in<b=
r>
RDF/OWL <a href=3D"http://www.w3.org/TR/wordnet-rdf/" target=3D"_blank">htt=
p://www.w3.org/TR/wordnet-rdf/</a>, and quite frankly, lojban&#39;s<br>
minimal and prescriptive vocabulary doesn&#39;t seem to offer much applicat=
ion<br>
here.<br>
<br>
Two separate overlapping W3C incubator projects seem to be more appropriate=
<br>
for semantic querying, Common Web Language (semantic representation)<br>
<a href=3D"http://www.w3.org/2005/Incubator/cwl/XGR-cwl-20080331/" target=
=3D"_blank">http://www.w3.org/2005/Incubator/cwl/XGR-cwl-20080331/</a> and =
Emotional Markup<br>
Language <a href=3D"http://www.w3.org/2005/Incubator/emotion/XGR-emotionml-=
20081120/" target=3D"_blank">http://www.w3.org/2005/Incubator/emotion/XGR-e=
motionml-20081120/</a>.<br>
<br>
Lojban, as a human language, can&#39;t offer what these robust proposals<br=
>
describe-- that is, you can&#39;t really argue that lojban is any more<br>
&#39;readable&#39; than these languages, nor believe that it would be brief=
er or<br>
more thorough; but it may be fun to try and define the entire lojban<br>
vocabulary using these technologies. Or maybe that&#39;s what you meant all=
<br>
along?<br>
<br></div>
co&#39;o mi&#39;e korbi<div><div></div><div class=3D"h5"><br>
<br>
On Thu, Apr 15, 2010 at 19:49, Brian Eubanks &lt;<a href=3D"mailto:brian@bu=
ildsoftware.com" target=3D"_blank">brian@buildsoftware.com</a>&gt; wrote:<b=
r>
<br>
</div></div><blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;bo=
rder-left:1px #ccc solid;padding-left:1ex"><div><div></div><div class=3D"h5=
">
Hi Oren,<br>
<br>
We corresponded last year about the Jorne (Lojban RDF) project I am trying<=
br>
to get started.<br>
<br>
The idea of using a Wordnet type approach is excellent. In fact, I would<br=
>
love to see a LojWordNet in association with the Jorne OWL mapping.<br>
<br>
Are you still interested in working on an OWL mapping for Lojban gismu? If<=
br>
so, I would like you to join the Sourceforge Jorne project. The growing<br>
amount of linked data makes this a great time to do this.<br>
<br>
I am working with the PEG parser to import simple sentences into an RDF<br>
triple store with the hope of converting between SPARQL and Lojban queries.=
<br>
My Lojban is not even baby talk level yet, which is where I could use your<=
br>
help too. I&#39;ve been a lurker in the Lojban space but haven&#39;t spent =
time to<br>
learn it.<br>
<br>
Regards,<br>
Brian Eubanks<br>
<br>
Sent from my iPhone<br>
<br>
<br>
On Apr 15, 2010, at 3:52 AM, Oren &lt;<a href=3D"mailto:get.oren@gmail.com"=
 target=3D"_blank">get.oren@gmail.com</a>&gt; wrote:<br>
<br>
I like the idea of categories (or... tags!), I think the wiki is the<br>
</div></div><blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;bo=
rder-left:1px #ccc solid;padding-left:1ex"><div><div></div><div class=3D"h5=
">
place for it to happen, and I also think we shouldn&#39;t start from<br>
scratch. The thesaurus on the wiki page already segregates all gismu<br>
into hierarchical categories. We can make a page template that allows<br>
people to add &quot;lujvo requests&quot; to a category. A sister project to=
<br>
consider would be fleshing out that same ontology with the existing<br>
specialized lujvo lists and the lujvo flat file.<br>
<br>
I would also think that English/natlang glosses for the categories<br>
should be optional while lojban section titles be mandatory and<br>
default, for clarity.<br>
<br>
Back to the original topic of finding a minimal wordlist for a<br>
dictionary, I think the real forward-thinking approach would be to<br>
find some sufficiently open project similar to EuroWordNet [a<br>
multilingual WordNet], and then extracting a set number of unique<br>
*syslinks* (word senses), so that when we sit down to define &#39;spring=
&#39;<br>
we don&#39;t have to remember jumping, metal coils and le printemps all by<=
br>
our erring-human selves.<br>
<br>
We could either use an arbitrary limit and go by frequency, and/or go<br>
for all syslinks that contain an arbitrary number of constituent<br>
languages. For example, only bother with 50% of all word senses that<br>
appear in three or more languages.<br>
<br>
co&#39;o mi&#39;e korbi<br>
<br>
On Thu, Apr 15, 2010 at 15:13, Lindar &lt;<a href=3D"mailto:lindarthebard@y=
ahoo.com" target=3D"_blank">lindarthebard@yahoo.com</a>&gt; wrote:<br>
<br>
</div></div><blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;bo=
rder-left:1px #ccc solid;padding-left:1ex"><div><div></div><div class=3D"h5=
">
My absolutely fantastic idea that donri/kribacr started and never<br>
finished (or never even started, but definitely came up before I<br>
thought of it [but it&#39;s still my idea]) is/was/will be to have groups<b=
r>
of people select topics, and then go through and come up with as many<br>
words related to that topic as possible. I got this idea one day as I<br>
was sticking masking tape to pretty much everything around my<br>
apartment and writing the Lojban word for it in sharpie. I came across<br>
the simple fact that jvs didn&#39;t have words for &quot;pot&quot;, &quot;k=
itchen&quot;,<br>
&quot;frying pan&quot;, etc., so I came up with words for them, and I think=
 at<br>
least &quot;kitchen&quot; (jupku&#39;a) is up there. I tried this again wit=
h<br>
computer terminology and it completely failed as nobody could agree<br>
properly on things (like &quot;window&quot;, on which I still harshly/<br>
obnoxiously/rudely/insultingly disagree with xorxes).<br>
<br>
Rather than having one person sit through some big gehorsenshitfesten<br>
(pardon my German) trying to pick out the most common concepts in the<br>
universe, why don&#39;t we use the wiki idea and create &quot;conversationa=
l<br>
categories&quot; under which we can place words (probably a lot of fu&#39;i=
vla<br>
and lujvo) relevant to the topic. This will generate a much larger and<br>
relevant body of information, and it&#39;s a -much- less daunting task.<br>
For example, I am a recording engineer, so I would be likely to start<br>
a &quot;recording technology&quot; topic, and possibly contribute to the &q=
uot;music&quot;<br>
topic as I would be more likely than anybody else to need/use words<br>
like &quot;Hertz&quot;/&quot;kHz&quot;, &quot;microphone&quot;, &quot;nearf=
ield monitors&quot;, &quot;synthesizer&quot;,<br>
&quot;MIDI&quot;, &quot;mixing console&quot;, &quot;bass&quot;, &quot;trebl=
e&quot;, and I would probably be<br>
more qualified to determine what kind of terminology in Lojban is the<br>
most suitable. I&#39;d also be fairly interested in the &quot;kitchen and<b=
r>
cooking&quot; topic, and I think many a newbie would be very<br>
interested in the &quot;household objects&quot; topic, which would probably=
<br>
include a pointer to the &quot;kitchen and cooking&quot; topic and maybe ev=
en a<br>
&quot;bathroom and hygiene&quot; topic. This way people find what interests=
 them<br>
and contribute to topics that they enjoy, which doesn&#39;t necessarily<br>
give an accurate picture of common usage based on an average through<br>
world cultures, but definitely gives a good sampling of words to use<br>
in conversation for the types of conversation that people learning<br>
Lojban would have. It works as a double-edged sword (of handiness) in<br>
that we have people that are going to enjoy working because they&#39;re<br>
learning how to talk about things that interest them by contributing<br>
(which means things are more likely to get added, being that it&#39;s fun<b=
r>
and not a chore) -AND- that we have quick &#39;topic reference&#39;<br>
dictionaries so you can just leave the list open and peek through to<br>
make it easier to carry on conversations about what an arse your<br>
government leader is without having to poke through a list for ten<br>
minutes while the conversation has already passed because you wanted a<br>
word for &quot;idiot&quot; and jvs only had &quot;stupid&quot; as a gloss w=
ord for<br>
tolmencre. (Bad example, you get the picture.)<br>
<br>
Perhaps we can quickly brainstorm a few major topics just to have<br>
something up on a wiki?<br>
<br>
household items<br>
kitchen and cooking<br>
bathroom and hygiene<br>
sports and spectating<br>
automotive and driving<br>
computer ((hot topic, prone to arguments))<br>
music<br>
politics and law<br>
school and education<br>
work and the workplace<br>
friends and family<br>
<br>
The idea would be to have a big list of topics (and possibly<br>
subtopics), and on the pages of each we have brief glosses with Lojban<br>
words, with links to a page detailing the place structure, examples of<br>
usage, actual usage example if available, and potentially a relevant<br>
image (for those that learn by seeing and not reading).<br>
<br>
Perhaps under &quot;household items&quot; is &quot;garage&quot;, and on the=
 page for that<br>
it includes a little link for &quot;see section: automotive and driving&quo=
t;,<br>
and perhaps even &quot;garage&quot; is also located under &quot;automotive =
and<br>
driving&quot; or somesuch.<br>
<br>
Neatonifty idea, right?<br>
<br>
--<br>
You received this message because you are subscribed to the Google Groups<b=
r>
&quot;lojban&quot; group.<br>
To post to this group, send email to <a href=3D"mailto:lojban@googlegroups.=
com" target=3D"_blank">lojban@googlegroups.com</a>.<br>
To unsubscribe from this group, send email to<br>
</div></div><a href=3D"mailto:lojban%2Bunsubscribe@googlegroups.com" target=
=3D"_blank">lojban+unsubscribe@googlegroups.com</a>&lt;<a href=3D"mailto:lo=
jban%252Bunsubscribe@googlegroups.com" target=3D"_blank">lojban%2Bunsubscri=
be@googlegroups.com</a>&gt;<div class=3D"im">

<br>
.<br>
For more options, visit this group at<br>
<a href=3D"http://groups.google.com/group/lojban?hl=3Den" target=3D"_blank"=
>http://groups.google.com/group/lojban?hl=3Den</a>.<br>
<br>
<br>
<br>
</div></blockquote><div class=3D"im">
</div></blockquote><div class=3D"im">
</div></blockquote><div class=3D"im">
</div></blockquote><div><div></div><div class=3D"h5">
</div></div></blockquote></div><br></div>

<p></p>

-- <br />
You received this message because you are subscribed to the Google Groups "=
lojban" group.<br />
To post to this group, send email to lojban@googlegroups.com.<br />
To unsubscribe from this group, send email to lojban+unsubscribe@googlegrou=
ps.com.<br />

For more options, visit this group at http://groups.google.com/group/lojban=
?hl=3Den.<br />



--00504502b172e9464f04850ed35a--