From: Sebastian Fröjd <so.cool.ogi@gmail.com>
To: lojban@googlegroups.com
Date: Mon, 14 Jan 2013 23:24:54 +0100
Subject: Re: [lojban] criteria for the dictionary
In-Reply-To: <2881164.qORC9BWpb7@caracal>

2013/1/14 Pierre Abbat <phma@bezitopo.org>
On Monday, January 14, 2013 04:55:12 jongausib wrote:
> If there is someday going to be a complete, official Lojban dictionary, I
> think there's a need for some criteria for what jbovlaste should contain
> and in which form.
>
> Right now the dictionary is rather limited, but with more contributors it
> could expand to an extreme extent.
> I think it's a good idea to discuss this issue now, so I don't contribute
> a lot of valsi now, and then a few years later someone deletes a lot of
> my work, because it doesn't fit into some future official template or list
> of criteria.
>
> *Vocabulary*
> 1. Should we try to add lujvo for all places of each gismu as distinct
> valsi, like {seldri}, {selbai}, {terni'i} etc?

Only if they are glossed as different words, such as "tervecnu". We don't need
lots of entries for "species of <animal>".

> 2. What kind of cmene/cmevla should be added? (with no restriction this set
> could be extremely large)

Names of countries, cities, oblasti, cantons, etc.
Names of diseases.
Names of well-known people (including Lojbanists who are well-known among Lojbanists but may not be well-known to the world).
Given names (there may be more than one form of a given name).
Probably some others. I just added "relcibjolmib", a few days after hearing la
.camgusmis. talk about it (in English).

A few names can't be entered because they're two or more words (e.g.
"kot.divuár").

Hm.. why not add a new category called "cmene cluster" for names consisting of more than one word?

> For example, we could add a recommendation that you should only add cmene
> that could be regarded as having a lexicographical value, like the most
> common names of persons, companies, geographical entities etc. Not the name
> of the street where you are living and shit like that.
>
> 3. What kind of fu'ivla should be added?
>
> With ALL names of species and chemical substances and other large sets, we
> are going to have a very huge dictionary.
> I've been trying to translate some names of species into lujvo (the
> solution I prefer), but the Latin names are often not very descriptive
> and/or logical, so I think one of the better solutions (at least for names
> of species etc. that you use relatively often) is to just Lojbanize the Latin
> names into fu'ivla.
>
> You've probably already discussed this a lot, but it would be nice to have
> some guidelines documented somewhere about standards. I believe Lojban
> standards about biology, chemistry, music theory and other scientific
> disciplines don't belong to the official grammar of Lojban (just as the
> Oxford style manual isn't normative for ALL kinds of English), but
> still it would be nice to have such guidelines (on a level below the
> official language). Especially jbovlaste needs such guidelines if we don't
> want to have an inconsistent dictionary with a dukse of words in a possible
> future.

You'll have trouble entering many species names, because they're at least
three words, such as "cionmau la barda" or "maxri la .durum.". Go for genera
and up. But don't try to enter every single genus of fish until there are
Lojbanic ichthyologists who would need to use them. I've added a few such as
"skomberu".

Languages and ethnicities: there are only a few thousand of these, so entering
a few hundred wouldn't overload the dictionary. I'd enter "pintupi" (which
I've mentioned) and maybe "olkola" (which is a valid fu'ivla), but not
"Oykangand" (a language closely related to Olkola) until someone figured out
the right way to Lojbanize it.

Chemicals: we need to figure out the proper way to Lojbanize IUPAC before
entering IUPAC lujvo. Simple-named chemicals, such as geosmin (derpanxu'i) or
capsaicin (xumrkapsiku), can be entered already.
1,1,1-trichloro-2,2-di(4-chlorophenyl)ethane will have to wait. Numbers are used
three ways in chemical names (two appear in that name, as position numbers and as
multiplying prefixes; the other is to indicate an oxidation state or valency), and
we have just the one set of numbers to use in lujvo. I've proposed some experimental
gismu for use in chemical names, such as "xudvu" (aldehyde).

I've been wondering about IUPAC terminology and Lojban. Wouldn't it be possible to use Lojban features such as {joi}, {ke}, {xi}, {mei} within the rules of the IUPAC system somehow, to build lujvo for names like 1,1,1-trichloro-2,2-di(4-chlorophenyl)ethane?

> 4. When is it ok to add a stage-4 fu'ivla in the dictionary?
>
> I know some Lojbanists don't like stage-3 fu'ivla. I do like stage-3. The
> prefix in a stage-3 fu'ivla helps you understand a little what the
> foreign word is about. And you could easily make distinctions between, for
> example, {spatrvanila}, {grutrvanila} "vanilla pod", {tsijrvanila} "vanilla
> seed" and {xukmrvanila} vanillin.
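
The stage-3 pattern in those vanilla examples can be sketched in a few lines. This is a simplified illustration, not the full CLL procedure: it assumes the prefix is the four-letter rafsi obtained by dropping a gismu's final vowel, it always joins with an "r" hyphen, and it does no validity checking. The function name `stage3_fuhivla` is made up for this sketch.

```python
def stage3_fuhivla(gismu: str, borrowed: str) -> str:
    """Prefix a Lojbanized borrowing with a classifier gismu (simplified sketch)."""
    if len(gismu) != 5:
        raise ValueError("expected a five-letter gismu")
    rafsi = gismu[:4]  # four-letter rafsi: the gismu minus its final vowel
    # Assumption: "r" always works as the hyphen; other hyphen letters are ignored.
    return rafsi + "r" + borrowed


# The four classifiers behind the examples in the message.
for classifier in ("spati", "grute", "tsiju", "xukmi"):
    print(stage3_fuhivla(classifier, "vanila"))
```

Run as written, this reproduces the four words quoted above, which is all it is meant to show; real entries would still need checking against the actual fu'ivla rules.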
>
> The only stage-4 fu'ivla I add are those which are very culturally specific,
> not easily constructed as a lujvo and/or which don't easily fit into some
> category. Stage-4 fu'ivla should also be useful. CLL says: "[stage-4] are
> used where a fu'ivla has become so common or so important that it must be
> made as short as possible."
>
> But as long as you don't add stage-4 without cause (what's the cause of
> making {konjaku} a stage-4, for example? I've never heard of this species
> before), I think those fu'ivla could really give a good flavor to the
> language, even if this at the same time means that we're going to learn a
> lot of inconsistent words just like learning natlangs. But stage-4 fu'ivla
> could be really cool; my favourites are {iklki} and {fi'ikca}.

Konjac is a common ingredient in Japanese cuisine.

Ok, then I'll add se'emla, a common pastry in Sweden.

For words that fit into the type-4 format without too much squishing, I don't
see anything wrong with using type 4 to begin with, except where the type-4
could easily be interpreted as two unrelated things. For instance, I wouldn't
use the word "malpigi", as it could equally well mean acerola (rutrmalpigi) or
an insect's excretory organ (ragrmalpigi), both named for Marcello Malpighi.
This turned out to happen with "konjaku" (someone thought it's cognac, which
is koinka), but I didn't find out until after I entered it, as I was thinking
of the Japanese word, which is unrelated to the French word.

I think "tcigaso" should be used as type-4 already. Most people with cars use
it.

> *Form*
> I think jbovlaste should have a consistent format before publishing a
> printed version. Otherwise some poor fellow would have to read through all of
> jbovlaste and edit it into a consistent format just before printing. But if
> we had guidelines from now on, and we all added valsi in the
> same way, there would be less work for someone in the future.
>
> 1. Form of definition
> Which format do you think should be standard?
>
> {nerkla}:
> a. n1=k1 enters n2=k2 from origin k3 via route k4 using means/vehicle k5
>
> b. x1=n1=k1 enters x2=n2=k2 from origin x3=k3 via route x4=k4 using
> means/vehicle x5=k5
>
> c. x1 enters x2 from origin x3 via route x4 using means/vehicle x5

a for lujvo where the arguments are in order, b where they are not in order, c
for fu'ivla.

Sounds like a good rule.
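
A rough sketch of how that rule could be applied mechanically, using the {nerkla} example from above. Everything here is illustrative: the function name, the template string and the way places are passed in are assumptions made for this sketch, not anything jbovlaste actually provides.

```python
def render_definition(template, places, word_type, in_order=True):
    """Label each place according to the a/b/c convention discussed above.

    places: one list of component-place names per place of the word,
            e.g. [["n1", "k1"], ["k3"]] (hypothetical input format).
    """
    labels = []
    for i, components in enumerate(places, start=1):
        if word_type == "fu'ivla":
            labels.append(f"x{i}")                                # format c
        elif in_order:
            labels.append("=".join(components))                   # format a
        else:
            labels.append("=".join([f"x{i}", *components]))       # format b
    return template.format(*labels)


NERKLA = "{0} enters {1} from origin {2} via route {3} using means/vehicle {4}"
PLACES = [["n1", "k1"], ["n2", "k2"], ["k3"], ["k4"], ["k5"]]

print(render_definition(NERKLA, PLACES, "lujvo"))                  # format a
print(render_definition(NERKLA, PLACES, "lujvo", in_order=False))  # format b
print(render_definition(NERKLA, PLACES, "fu'ivla"))                # format c
```

The point is only that the three formats differ in how each place is labelled, so one stored place list could produce whichever form the guidelines settle on.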

> 2. Etymology
> I suggest that we don't add etymology info in the notes, but use the "add
> etymology"-link in jbovlaste.

I agree. That's what it's there for.

> I think etymology should be mandatory for cmevla and fu'ivla, so you can
> discuss which language to borrow from.
> This is a parenthetical but important question if Lojban has the ambition to be
> as culturally neutral as possible.
> So one recommendation could be that you always use Latin for names of
> species, and the language most related to the specific cultural
> phenomenon/object (or a derivative of several languages if many cultures share the
> same phenomenon/object, or in that case maybe Esperanto).

I wouldn't always use Latin for species. "skomberu", "polgosu", "sperlanu",
"merlanu", and "merluci" are all from Greek, Latin, or some descendant thereof
(though "sperlanu" has a Germanic root), but for the capelin, an important
forage fish that circles Iceland, I picked the Icelandic word as a source.

> 3. How much info in the notes?
>
> And also a final question: Is it possible for a user to edit another user's
> notes in jbovlaste, to add info?

It is possible for one user to edit another's definition, but it should be done
sparingly. Jbovlaste isn't Wiktionary.

Speaking of Wiktionary, there are Wiktionaries in English, French, Lojban, and
other languages. The English Wiktionary, for any modern language (including
Lojban), requires that three different people agree on a word at least a year
ago, or that it appear in some well-known work. The Lojban Wiktionary doesn't, and
the French Wiktionary accepts Tsolyani words, which the English Wiktionary
doesn't. You can enter phrases like "lo xamgu ko li'i" in Wiktionary, but not
jbovlaste.

Pierre
--
La sal en el mar es más que en la sangre.
Le sel dans la mer est plus que dans le sang.

--
You received this message because you are subscribed to the Google Groups "lojban" group.
To post to this group, send email to lojban@googlegroups.com.
To unsubscribe from this group, send email to lojban+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/lojban?hl=en.