
Re: [lojban] Re: IRC logs and text archives - volunteers wanted

To: lojban@yahoogroups.com
Subject: Re: [lojban] Re: IRC logs and text archives - volunteers wanted
From: Robert LeChevalier <lojbab@lojban.org>
Date: Thu, 14 Nov 2002 05:05:58 -0500
In-reply-to: <20021114044359.GA71467@allusion.net>
References: <5.1.0.14.0.20021113231400.0337d580@pop.east.cox.net> <5.1.0.14.0.20021113231400.0337d580@pop.east.cox.net>

At 10:43 PM 11/13/02 -0600, Jordan wrote:

> On Wed, Nov 13, 2002 at 11:23:11PM -0500, Bob LeChevalier-Logical Language Group wrote:
> > Robin P says that there has been a lot of activity on IRC for a while, but
> > that in general he is not logging it and does not know of anyone else who is.
> >
> > Does anyone have a collection of Lojban IRC logs?  We are going to be
> > looking for Lojban text corpora in the next several weeks for dictionary
> > work, and if a lot of Lojban conversation is taking place on IRC, that
> > conversation should be included in the corpora.

> I have essentially noninterrupted logs (10 megs of em) since Sun
> May 12 08:40:20 2002, when I first joined.

That's a lot! I wonder if Robin has room for that much (and more if it keeps accumulating at that rate).

What percentage of it would you say is IN Lojban, as opposed to being discussion in English (or other languages) ABOUT Lojban?

> However, I wonder what the interest in such text could be?

When we say "let usage decide", "usage" is NOT limited to major translation efforts. If we look at the text archives of stuff on the list, and translations, it is heavily dominated by a couple of Lojbanists (Nick and Goran in the early days, Jorge and xod more recently). Robin P. has pointed out that there are people active on IRC that are not active on the list and in other forums, and this suggests that we would have a much broader spectrum of usage, from more members of the community, than we can get from the existing text archives.

> It's all 'conversation quality',

Conversation is a rather important form of language usage, is it not? The question is not whether its quality is "conversational" but whether it represents "skilled usage", and that obviously has to be evaluated by looking at the whole text of the person who wrote it, as well as the audience he was writing to, rather than a single snippet of conversation out of context.

"Conversation quality" actually represents a very desirable thing in a corpus of usage. If the speakers are skilled users, it represents more closely the way fluent use of the language works, whereas translations and other non-real-time writings are NOT usually "fluent" but rather "considered efforts". When we are looking at how the language usage reflects "logic" we may want to focus on considered usage; when we want to look at how people tackle problems of idiomatic expression, we can compare conversational usage to the comparable idioms of the native language of the speaker.

> and anyone who wants some of that sort
> of Lojban text can just go on irc (at the right times), and there'll
> likely be a few people around to talk to bau la lojban.

The point is not to merely be able to find sample Lojban texts, but to be able to assemble as large a corpus of Lojban usage as possible, so we can go delving to find out if certain obscure (in meaning) cmavo have been used, and in what manner they have been used by multiple people. We want to be able to determine NOT what jboske says the word "should" mean, but what usage has said it "does" mean to people.
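
As a rough illustration of that kind of search, here is a minimal Python sketch that reports which speakers used a given word and in which lines. It is not an existing tool: the log line shape ("[time] <nick> message"), the sample nicks and the function name find_usage are all assumptions made for the example, and real IRC clients log in various formats.

```python
import re
from collections import defaultdict

# ASSUMPTION: each spoken line looks like "[12:01] <nick> message";
# the bracketed timestamp is optional. Other lines (joins, parts) are skipped.
LINE_RE = re.compile(r"^(?:\[[^\]]*\]\s*)?<([^>]+)>\s*(.*)$")

# Lojban words can contain apostrophes and commas, so keep them inside tokens.
TOKEN_RE = re.compile(r"[A-Za-z',]+")


def find_usage(lines, word):
    """Map each speaker to the messages in which `word` occurs as a whole token."""
    usage = defaultdict(list)
    target = word.lower()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not a spoken line in the assumed format
        nick, text = match.groups()
        # Lowercasing and trimming stray quote marks is a simplification.
        tokens = [t.strip("',").lower() for t in TOKEN_RE.findall(text)]
        if target in tokens:
            usage[nick].append(text)
    return dict(usage)


# Hypothetical sample data, not taken from any real log.
sample = [
    "[12:01] <alice> mi prami vo'a",
    "[12:02] <bob> what does vo'a mean?",
    "[12:03] <alice> coi rodo",
    "*** carol has joined #lojban",
]

result = find_usage(sample, "vo'a")
for nick, messages in sorted(result.items()):
    print(nick, len(messages), messages)
```

Note that the second hit in the sample is English discussion ABOUT the word rather than usage of it, so a person still has to read the matches; the sketch only narrows down where to look.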

An upcoming major push on the Lojban dictionary requires that we be able to find out if words have been used, and whether they've been used in the way Lojbab intended as opposed to other plausible ways to interpret the words that appear in the gismu and cmavo list which some people have understood differently than Lojbab intended %^)


Nick has cited, as a proper use of corpora, the actual usage of "vo'a".

Once we move in dictionary writing from prescriptive language definition to descriptive reflection of actual usage, this will become even more important. Defining lujvo is more of a descriptive effort, since the place structure rules in CLL are just guidelines.

> Or is this for word frequency-type infos?

That too is a valid use, though not the one I had in mind. If your 10 megs is substantially Lojban, it is decidedly better data than Lojban List, which has a very low percentage of actual Lojban text, and much of it is snippets and word-proposals and repeated quotations that can seriously skew any word frequency analysis.
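
To show how repeated quotations can skew a count, here is a minimal Python sketch of a word frequency tally that skips email-style quoted lines (those starting with ">") and counts each distinct line only once. The function name word_frequencies and the sample lines are hypothetical, and the sketch makes no attempt to tell Lojban words apart from English ones.

```python
import re
from collections import Counter

# Lojban words can contain apostrophes and commas, so keep them inside tokens.
TOKEN_RE = re.compile(r"[A-Za-z',]+")


def word_frequencies(lines):
    """Count word tokens, ignoring quoted lines and verbatim repeats."""
    counts = Counter()
    seen = set()
    for line in lines:
        text = line.strip()
        if not text or text.startswith(">"):
            continue  # blank line, or a quotation of an earlier message
        key = text.lower()
        if key in seen:
            continue  # the same line repeated verbatim
        seen.add(key)
        for token in TOKEN_RE.findall(text):
            token = token.strip("',").lower()
            if token:
                counts[token] += 1
    return counts


# Hypothetical sample data: one sentence, a quotation of it,
# a verbatim repeat, and a second sentence.
sample = [
    "mi klama le zarci",
    "> mi klama le zarci",
    "mi klama le zarci",
    "do klama",
]

counts = word_frequencies(sample)
print(counts.most_common())
```

Counting every line naively would give "klama" four occurrences here; with quotations and repeats removed it is counted twice, once per distinct sentence.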

Another possible use is for conversation examples for further efforts at a Lojban textbook. Authentic conversation is far more interesting to learn from than are canned "dialogs" that don't actually represent what any *normal* person would say in a conversation. %^)

> Anyway I'm happy to provide them if someone wants them.

We definitely will want them - heck, *I* want them for the LLG archive, but I think an on-line archive is at least as important as my having files here on my computer constituting the "official archive".

We need to find someone willing to index them (and perhaps to weed out any logs that do not have any substantial Lojban text - discussions about the language are interesting but are not a corpus of language usage), and to put them on a site where they can be looked at (lojban.org or elsewhere). And if they get put on a web site, I'd like the group I've asked for to maintain a list of web sites with Lojban text to include it.


lojbab

--
lojbab                                             lojbab@lojban.org
Bob LeChevalier, President, The Logical Language Group, Inc.
2904 Beau Lane, Fairfax VA 22031-1303 USA                    703-385-0273
Artificial language Loglan/Lojban:                 http://www.lojban.org
