From lojbab@lojban.org Thu Jul 25 00:13:58 2002
Return-Path:
X-Sender: lojbab@lojban.org
X-Apparently-To: lojban@yahoogroups.com
Received: (EGP: mail-8_0_7_4); 25 Jul 2002 07:13:58 -0000
Received: (qmail 57143 invoked from network); 25 Jul 2002 07:13:58 -0000
Received: from unknown (66.218.66.218) by m11.grp.scd.yahoo.com with QMQP; 25 Jul 2002 07:13:58 -0000
Received: from unknown (HELO lakemtao04.cox.net) (68.1.17.241) by mta3.grp.scd.yahoo.com with SMTP; 25 Jul 2002 07:13:58 -0000
Received: from lojban.lojban.org ([68.100.206.153]) by lakemtao04.cox.net (InterMail vM.5.01.04.05 201-253-122-122-105-20011231) with ESMTP id <20020725071356.BNAY4949.lakemtao04.cox.net@lojban.lojban.org> for ; Thu, 25 Jul 2002 03:13:56 -0400
Message-Id: <5.1.0.14.0.20020725005938.03655ec0@pop.east.cox.net>
X-Sender: lojbab@pop.east.cox.net
X-Mailer: QUALCOMM Windows Eudora Version 5.1
Date: Thu, 25 Jul 2002 03:09:07 -0400
To: lojban@yahoogroups.com
Subject: Re: [lojban] to-do list (was Re: New Members, Board of Directors, other LogFest results)
In-Reply-To: <20020724201901.A1531@miranda.org>
References: <5.1.0.14.0.20020724195628.032f4c80@pop.east.cox.net>
 <5.1.0.14.0.20020723195058.030913c0@pop.east.cox.net>
 <5.1.0.14.0.20020723025544.032cba90@pop.east.cox.net>
 <4.3.2.7.2.20010730221611.00b10c00@pop.cais.com>
 <5.1.0.14.0.20020723025544.032cba90@pop.east.cox.net>
 <20020723103956.E28971@miranda.org>
 <5.1.0.14.0.20020723195058.030913c0@pop.east.cox.net>
 <5.1.0.14.0.20020724122649.032e7ec0@pop.east.cox.net>
 <018e01c23350$150a6c00$73a1ca3e@oemcomputer>
 <5.1.0.14.0.20020724195628.032f4c80@pop.east.cox.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
From: Bob LeChevalier
X-Yahoo-Group-Post: member; u=1120595
X-Yahoo-Profile: lojbab
X-Yahoo-Message-Num: 14722

At 08:19 PM 7/24/02 -0600, Jay F Kominek wrote:
>On Wed, Jul 24, 2002 at 09:28:12PM -0400, Bob LeChevalier wrote:
> > >How complicated can a dictionnary spec be?
> >
> > I dunno.
> > People seem to think we need something more than is being done,
> > but there have been no articulated specifics.
>
>http://groups.yahoo.com/group/lojban/message/14204
>http://groups.yahoo.com/group/lojban/message/6906
>
>http://nuzban.wiw.org/wiki/index.php?The%20Lojban%20Dictionary
>http://nuzban.wiw.org/wiki/index.php?Great%20Dictionary%20Problem
>http://nuzban.wiw.org/wiki/index.php?jbovlaste
>
>There have been more messages, but it'd take too long to find them, and
>I just don't care at this point, beyond preventing this slandering.

Sorry for offending. We have a couple of severe communication problems here, and I was quite stressed out when writing earlier (and getting interrupted by my kids every 2 minutes so I couldn't spend good time reviewing what I had written).

1. I had a VERY different idea what you meant above by a "dictionary spec". I come from the government software world where the magic word "spec" is a real document standing on its own, and not a discussion of possibilities with no agreement and resolution. I especially view the wiki as think tanking, and have no idea what parts of what is written on any page represent any consensus.

2. To the extent that there is a "spec", for example in the two list messages referenced, it looks like a spec for the data base tool that people use to enter data, etc. I don't see anything that tells me what the dictionary is supposed to look like, what it is supposed to contain, etc.

3. Most of the discussion on the wiki seems to be debating whether we should even be writing a dictionary at all, with all the politics of this cabal thing wherein I admit to not fully understanding what is wanted by which people. It appears that even Nick got frustrated with the debate and gave up.

To all this I observe that the closest that I see anyone coming to a spec on the wiki is actually an example, and that is Nick's (I think) FAhA definition:
http://nuzban.wiw.org/wiki/index.php?fa%27a%20%28Dictionary%29

That page actually looks more like a spec for the Elephant and how it will be used to create a cmavo dictionary, but his definition at the top of the page looks something like a dictionary definition (for someone well-versed in technical linguistic terminology like Nick) and it is something that actually tells me what ONE person thinks a good definition of FAhA looks like, possibly enriched by the examples further down. Whether the pro and con debate on that page has any place in a dictionary as opposed to a tool for writing one like the Elephant, I am not clear on, but printwise we can't print a dictionary with several-page-debates as the definition. It also seems quite unclear whether we would finish a meaningful Elephant for all the cmavo, much less any other words, in less than a decade.

Unfortunately that page comes across as Nick's proposal, which received relatively little comment as a proposal for a format, but instead turned into a debate on the words fa'a and mo'i and the role usage will play in determining the dictionary definition, and even THAT was not resolved. But I responded positively to what I saw as the proposal on the page:

>I think that what you've done is quite excellent as a start, though the
>dictionary entry would have to be grossly abbreviated, with references to
>the discussion/Elephant rather than the arguments themselves (the
>arguments do not belong in the dictionary). It might indeed be possible to
>come up with a way to cover the multiple approaches coherently, once they
>are fully elaborated. (The only improvement I could request is to find a
>way to more clearly link the examples to their appropriate definitional
>segments.)
--lojbab

It looks like at some point, Nick was planning to actually create what he thought the dictionary entry would look like, as the quote on "The Lojban Dictionary" indicates:

>fa'a? needs to be made, using the content from fa'a (Dictionary)

But that, if people had agreed upon whatever it said, would have been a useful "spec" to me of a good cmavo definition.

For gismu, I've gotten the impression from many people, pc most recently in response to my comment on bancu a couple of days ago, that people want some sort of expansion of the current gismu definitions clarifying the places. Again, I see no consensus on what exactly people want - the main thing that good dictionaries have that we do not is of course examples that fully explicate the meanings. I'd love something like that, but I quickly suspect that between the gismu and the cmavo alone, based on the above two "specs", we are talking about a 200 page dictionary before we add a single lujvo.

I have no idea what people want in a lujvo dictionary entry either, but presume that it will be similar to a gismu entry with the etymology explicated. The discussion of which lujvo to put in always turns into a debate about whether we should use new words or old words and whether my usage frequency data is worth anything.

Your tool discussion seems to suggest something Elephant-like with different place structure proposals, labeled by originator, being voted on by reviewers. While it sounds good in theory, it sounds like something that will either never be used (because adding new words takes precedence) or will be excessively used (because people are more interested in debating the conventions and meanings than in expanding dictionary coverage).

I have my own ideas of what I would like to put in a dictionary myself, but I admit that I've never written it up either, nor any examples; it always seemed premature when I was not in fact going to follow through and no one else seemed to be doing so.
I've always focused on what would be useful to the Lojban student for looking up the words the way we usually use a dictionary, whereas the Elephant version looks more like an adjunct to the Reference Grammar - a cmavo catalog after the fashion of the selma'o catalog - to which the average person would respond by throwing up his hands and retreating "too much information". Rather I think that what the world is looking for in a Lojban dictionary is something that shows how Lojban words map to the semantic space of other languages, something that requires LOTS of word entries that don't need to be particularly thoroughly defined (and indeed if not for the criticality of place structures when people actually go to use the dictionary, even THEY are of secondary importance). What people really want is "What is the Lojban word for X, or what possible words can I use for X, and how do I use them?"

At any rate, to the extent that Nick's fa'a example was a spec for the Elephant, I can only presume that Cowan has been working on that or something like it. It isn't done, and you know at least as well as I do what the status is.

If I come across as critical of "uncompleted work", then the Elephant, and your dictionary tools come to mind, and it is NOT that I really am "critical", but that in the absence of something there, I see nothing for me to either manage or use (assuming there is a role for me in either effort at all, and my understanding from Logfest last year was that I was NOT going to have any especial role in the Elephant), and I don't know what you think I should be doing that I am not.

>I've been on this for quite awhile, trying to get feedback as to what
>other people think is important in the system.

And I get the impression that no one has any real opinions, or will have any, until you actually create the system. It is hard to get the average person excited about a software spec, especially when they are interested in the spec for the book.

> > You have to understand that many people don't want to install ANY software
> > downloaded from the net.
>
>I have, at no point, ever suggested doing this as anything except a web
>based system. Anything beyond that would be entirely optional on the user's
>part.
>
>I have some vague idea how to design software appropriate for a given
>target audience. I've been doing it for quite awhile, now. I'm able to
>eat and putz about on the Internet because I'm paid to develop
>software. I'm told I'm fairly good at it. So could we maybe at least
>stop with the concerns that all the evil, horrible, mean UNIX-using
>Lojbanists are incapable of making something usable by plebes? Its
>tiresome, and insulting.

How is the above a rant against UNIX? It is a rant against the focus on the tools to make the book and not on making the book itself. It is a rant against working on-line or downloading and uploading software and files from on line, some variation of which is what I understand CVS to be, figure the Elephant will be and presume your tools will also be.

I don't particularly see myself as having any special role in creating either files or entries in either your program or the Elephant, so my concerns are 1) are you producing something that people will actually use to enter data, 2) will they actually enter most or all of the data you are designing the data base to hold, or will we end up with a data base filled with oodles of gaps such that any automated dictionary produced from the data base will look like garbage because no two entries will have the same level of completeness of definition, and there will be no consistency in style, and 3) What are the actual dictionary entries supposed to look like, and what will I or someone else acting as "dictionary editor" actually need to do with the result in order to produce the book.

> > But for me also it is simply learning curve.
> > Time spent learning news
> > software cuts into the time I can spend doing what I do now, and I don't
> > have enough time for THAT.
>
>Most people can use their web browser and fill out forms.

That means little to me, since I don't use web browsers to fill out forms. To the extent I do, they seem quite user unfriendly, either trying to automatically guess what I want to enter and filling it in based on matches with what I've entered before, or making me type the whole thing out, and having no particular editing tools other than that associated with point/click/copy/cut/paste. For someone used to editing with a full screen text editor with powerful commands and macro capability, this sounds like torture, but if you think people will fill in such forms, then let's see the forms and let's see them do it.

I guess I'm saying that the flat files are out there somewhere on the web site, and no one is doing anything. If you've got an idea for a tool whereby people can and will do something with it that gets us closer to a dictionary, then more power to you and go ahead - I'm not interested in debating the pros and cons, but I am interested in the results and the demonstrated usage of the tool to advance the work. Please forgive my skepticism that what you propose will get any more words defined than have already been defined - I would LOVE to be proven wrong.

>(And, since you'll likely not read the URLs I've posted above,

I read them, and I read them the last time we debated this wherein I think you posted the same or similar URLs.

>I'll simply
>readdress the concern you had last time, wherein you said that you normally
>did a few hundred words at a time, and wouldn't want to deal with a form
>each time. (How many times have you done a block of a few hundred words, by
>the way?

Recently, not at all of course %^).
When I last worked on the word-frequency lujvo file I indeed worked on the entire list, jumping around and annotating and correcting word forms for around 500 words in a sitting. When I last worked on the KWIC files that are used to generate the text of the form ENGDICT.GIS, I again did a lot of jumping around, but modified around 200 entries and deleted even more. Each of these sessions was several hours a day for a couple of days - I don't start dictionary work unless I'm going to spend a lot of hours on it.

In developing the gismu list itself, the final editing pass that I did in 1994 consisted again in jumping around and comparing ALL 1300 odd words at once, while stepping through them alphabetically. I spent probably over 100 hours over the course of a month or so on that final pass, after a pre final pass of a comparable number of hours spread over a longer time (around 3 months if I recall).

When I say that I can't think of things that people can do in less than 40 hours, it is because I seldom tackle tasks that are anywhere near that small myself. I don't break work down into chunks that small, and I think of the big picture, and not the little pieces. This is perhaps a failing of mine if I am to be seen as a "manager" rather than as a "leader" and especially a "leader by example" which is what I traditionally thought myself - though I admit my example has been pretty poor of late.

>There are only 1350 some odd gismu. And what are you doing with
>them?) I can easily set something up so that people who want to make a
>bulk contribution can write themselves up a specially formatted text file
>and upload that to the web site.

Jay, I've been on the net for years, and yet I have NO IDEA how to upload a file to the Web except using ftp commands after Telneting into a shell account. That is the ONLY way I've ever done it or seen it done, though I'm sure you know many others in use.
I presume that CVS has something FTP-like to do the same function, but I've never seen how it works. And I've never been prone to figuring things out by fiddling with them, so I'm unlikely to even try until shown how personally some LogFest. If I'm that way, and won't know what you are talking about until I actually see it or read a good users manual of the sort I doubt would ever be written, what about the "AOL-mentality" type you have implicitly criticized that are the majority of those connected to the net?

But whether you have an answer for that question or not, my point is that I can only make comments in ignorance until I can actually see what you have in mind, with data in it, and being shown how that data will produce the book. I've been a software developer oh so long ago, but on this effort I am a (l)user who is ONLY focused on how >I< will use the tool to get what I want, and hoping that the wise software developer will be giving me something I can use, and also giving everyone else something they can use.

(I should add, BTW, that now that I am connected via cable modem, SOME of my bias against on-line lookups is dwindling. I still hate to wait for the lag times, and I hate the clumsy net search functions that don't so far as I know handle even simple regular expressions. I still download and use offline copies of anything I use more than once or twice if I have the option of doing so. But at least I have some vision of what people who live always connected to the net might be able to do if they weren't as stuck on their old ways of doing things as I tend to be; I still prefer my ways, and I think that those who live always-connected have a different perspective and mindset than those who use dialup and 56K or slower modems. I now see for example the possible virtue of hypertext, which was always too slow and cumbersome at low data rates, and rarely usable off line.
But I am also still one of those old fashioned types who sometimes actually reads a dictionary as a book, skimming through a couple pages of entries for interesting words.)

> Why, hypothetically, it could even read stupid fixed column width ASCII
> files.)

%^)

> > I have the impression that several people have done that already. Since I
> > can't imagine using the computer to look up individual words, I've never
> > seen much use in it.
>
>Looking a word up as quickly as you can type it is in no way appealing to
>you?

I avoid typing when I can. I skim files. When I search a textfile with my full screen editor, I prefer to have it select and highlight all occurrences of the line containing the text with context at once, and then skim those. I use what you describe on the web and CAN use the bloody "find"/"find next" popup that in Windows passes for a search function, but it is painfully inefficient for me to step through the find text one item at a time to find the spot on a page that I am looking for (if it is there - else I have to load another page and another until I find it.)

--
lojbab lojbab@lojban.org
Bob LeChevalier, President, The Logical Language Group, Inc.
2904 Beau Lane, Fairfax VA 22031-1303 USA 703-385-0273
Artificial language Loglan/Lojban: http://www.lojban.org