Return-Path:
Received: from SEGATE.SUNET.SE by xiron.pc.helsinki.fi with smtp
        (Linux Smail3.1.28.1 #1) id m0tHcWl-0000ZUC; Mon, 20 Nov 95 22:09 EET
Message-Id:
Received: from listmail.sunet.se by SEGATE.SUNET.SE (LSMTP for OpenVMS v1.0a)
        with SMTP id EB73D888 ; Mon, 20 Nov 1995 21:09:18 +0100
Date: Mon, 20 Nov 1995 15:06:54 -0500
Reply-To: Logical Language Group
Sender: Lojban list
From: Logical Language Group
Subject: lujvo and rafsi, response to Mark
X-To: lojban@cuvmb.cc.columbia.edu
To: Veijo Vilva
Content-Length: 12627
Lines: 259

Markl wrote:
>I also intended for _some_ examples to suggest a secondary point, which
>I was content to make only implicitly. I'll make it explicitly now.
>Seems to me that rafsi serve, or can serve, two purposes. Rafsi can be
>used to convert a commonly used or otherwise valuable tanru into a form
>which enjoys a standardized definition & a standardized place structure;
>that is, into a lujvo. But rafsi can also be used to convert a commonly
>used or otherwise valuable tanru into a form which is short, but not
>standardized; that is, into a nonce-lujvo. Accordingly, some of my
>examples were of ideas that I wanted to express in Lojban using fewer
>than the four syllables minimally required for tanru.

I sympathize with this point, though I have found more recently (perhaps
because I do less Lojban writing/speaking) that I am content to make
nonce-lujvo solely by the unreduced form method, except for a few key
rafsi in key positions that have started to become "morphological
suffixes" like final position "-mau". It is only when I find that I am
using a concept often that I bother to look up the rafsi and/or verify
against tosmabru form, etc. to get an optimal short-form. And this habit
I have developed seems perfectly in accordance with Zipf, that frequency
of use ties to shortness of word-form. Nonce-lujvo are practically by
definition the most infrequent words in a language.

>> > hour-long (cacra, hour, no short rafsi),
>>
>> Croatian uses the adjective "jednosatni" (4 syllables), which
>> translates directly to lojban as {pavcacra} (3), or, using the x2
>> default, just {cacra} (2). I am satisfied.
>
>The x2 default? I thought all gismu defaulted to x1. I've missed
>something here, which may mean that my "hour-long" example is already
>covered by an economical Lojban expression.

By this, Goran meant that the x1 of cacra is something that is x2 hours
long. The default of x2 in normal contexts is singular (1), so that
lo cacra defaults to mean something that is one hour long.

>> I have never yet been in a situation where I would have to explicate
>> the material of a can, and if I ever am I would gladly use tanru.
>
>Yes, I'm sure tanru would suffice to describe the material of a can.
>But how would you succinctly refer to the idiomatic "tin cans," that is,
>to all cans which are not aluminum beverage cans?

Give me a sentence for context and I would know better what I would do
(after all, there are other kinds of cans besides beverage cans and
"tin cans", such as "paint cans" and "garbage cans"). So my best guess
would be cidjylante as a starting point, assuming that I know what you
intend to cover by the term. But then "tin" is specifically mentioned as
having a metaphorical meaning in tanru - though many if not most
Lojbanists reject those kinds of metaphorical usages for
animals/plants/metals and many body parts.

>So the answer to your question is that I will want economical
>expressions for everything that has a great deal to do with my
>livelihood, & that others will want economical expressions for
>everything intimately involved with theirs. What resources does Lojban
>have to offer for the construction of such narrowly differentiated
>economical expressions?

Answer - it doesn't.
Expressions used frequently by only a small portion of the community
constitute jargon, and there is a tradeoff between unambiguity and
specificity that forbids reserving a lot of short-word space for
undetermined words of low overall frequency of use in the language
across the whole population. Because of very low ambiguity, Lojban will
by necessity tend to have longer phoneme strings in order to cover the
same semantic space. Either that, or we have to increase density (and
homonymy isn't acceptable), which reduces communications redundancy when
someone misspeaks or mishears. People who have tried to make preliminary
estimates feel that Lojban's redundancy is already below that of most
natural languages (easily seen if you realize that nearly all word forms
of the shapes CVVCCV CCVCVV CCVCCV CVCCCV CVCCVV (among 6 letter lujvo)
have a potential meaning, so that one-phoneme errors cause significant
difficulties in error correction).

>Incidentally, English has monosyllabic words for the tidal rise in water
>level (flow), the tidal fall in water level (ebb), the highest high tide
>(spring) & the lowest high tide (neap or neaps). Two of these terms
>(spring & flow) are ambiguous, in that they also have other meanings.
>
>One could argue that "flow" has a single meaning, of which "tidal flow" &
>"river flow" are merely types, & then argue that the word "current"
>represents river flow so well that, in a coastal context, the "tidal"
>portion of "tidal flow" can simply be elided.

Except that in a coastal context, I would tend to take that phrase (even
*with* the "tidal" remaining) as referring to "rip tides" and the like
that make shore areas dangerous for swimmers. Most people do not think
of tides in terms of currents, though perhaps the sailors who made the
original metaphor "flow" did so.

This shows one of the problems in too hastily making a very short lujvo
for a "local" context.
If that lujvo can have only one meaning, the assignment of the lujvo to
the narrow use concept means that it cannot be used for any more broadly
understandable tanru semantics based on the words.

One result of this has plagued the TLI Loglan community. One of their
"games" has been to take some narrow field that a person is interested
in, and then devise a whole bunch of lujvo that fit that narrow field.
This started shortly after Brown printed his first books, with articles
in the first issues of The Loglanist dominated by such things as lists
of color words (additive/subtractive and whatever), computer terminology
up the wazoo, etc. More recently, an issue of Lognet had a whole bunch
of "short" words related to stereo equipment (CD, cassette, rewind, fast
forward, etc.). The problem was that almost every tanru proposed would
suggest whole bunches of meanings unrelated to stereo equipment in any
context other than a discussion of such things, and it was obvious that
the person writing the article hadn't for a moment considered non-stereo
interpretations of the tanru.

Some other examples - I give only the keywords and syllable counts
(assuming optional disyllables are spoken as 2 syllables) for the
underlying tanru; in some cases Lojban has gismu that TLI Loglan does
not, words that would solve the problem more easily. You'll notice that
the shortest syllable counts are generally the worst tanru:

2 man-do (the classic example: to "man" a ship - as if women never do
  this, or as if you can't think of 50 million more likely
  interpretations for that tanru)
3 scratch-record (phonograph record)
3 scratch-record-machine (phonograph)
4 ribbon-record (tape recording)
4 ribbon-record-machine (tape recorder)
3 record-ribbon (recording tape)
3 sound-ribbon (audio tape)
5 record-tape-container (cassette)
5 sound-tape-container (audio cassette)
3 light-record (CD)
4 video-ribbon (videotape)
6 video-ribbon-container (videocassette)
4 video-record (video recording)
4 video-record-machine (VCR)
2 record-use (to play a recording on a machine)
3 fast-record-use (fast forward)
4 see-fast-record-use (scans on fast forward)
4 reverse-record-use (rewind)
4 see-reverse-record-use (scans on rewind)
3 record-pause-make (pauses a recording)
3 record-offer (ejects - of a machine)
4 record-offer-make (agentive ejecting)
5 blood-lose-sick-person (hemophiliac)
3 possible-fiction (a superset of SF and fantasy, as if other kinds of
  fiction are inherently "impossible", while fiction involving magic IS
  possible)
4 science-possible-fiction (SF)
4 magic-possible-fiction (fantasy)
5 science-magic-possible-fiction (science fantasy)
3 people-story (folk tale)
3 magic-people-story (fairy tale)
3 religion-people-story (myth)
3 history-people-story (legend)
3 crime-discover-attempt (detective, one who identifies perpetrators of
  crime)
3 sad-end-story (tragedy)
3 happy-end-story (comedy, in the old sense)
3 funny-story (comedy, current sense)
4 four-direction-ball (hypersphere)

some words for sign language - most not too bad, but think creatively
about other contexts and they become a little less obvious:

3 hand-sign (gesture)
4 hand-sign-form (hand shape, I presume a jargon term for signers)
4 hand-sign-language (sign language)
3 hand-letter (letter from manual alphabet)
4 hand-letter-write (finger spell)
4 direction-do-word (directional verb including pronominal references
  in its motion)
4 class-hand-sign (classifier signifying a member of an object class)
3 sex-attack (rape)
2 process-trouble (harass/pester)
3 beautiful-write (do calligraphy)
3 do-exist (functionally exists with effect x2; x1 is virtual)

>English has an oceangoing history. Other languages reflect greater
>intimacy with arid desert. I wanted to show that my rafsi critique was
>inclusive of both environmental extremes, & of the cultures concerned
>with them. So I gave both "high tide" & "salt pan" as multicultural
>examples of lexical needs unmet, or met only laboriously, by Lojban.

And these (especially "salt pan") are excellent examples of why we need
to be very careful in making short lujvo. It is likely that someone far
from the ocean won't understand a metaphor based on tides, and someone
far from a desert won't understand a metaphor related to the desert.
But Lojban, in order to be culturally neutral, cannot favor either
community by letting them make short lujvo that the other group will not
understand, or even worse, might use with a totally unrelated meaning.

>la dn cusku di'e
>> A succinct tanru is as useful as a compacted lujvo, in fact I would say
>> the tanru would be clearer in many circumstances.
>
>If so, that's only because so many rafsi are dissimilar to their gismu
>"primitives."

No, it is because rafsi use inherently loses some redundancy in phonemic
recognition. If syllable counts were the critical measurement factor,
then no one would use an unreduced lujvo over the equivalent tanru
(you've lost one phoneme), and no one would use any disyllable rafsi
over the expanded form (you've lost 2 phonemes). But to my knowledge no
one has chosen fukpyvla over fu'ivla in speech for clarity (in writing
it takes a whole extra character, so it may be more understandable %^);
nor indeed has anyone ever had trouble learning "-vla", even though it
is the same type of "backwards" rafsi as "kerfa".

>> I don't think that having compounds which are, say, 2 syllables instead
>> of 4 is that significant to a language.
>
>Then why do so many two-syllable compound words exist in various tongues?

Because a lot of tongues have one-syllable primitive roots.

However, I will note that Lojban DOES have a lot of two-syllable
compound words. Lessee, there are 66 CVV monosyllable rafsi in use, 210
CCV rafsi in use, and 916 CVC rafsi, which statistically will not need
hyphenation in 179/289 (~60%) of all CVCCVV lujvo and a similar
percentage of CVCCCV lujvo. So there are 44100 CCVCCV lujvo, 13860
CVVCCV bisyllable lujvo, around 36640 CVCCVV bisyllable lujvo, around
115000 CVCCCV bisyllable lujvo, and arguably 4356 CVVrCVV lujvo, which
can usually be pronounced as bisyllables. That is well over 200000
bisyllable lujvo, far in excess of the average person's working
vocabulary in any language, and likely larger than the entire lexicon of
most languages.

The fact that most of these have not been used, and indeed perhaps most
are unlikely to be useful by any stretch of the imagination, does NOT
detract from the fact that they "exist" and therefore have meaning. And
I dare predict that the fraction of them that will have useful meaning
covers a higher percentage of the N most common concepts than the
percentage of 2 syllable compounds doing so in most other languages, and
the total number with useful meaning will probably be far higher as
well. (Chinese may beat us out because such a high percentage of their
wordstock are 2-syllable compounds).

If you go to three syllables, I think the number of Lojban lujvo
probably FAR exceeds even the total wordstock of English, though I
haven't calculated it out. (But there are 914,760 of each of the
following forms alone: CCVCVVCVV CVVrCCVCVV CVVrCVVCCV; 2.9 million each
of CCVCCVCVV CVVCCVCCV and CCVCVVCCV; 9.3 million CCVCCVCCV words; and
probably over 5 million CVCCVCCCV that need no hyphen.) That's over 25
million words to start with.

Any volunteers to start on THAT dictionary???

lojbab
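
For anyone who wants to check the arithmetic, here is a minimal Python
sketch of the counts above. It is only an estimate under stated
assumptions: it takes the rafsi inventory quoted in the message (66 CVV,
210 CCV, 916 CVC), assumes a CVC rafsi joins without a hyphen for exactly
179 of every 289 consonant pairs (rather than the rounded ~60%), and
ignores every other morphological restriction. The function and constant
names are made up for illustration.

```python
# Rafsi inventory as quoted in the message (not independently verified).
CVV_RAFSI = 66
CCV_RAFSI = 210
CVC_RAFSI = 916

# Assumption: a CVC rafsi needs no hyphen for 179 of 289 consonant pairs.
OK_PAIRS, ALL_PAIRS = 179, 289


def two_syllable_counts():
    """Estimate two-rafsi lujvo counts, keyed by rafsi pattern."""
    return {
        "CVV+r+CVV": CVV_RAFSI * CVV_RAFSI,
        "CVV+CCV": CVV_RAFSI * CCV_RAFSI,
        "CCV+CCV": CCV_RAFSI * CCV_RAFSI,
        # Integer arithmetic: scale by the hyphen-free fraction, round down.
        "CVC+CVV": CVC_RAFSI * CVV_RAFSI * OK_PAIRS // ALL_PAIRS,
        "CVC+CCV": CVC_RAFSI * CCV_RAFSI * OK_PAIRS // ALL_PAIRS,
    }


def three_syllable_counts():
    """Count three-rafsi lujvo built only from CVV and CCV rafsi.

    Each value is the count for ONE ordering of the rafsi types.
    """
    return {
        "one CCV, two CVV": CCV_RAFSI * CVV_RAFSI ** 2,
        "two CCV, one CVV": CCV_RAFSI ** 2 * CVV_RAFSI,
        "three CCV": CCV_RAFSI ** 3,
    }


if __name__ == "__main__":
    two = two_syllable_counts()
    for pattern, count in two.items():
        print(f"{pattern:10s} {count:>10,d}")
    print(f"{'total':10s} {sum(two.values()):>10,d}")
    for pattern, count in three_syllable_counts().items():
        print(f"{pattern:18s} {count:>12,d} per ordering")
```

With the exact 179/289 fraction the two CVC-initial figures come out a
little higher than the rounded ones in the message, and the two-syllable
total is 218,904, still "well over 200000".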