From nobody@digitalkingdom.org Sun Aug 14 14:41:04 2005
Date: Sun, 14 Aug 2005 14:40:49 -0700 (PDT)
From: John E Clifford <clifford-j@sbcglobal.net>
Subject: [lojban] Re: Loglish: A Modest Proposal
To: lojban-list@lojban.org

--- Ben Goertzel wrote:

> Steve,
>
> My idea with the "qui" connector in Loglish is not that different from
> your idea of using WordNet.
>
> The idea is that, rather than memorizing separate words for each WordNet
> sense, one uses context-specifiers to indicate which sense is intended.
> So, for instance, you could say
>
> Ben rock qui sway baby
>
> Ben listen rock qui music
>
> This avoids the need to memorize separate words for the different senses
> of rock ("rock" as in "rock the baby" and "rock" as in "rock music").
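Ben's "qui" mechanism (together with the "quiha" terminator he proposes later in the thread) can be sketched as a toy resolver. The sense table and the tokenizing convention below are my own illustrative assumptions; Loglish defines no such inventory, and this is not part of any actual specification:

```python
# Toy resolver for the "word qui specifier [quiha]" pattern. The SENSES
# table is hypothetical -- just enough entries to cover Ben's examples.
SENSES = {
    ("rock", "sway"): "rock/v: move gently back and forth",
    ("rock", "music"): "rock/n: a genre of popular music",
}

def resolve(tokens):
    """Replace each 'word qui specifier [quiha]' run with its tagged sense.

    Assumed convention: the specifier is a single token unless a later
    'quiha' closes a multi-word specifier phrase.
    """
    out, i = [], 0
    while i < len(tokens):
        if i + 2 <= len(tokens) - 1 and tokens[i + 1] == "qui":
            j = i + 2
            if "quiha" in tokens[j:]:
                # multi-word specifier closed by the quiha terminator
                end = tokens.index("quiha", j)
                spec, nxt = tokens[j:end], end + 1
            else:
                # default: one-word specifier, no terminator needed
                spec, nxt = [tokens[j]], j + 1
            out.append(SENSES.get((tokens[i], " ".join(spec)), tokens[i]))
            i = nxt
        else:
            out.append(tokens[i])
            i += 1
    return out

print(resolve("Ben rock qui sway baby".split()))
print(resolve("Ben listen rock qui music quiha".split()))
```

Under this (assumed) convention, "rock qui sway" and "rock qui music quiha" each collapse to a single sense-tagged word, which is the disambiguation Ben describes.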
The example, though I assume it is not meant seriously, illustrates the problem that has been found in these regimented English proposals: the whole range of ambiguity of the natural language creeps in -- even if we begin with narrower specifications of meaning. And then we have to modify each word to get away from that, but we forget, or even the usual modifications do not always work. And, of course, "words" and thus sentences get longer and longer. Lojban at least begins with fairly unambiguous words with no inherent tendency to expand those base concepts (what we carry over from our native languages is another -- but, given the community, usually less significant -- problem). Loglish also loses what is practically Lojban's most significant feature for any computer use: the unique decomposition and parsing. The unique decomposition goes with its absence in English. Were that problem solved, it *might* be possible to restore the unique parsing -- but that is not obviously the case, and the decomposition problem looks at least enormously difficult.

I always like the Chinese model for things, and so I kinda like "qui" and "quu." Chinese has a very small list of syllables (even taking tone into account) and is conceptually a monosyllabic language. To avoid ambiguity in writing (and more so now with the various simplifications), most characters consist of a phonetic marker for the syllable (being updated, but historically with Tang or earlier pronunciation) and a radical that classifies the referent in some way -- mostly adequate for distinguishing several dozen meanings for the same syllable (several dozen words, as it were). In the spoken language the problem is solved by becoming polysyllabic covertly. An ambiguous syllable -- and most are -- is accompanied by another (or several other) syllables in a fixed phrase that functions as a unit and does not bear analysis, generally speaking.
Of course, much the same pattern of words is also used to make new compound meanings, which may equally become frozen. I suppose that "qui" -- and in another way "quu" -- would come to function like this in Loglish, both disambiguating simple expressions and constructing new complexes. It seems a viable -- though remarkably messy and uninteresting -- idea.

> I didn't say so in the Loglish language specification, but there is
> probably a need for a qui terminator just in case the context-specifier
> is more than one word, so one could say (using "quiha" for the
> terminator)
>
> Ben listen rock qui music quiha
>
> (unnecessary in this case but useful in rare cases where more than one
> word is used in the position "music" is used here)
>
> One could argue that qui is unnecessary because tanru can handle
> disambiguation, but I think it's better to have a specific mechanism for
> sense-specification as opposed to compound-concept-formation.
>
> -- Ben
>
> > -----Original Message-----
> > From: lojban-list-bounce@lojban.org
> > [mailto:lojban-list-bounce@lojban.org] On Behalf Of Steven Arnold
> > Sent: Saturday, August 13, 2005 8:12 PM
> > To: lojban-list@lojban.org
> > Subject: [lojban] Re: Loglish: A Modest Proposal
> >
> > On Aug 13, 2005, at 4:00 PM, Arnt Richard Johansen wrote:
> >
> > > To quote your web page:
> > >
> > > # [...] avoid what's really annoying about Lojban (the lack of a full
> > > # vocabulary).
> > >
> > > I suppose that lack of vocabulary will always be a problem in
> > > knowledge representation systems, until someone develops AGI or a
> > > way to extract a suitable dictionary from a text corpus.
> >
> > Wordnet is a system that attempts to take a set of "core meanings"
> > and associate those meanings with words from different languages. It
> > is accessible over the Internet.
> > I invented a language by writing a program in Python that fetched the
> > list of core meanings and assigned words to them from a list. It was
> > a very fast route to a 26,000+ word dictionary. Granted, the
> > dictionary needed a little data grooming -- there were a number of
> > words that, to me, didn't deserve a separate term. There were also
> > words that I wanted to make sure got shorter words, since I expected
> > them to be used more often. But I think the data grooming was by far
> > the minor portion of the task, and by using Wordnet, I saved probably
> > hundreds of hours of word development compared to doing it all by
> > hand.
> >
> > That, combined with using Markov chains for word generation, created
> > an excellent base language in a very short time. I'd be happy to
> > share the source code of these tools with anyone who is interested;
> > email me privately for that.
> >
> > steve

To unsubscribe from this list, send mail to lojban-list-request@lojban.org
with the subject unsubscribe, or go to http://www.lojban.org/lsg2/, or if
you're really stuck, send mail to secretary@lojban.org for help.
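Steve's pipeline (fetch a list of core meanings, generate candidate word forms with a Markov chain, and assign one word per meaning) can be sketched roughly as below. This is my own reconstruction, not his code: the stub meaning list stands in for the WordNet fetch, the seed words are arbitrary, and all function names are hypothetical:

```python
# Sketch of the described approach: (1) a list of "core meanings" (a stub
# here, standing in for a WordNet download), (2) candidate words generated
# by a character-level Markov chain trained on seed words, (3) one distinct
# generated word assigned per meaning.
import random

def train_markov(seed_words, order=2):
    """Build a character-level transition table from seed words.

    '^' pads the start of each word; '$' marks the end.
    """
    table = {}
    for word in seed_words:
        padded = "^" * order + word + "$"
        for i in range(len(padded) - order):
            state = padded[i:i + order]
            table.setdefault(state, []).append(padded[i + order])
    return table

def generate_word(table, order=2, max_len=10, rng=random):
    """Walk the transition table until the end marker or max_len."""
    state = "^" * order
    out = []
    while len(out) < max_len:
        ch = rng.choice(table[state])
        if ch == "$":
            break
        out.append(ch)
        state = state[1:] + ch
    return "".join(out)

def build_dictionary(meanings, table, order=2):
    """Assign a distinct generated word to each core meaning."""
    assigned, used = {}, set()
    for meaning in meanings:
        while True:
            w = generate_word(table, order)
            if w and w not in used:
                used.add(w)
                assigned[meaning] = w
                break
    return assigned

# Stub standing in for the WordNet core-meaning fetch:
meanings = ["rock.stone", "rock.music", "sway", "listen", "baby"]
seeds = ["karni", "bliku", "zgike", "tirna", "panzi", "selci", "mlatu"]
dictionary = build_dictionary(meanings, train_markov(seeds))
```

Because every generated word follows character transitions observed in the seed words, the output tends to be pronounceable in the same style as the seeds -- which is presumably why a Markov chain was chosen over purely random letter strings. The real version would substitute a WordNet synset dump for the stub list and add the "data grooming" pass Steve describes.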