
[lojban] Re: Loglish: A Modest Proposal



Steven Arnold wrote:
WordNet is a system that attempts to take a set of "core meanings" and associate those meanings with words from different languages. It is accessible over the Internet. I invented a language by writing a program in Python that fetched the list of core meanings and assigned words to them from a list. It was a very fast route to a 26,000+ word dictionary. Granted, the dictionary needed a little data grooming -- there were a number of words that, to me, didn't deserve a separate term. There were also words that I wanted to make sure got shorter words, since I expected them to be used more often. But I think the data grooming was by far the minor portion of the task, and by using WordNet, I saved probably hundreds of hours of word development compared to doing it all by hand.
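
For readers who want to picture the assignment step, here is a minimal sketch and not the actual program: the function name, the frequency numbers, and the sample words are all made up for illustration, and fetching the meanings themselves is left out. It only shows the idea of giving the meanings expected to be used most often the shortest words from a prepared list.

```python
def assign_words(meanings, candidate_words):
    """Pair each meaning with a word so that frequent meanings get shorter words.

    meanings: list of (meaning_id, expected_frequency) tuples (hypothetical data)
    candidate_words: list of unique words, at least as many as there are meanings
    """
    if len(candidate_words) < len(meanings):
        raise ValueError("not enough candidate words for the meanings given")
    # Most frequently used meanings come first.
    by_frequency = sorted(meanings, key=lambda m: m[1], reverse=True)
    # Shortest words come first; ties are broken alphabetically so the
    # result is deterministic.
    by_length = sorted(candidate_words, key=lambda w: (len(w), w))
    return {meaning_id: word
            for (meaning_id, _), word in zip(by_frequency, by_length)}


# Made-up sample data: meaning labels with guessed usage frequencies.
meanings = [("water", 900), ("drumroll", 2), ("go", 1500)]
words = ["kelu", "ta", "mirosan"]
lexicon = assign_words(meanings, words)
```

Hand grooming of the kind described above (merging meanings that don't deserve separate terms) would happen on the `meanings` list before this step.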

That, combined with using Markov chains for word generation, created an excellent base language in a very short time. I'd be happy to share the source code of these tools with anyone who is interested; email me privately for that.
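
As an illustration of what Markov-chain word generation can look like, here is a rough sketch under stated assumptions; it is not the source code offered above. It assumes a character-level chain of order 2 trained on a small seed list, and the seed words, function names, and parameters are all hypothetical.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # padding symbols; assumed not to occur in seed words


def train(words, order=2):
    """Record which character follows each `order`-character context."""
    model = defaultdict(list)
    for word in words:
        padded = START * order + word + END
        for i in range(len(padded) - order):
            model[padded[i:i + order]].append(padded[i + order])
    return model


def generate(model, order=2, max_len=12, rng=random):
    """Walk the chain from the start context until END or max_len characters."""
    state = START * order
    letters = []
    while len(letters) < max_len:
        nxt = rng.choice(model[state])
        if nxt == END:
            break
        letters.append(nxt)
        state = state[1:] + nxt
    return "".join(letters)


# Made-up seed words that set the "sound" of the generated vocabulary.
seed_words = ["kalmo", "selan", "mirta", "tolka", "nesla"]
model = train(seed_words)
rng = random.Random(0)
new_words = [generate(model, rng=rng) for _ in range(5)]
```

A real tool would also need to discard duplicates and any generated word that breaks the language's phonotactic rules.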

That is at odds with the way we add new words to Lojban. We make compounds called "lujvo" from gismu, or we borrow words from other languages, usually one of the Big Six or biological Latin, though we have a handful of Tupi words (mandioka, markuja) and one from an Algonquian language (ckankua). The only exceptions that come to mind are {tsaparatsa'i}, which is my attempt to imitate the rhythm called "ratamacue", and {vonpaso}, which has fu'ivla form but is made from Lojban words. Most of the original gismu were made by putting together bits and pieces of words from the Big Six using a weighting algorithm. AFAIK no Lojban word was made by a Markov chain.
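
To give a feel for what "a weighting algorithm" means here, below is a much simplified sketch and not the actual gismu-making procedure. It assumes a candidate word is scored by how many letters it shares, in order, with each source-language word, scaled by a per-language weight. The weights, the language labels, the source words, and the function names are all illustrative assumptions.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of strings a and b."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def score(candidate, source_words, weights):
    """Weighted sum of the fraction of each source word matched by candidate."""
    total = 0.0
    for lang, word in source_words.items():
        total += weights[lang] * lcs_length(candidate, word) / len(word)
    return total


def best_candidate(candidates, source_words, weights):
    """Return the candidate with the highest weighted score."""
    return max(candidates, key=lambda c: score(c, source_words, weights))


# Illustrative values only: two made-up "languages" with made-up weights.
weights = {"lang_a": 0.6, "lang_b": 0.4}
source_words = {"lang_a": "tala", "lang_b": "miro"}
candidates = ["talmi", "mirto"]
winner = best_candidate(candidates, source_words, weights)
```

With these made-up inputs the candidate that echoes the more heavily weighted language wins, which is the point of weighting; nothing in the process is random, unlike a Markov chain.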

phma

