
[lojban] Re: Loglish: A Modest Proposal




On Aug 13, 2005, at 4:00 PM, Arnt Richard Johansen wrote:

To quote your web page:

# [...] avoid what's really annoying about Lojban (the lack of a full
# vocabulary).

I suppose that lack of vocabulary will always be a problem in knowledge representation systems, until someone develops AGI or a way to extract a suitable dictionary from a text corpus.

WordNet is a system that attempts to take a set of "core meanings" and associate those meanings with words from different languages. It is accessible over the Internet. I invented a language by writing a Python program that fetched the list of core meanings and assigned each one a word from a list. It was a very fast route to a dictionary of more than 26,000 words. Granted, the dictionary needed a little data grooming -- there were a number of meanings that, to me, didn't deserve a separate term. There were also meanings that I wanted to give shorter words, since I expected them to be used more often. But I think the data grooming was by far the smaller part of the task, and by using WordNet I probably saved hundreds of hours of word development compared to doing it all by hand.
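The assignment step described above might look roughly like the following. This is only a sketch, not the actual tool: the function name `assign_words`, the `frequent` parameter, the meaning identifiers and the candidate words are all made up for illustration, and the fetching of core meanings from WordNet is left out entirely.

```python
def assign_words(meanings, candidates, frequent=()):
    """Pair each meaning with one candidate word.

    Meanings listed in `frequent` are served first, so they receive the
    shortest available words. All names here are hypothetical.
    """
    meanings = list(dict.fromkeys(meanings))  # drop duplicates, keep order
    # Shortest words first; ties broken alphabetically so runs are repeatable.
    words = sorted(set(candidates), key=lambda w: (len(w), w))
    if len(words) < len(meanings):
        raise ValueError("not enough distinct candidate words")
    frequent_set = set(frequent)
    # Stable sort: frequent meanings move to the front, the rest keep their order.
    ordered = sorted(meanings, key=lambda m: m not in frequent_set)
    return dict(zip(ordered, words))


# Placeholder meaning identifiers and invented words, for illustration only.
lexicon = assign_words(
    meanings=["dog.n.01", "go.v.01", "aardvark.n.01"],
    candidates=["tavoli", "ka", "mirel"],
    frequent=["go.v.01"],
)
print(lexicon)
```

Hand grooming would then amount to deleting unwanted meanings from the input list and adding entries to `frequent` before rerunning.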

That, combined with using Markov chains for word generation, created an excellent base language in a very short time. I'd be happy to share the source code of these tools with anyone who is interested; email me privately for that.
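For readers unfamiliar with the technique, a letter-level Markov chain word generator can be sketched in a few lines. This is a generic illustration, not the source code offered above: the function names, the `order` and `max_len` parameters, and the seed words are assumptions chosen for the example.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # padding symbols; assumed not to occur in the samples


def train(samples, order=2):
    """Map each `order`-letter context to the letters seen following it."""
    table = defaultdict(list)
    for word in samples:
        padded = START * order + word + END
        for i in range(len(padded) - order):
            table[padded[i:i + order]].append(padded[i + order])
    return table


def generate(table, order=2, max_len=10, rng=random):
    """Walk the chain from the start context until END or max_len letters."""
    context = START * order
    letters = []
    while len(letters) < max_len:
        nxt = rng.choice(table[context])  # repeated entries act as weights
        if nxt == END:
            break
        letters.append(nxt)
        context = (context + nxt)[-order:]
    return "".join(letters)


# Arbitrary seed words for illustration; any word list would do.
seed_words = ["klama", "prami", "tavla", "cmene", "valsi", "bangu"]
table = train(seed_words, order=2)
rng = random.Random(0)  # fixed seed so the run is repeatable
new_words = [generate(table, order=2, max_len=8, rng=rng) for _ in range(5)]
print(new_words)
```

A larger `order` makes the output resemble the seed words more closely, while a smaller one gives more variety at the cost of less natural letter sequences. The generator can also reproduce a seed word verbatim, so real use would filter those out.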

steve


