On Aug 13, 2005, at 4:00 PM, Arnt Richard Johansen wrote:
To quote your web page: "[...] avoid what's really annoying about Lojban (the lack of a full vocabulary)." I suppose that lack of vocabulary will always be a problem in knowledge representation systems, until someone develops AGI or a way to extract a suitable dictionary from a text corpus.
WordNet is a system that attempts to take a set of "core meanings" and associate those meanings with words from different languages. It is accessible over the Internet. I invented a language by writing a program in Python that fetched the list of core meanings and assigned words to them from a list. It was a very fast route to a 26,000+ word dictionary. Granted, the dictionary needed a little data grooming -- there were a number of words that, to me, didn't deserve a separate term. There were also meanings that I wanted to make sure got shorter words, since I expected them to be used more often. But I think the data grooming was by far the minor portion of the task, and by using WordNet, I probably saved hundreds of hours of word development compared to doing it all by hand.
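To give a rough idea of the assignment step, here is a minimal sketch. It is not the actual tool. The meaning identifiers, the frequency numbers, and the candidate words are all made up for illustration, and the function name assign_words is hypothetical. It only shows the idea of giving the shortest words to the meanings expected to be used most often.

```python
def assign_words(meanings, candidate_words):
    """Pair each meaning with a word; frequent meanings get shorter words.

    meanings: list of (meaning_id, expected_frequency) tuples
              (a hypothetical format, not WordNet's own).
    candidate_words: list of unique strings, at least as many as meanings.
    """
    if len(candidate_words) < len(meanings):
        raise ValueError("not enough candidate words")
    # Most frequently used meanings first.
    by_frequency = sorted(meanings, key=lambda m: m[1], reverse=True)
    # Shortest words first; ties broken alphabetically so runs are repeatable.
    by_length = sorted(candidate_words, key=lambda w: (len(w), w))
    return {
        meaning_id: word
        for (meaning_id, _), word in zip(by_frequency, by_length)
    }


# Made-up sample data, standing in for a fetched list of core meanings.
sample_meanings = [("water.n", 900), ("photosynthesis.n", 3), ("go.v", 1500)]
sample_words = ["talumi", "ko", "besa"]
lexicon = assign_words(sample_meanings, sample_words)
```

With this sample data, the most frequent meaning ("go.v") ends up with the shortest word ("ko"). The hand grooming described above would still be needed afterwards, for example to merge meanings that don't deserve separate terms.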
That, combined with using Markov chains for word generation, created an excellent base language in a very short time. I'd be happy to share the source code of these tools with anyone who is interested; email me privately for that.
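For anyone unfamiliar with the technique, here is a minimal sketch of character-level Markov chain word generation. It is an illustration under my own assumptions, not the source code mentioned above: the sample words, the function names train and generate, and the max_length cutoff are all hypothetical. Each new letter is chosen at random from the letters that followed the current letter in the sample words.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # markers for the beginning and end of a word


def train(sample_words):
    """Record which character follows which in the sample words."""
    transitions = defaultdict(list)
    for word in sample_words:
        chars = [START] + list(word) + [END]
        for current, following in zip(chars, chars[1:]):
            transitions[current].append(following)
    return transitions


def generate(transitions, rng, max_length=8):
    """Walk the chain from START until END or max_length characters."""
    word = []
    current = START
    while len(word) < max_length:
        # Repeated entries in the list make common transitions more likely.
        current = rng.choice(transitions[current])
        if current == END:
            break
        word.append(current)
    return "".join(word)


# Made-up sample words that set the "sound" of the generated vocabulary.
samples = ["kato", "mina", "tanu", "soki", "lema", "niko"]
chain = train(samples)
rng = random.Random(42)  # fixed seed so a run can be repeated
new_words = [generate(chain, rng) for _ in range(5)]
```

The generated words only ever contain letter pairs that occur in the samples, so the choice of sample words controls how the language sounds. A real run would also need to discard duplicates and any words that collide with existing ones.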
steve