[lojban] Re: Loglish: A Modest Proposal
On Aug 13, 2005, at 4:00 PM, Arnt Richard Johansen wrote:
> To quote your web page:
> # [...] avoid what's really annoying about Lojban (the lack of a full
> # vocabulary).
> I suppose that lack of vocabulary will always be a problem in
> knowledge representation systems, until someone develops AGI or a
> way to extract a suitable dictionary from a text corpus.
WordNet is a system that attempts to take a set of "core meanings"
and associate those meanings with words from different languages. It
is accessible over the Internet. I invented a language by writing a
Python program that fetched the list of core meanings and assigned
each one a word from a list. It was a very fast route to a 26,000+
word dictionary. Granted, the dictionary needed a little data
grooming -- there were a number of entries that, to me, didn't deserve
a separate term. There were also entries that I wanted to make sure
got shorter words, since I expected them to be used more often. But
I think the data grooming was by far the smaller part of the task,
and by using WordNet, I probably saved hundreds of hours of word
development compared to doing it all by hand.
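
For anyone who wants a feel for the assignment step, here is a minimal
sketch. It is not the original program: the function name assign_words,
the meaning IDs, the frequency numbers, and the candidate words are all
made up for illustration, and the meanings are hard-coded where a real
run would fetch them from WordNet. It assumes you have some usage
estimate for each meaning, and hands the shortest words to the meanings
expected to be used most.

```python
def assign_words(meanings, candidate_words):
    """Pair each meaning with a word; frequent meanings get shorter words.

    meanings: list of (meaning_id, estimated_frequency) tuples.
    candidate_words: iterable of candidate word strings.
    """
    # Shortest words first; ties broken alphabetically so results are stable.
    words = sorted(set(candidate_words), key=lambda w: (len(w), w))
    if len(words) < len(meanings):
        raise ValueError("not enough candidate words for all meanings")
    # Most frequent meanings first; ties broken by meaning ID.
    ranked = sorted(meanings, key=lambda m: (-m[1], m[0]))
    return {meaning_id: word for (meaning_id, _), word in zip(ranked, words)}


# Made-up sample data (hypothetical IDs and frequencies).
meanings = [
    ("water.n.01", 950),
    ("photosynthesis.n.01", 12),
    ("go.v.01", 2100),
]
candidates = ["talimo", "ka", "pelu"]

lexicon = assign_words(meanings, candidates)
for meaning_id, word in lexicon.items():
    print(meaning_id, "->", word)
```

The grooming described above (dropping entries that don't deserve a
separate term) would happen on the meanings list before this step.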
That, combined with using Markov chains for word generation, created
an excellent base language in a very short time. I'd be happy to
share the source code of these tools with anyone who is interested;
email me privately for that.
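
To illustrate the Markov chain idea, here is a minimal character-level
sketch. Again, this is not the original tool: the seed words, the
order-2 context, the length cap, and the function names train and
generate are all assumptions chosen for the example. The table records
which letter follows each two-letter context in the seed words, and new
words are built by walking that table.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # padding markers; assumed not to occur in seed words


def train(seed_words, order=2):
    """Map each `order`-letter context to the letters seen after it."""
    table = defaultdict(list)
    for word in seed_words:
        padded = START * order + word + END
        for i in range(len(padded) - order):
            table[padded[i:i + order]].append(padded[i + order])
    return table


def generate(table, rng, order=2, max_len=10):
    """Walk the table from the start context until END or max_len letters."""
    state = START * order
    letters = []
    while len(letters) < max_len:
        nxt = rng.choice(table[state])
        if nxt == END:
            break
        letters.append(nxt)
        state = (state + nxt)[-order:]
    return "".join(letters)  # may be cut short if max_len is reached


# Made-up seed words that set the "sound" of the generated vocabulary.
SEED_WORDS = ["talimo", "pelura", "kamilo", "sorati", "melaku", "tiruna"]

table = train(SEED_WORDS)
rng = random.Random(0)  # fixed seed so repeated runs give the same words
new_words = [generate(table, rng) for _ in range(20)]
print(new_words)
```

The output will contain duplicates and some of the seed words
themselves, so a real run would filter those out before assigning the
words to meanings.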
steve