perl and lojban, sitting in a tree..
- Subject: perl and lojban, sitting in a tree..
- From: "Michal Wallace (sabren)" <sabren@manifestation.com>
- Date: Sun, 24 Oct 1999 18:22:57 -0400 (EDT)
Hey all,
I just joined this list yesterday and have been reading through the
archives. I don't understand any lojban yet, but I'm interested in
learning.
I'm also very interested in the AI and machine translation
possibilities of the language. I brought the issue up on the perl-AI
list, and thought perhaps it might be a good first post here, as well.
Looking forward to learning with everyone,
- Michal
-------------------------------------------------------------------------
http://www.manifestation.com/ http://www.linkwatcher.com/metalog/
-------------------------------------------------------------------------
---------- Forwarded message ----------
Date: Sun, 24 Oct 1999 18:15:26 -0400 (EDT)
From: "Michal Wallace (sabren)" <sabren@manifestation.com>
To: perl-ai list <perl-ai@netizen.com.au>
Subject: perl and lojban, sitting in a tree..
This one's long. Synopsis: computers can't yet grok most human
concepts and languages, but they do just fine with computer science.
Why not tackle a slightly simpler problem: translating computer
languages, with lojban as the intermediary language?
Hey all,
This talk about language has got me thinking. John Nolan made an
excellent point when he asked, why should a computer (AI robot) talk
to you at all? It seems to me that language doesn't make much sense
without something to communicate, and someone willing and able to
listen. It doesn't make sense for computers and humans to talk about
baseball, because computers don't really care about baseball in the
real world. However, given a virtual world, computers can *play*
baseball, keep score, and interact with humans. The Sapir-Whorf
hypothesis states that language limits experience. But isn't the
reverse also true?
Consider the visible spectrum. I've been told that 24 bit RGB offers
more colors than the human eye can perceive. I'm not sure that it can
account for every color the eye can see, but it offers more shades of
"red", for example, than the human eye can visually distinguish. With
so many possible colors, most languages allow us to "see" only a
handful of colors. You can test this yourself. Just visit
http://www.lynda.com/hexh.html and see how many colors you can name. [*]
The Sapir-Whorf hypothesis would suggest that, for the most part, we
usually only experience "red" - not the many individual shades of red,
because experience is shaped by our ability to code it in language.
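To make the "handful of names for millions of colors" point concrete,
here is a minimal sketch in Python. It is not the fuzco / AI::Fuzzy
approach mentioned in the footnote. It just picks the nearest entry
from a small palette I made up for illustration (PALETTE and
name_color are hypothetical names), using plain distance in RGB space,
which is only a rough stand-in for how the eye judges similarity.

```python
# 24-bit RGB has 2**24 distinct values, but this toy vocabulary has 9 words.
TOTAL_COLORS = 2 ** 24  # 16,777,216

# Hypothetical palette, chosen only for illustration.
PALETTE = {
    "black": (0, 0, 0),
    "white": (255, 255, 255),
    "gray": (128, 128, 128),
    "red": (255, 0, 0),
    "green": (0, 255, 0),
    "blue": (0, 0, 255),
    "yellow": (255, 255, 0),
    "cyan": (0, 255, 255),
    "magenta": (255, 0, 255),
}


def name_color(rgb):
    """Return the palette name whose RGB value is closest to rgb."""
    def squared_distance(name):
        return sum((a - b) ** 2 for a, b in zip(rgb, PALETTE[name]))
    return min(PALETTE, key=squared_distance)


# Two visibly different shades collapse into the same word.
print(name_color((200, 30, 40)))   # red
print(name_color((255, 80, 80)))   # red
```

With only nine words available, every one of the sixteen million
values has to land on one of them, which is the "language shapes what
we see" effect in miniature.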
But the opposite is also true: at one point in time, it would have
been outlandish to talk about "colors" we can't actually see. Yet,
once science delivered a theory of electromagnetism, we could talk
about microwaves, radio waves, infra-red, ultraviolet, X-rays.. all of
which are invisible "colors". We can talk about these things now,
because we experience them. But those words would have been wasted on
the ancient Greeks, because they never experienced the phenomena the
words describe.
Computers don't normally experience the physical world. Yes, you can
attach a microphone, a couple quick-cams for binocular vision, a
robotic arm, and some motorized wheels - and with enough knowledge,
and work, it might even work its way around the room. :) But that's a
robot, not a computer. A "computer" entity would have experiences
completely different from your average human's, and therefore, an
"intelligent" computer's language would reflect that.
That is, it would be far more natural for AIs to talk about
databases, algorithms, logic, and applications than to talk about the
Atlanta Braves. And in fact, we talk with computers about these things
all the time. We use languages such as perl, java, SQL, lisp... [**]
The more I read about Lojban, the more I think that it makes sense
as a "native language" for computers. It's logic-based (and seems to
share a lot in common with lambda calculus / lisp / prolog - but I've
only a superficial understanding of any of those OR lojban, so don't
take my word for it).. It's got a well-defined YACC grammar, requires
no particular inflections or stress (for text-to-speech
readers).. It's supposedly always obvious where each word ends (for
speech-to-text listeners).. Requires only a handful of ASCII
characters.. It is said to be quite expressive and easy to learn
(although there's only a handful of resources for learning). Finally,
it has evolved over the past thirty or so years with input and
interest from the AI community.
There's been talk on this list about translating natural language
with perl. I suggested esperanto or lojban as an intermediary
language. I've done some reading, and found out that others have had
that same idea: http://www.lojban.org/files/why-lojban/mactrans.txt
The problem, of course, is that reliably parsing English or Japanese
is a long way off. The computer doesn't really even know what it's
translating. However, just about every computer on the planet can
parse source code. Perhaps an interesting project would be an
automated translator for computer programs.
Right now, perl can be compiled into C, python can be compiled into
java bytecode or C source. Just about anything can be compiled into
assembly language. In all of these cases, the compiler chunks
downward, breaking the high level language into low level steps.
There are also some lateral chunkers: programs that translate awk or
sed to perl, for example, or assemblers that convert mnemonics into
machine code. For the most part, these are just search-and-replace
methods. These work because the conceptual gap between the languages
is not large (at least in one direction). The translation article I
linked above compares this kind of thing to a first year language
student simply looking up words in a translating dictionary and
writing the translation down.
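The dictionary-lookup style of translation can be sketched in a few
lines of Python. This is a toy, not a real assembler: the instruction
set, the OPCODES table, and the assemble function are all invented
for illustration, and every operand is assumed to fit in one byte.

```python
# Hypothetical instruction set: each mnemonic maps to a one-byte opcode.
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "STORE": 0x03, "HALT": 0xFF}


def assemble(source):
    """Translate mnemonics to bytes by table lookup, one line at a time."""
    out = bytearray()
    for line in source.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        mnemonic, operands = parts[0], parts[1:]
        out.append(OPCODES[mnemonic])            # look the word up
        out.extend(int(op) for op in operands)   # copy operands through
    return bytes(out)


program = "LOAD 10\nADD 20\nSTORE 30\nHALT"
print(assemble(program).hex())  # 010a0214031eff
```

Nothing here needs to know what the program does. Each word is
replaced independently, which is why this only works when the two
languages line up almost one-to-one.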
But what about translating a lisp or python program into perl? Or
(even with a perl grammar) doing the opposite? As long as the two
languages are turing-complete, it's possible. It simply requires an
understanding of what the programs are doing. The translator needs to
recognize the algorithm being used, and map that to the other
language. It needs to understand things like recursion, sorting,
function calls, loops, design patterns, and how to simulate them if
translating from an expressive language to a less expressive one, and
how to recognize the workarounds when going from a less expressive
language to a more expressive one.
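As a very small taste of what "recognizing the workaround" might look
like, here is a sketch that uses Python's standard ast module to spot
one idiom, a hand-written accumulator loop, and emit a one-line perl
equivalent using sum0 from List::Util. The function names
(recognize_sum_loop, emit_perl_sum) are mine, and the matcher only
handles this one exact shape. A real translator would need a large
library of such patterns and a way to cope with near misses.

```python
import ast


def recognize_sum_loop(source):
    """Return (accumulator, iterable) if source is exactly
    'acc = 0' followed by 'for item in iterable: acc += item',
    otherwise None."""
    body = ast.parse(source).body
    if len(body) != 2:
        return None
    init, loop = body
    # First statement must be: acc = 0
    if not (isinstance(init, ast.Assign)
            and len(init.targets) == 1
            and isinstance(init.targets[0], ast.Name)
            and isinstance(init.value, ast.Constant)
            and init.value.value == 0):
        return None
    acc = init.targets[0].id
    # Second statement must be a for loop with a one-line body
    if not (isinstance(loop, ast.For)
            and isinstance(loop.target, ast.Name)
            and isinstance(loop.iter, ast.Name)
            and not loop.orelse
            and len(loop.body) == 1):
        return None
    step = loop.body[0]
    # Loop body must be: acc += item
    if not (isinstance(step, ast.AugAssign)
            and isinstance(step.op, ast.Add)
            and isinstance(step.target, ast.Name)
            and step.target.id == acc
            and isinstance(step.value, ast.Name)
            and step.value.id == loop.target.id):
        return None
    return acc, loop.iter.id


def emit_perl_sum(acc, iterable):
    """Render the recognized idiom as perl (assumes List::Util is loaded)."""
    return f"my ${acc} = sum0(@{iterable});"


src = "total = 0\nfor x in xs:\n    total += x\n"
match = recognize_sum_loop(src)
if match:
    print(emit_perl_sum(*match))  # my $total = sum0(@xs);
```

The interesting part is the step in the middle: once the loop has been
recognized as "a sum", the output no longer looks anything like the
input line by line.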
A universal source code translator would have to "chunk up" and make
comments about what a particular program was doing, then chunk back
down into a different language. If an intermediary language were used,
it would have to be expressive enough to handle any statement in any
other language. (Even weird stuff like regexps, or the cut ("!") in
prolog.) ... And perhaps (like a human translator) it might have to
be expressive enough to ask for help.
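Here is a minimal sketch of the chunk-up / chunk-down idea, under
heavy assumptions: the "intermediary language" is just Python's own
parse tree (via the standard ast module), the input is limited to
arithmetic on names and constants, and the two back ends (to_lisp and
to_perl, both hypothetical names) cover only +, - and *. It is nowhere
near lojban, but it shows one source being parsed once and rendered in
two different target notations.

```python
import ast

# Operators handled by this sketch; anything else is rejected.
OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*"}


def to_lisp(node):
    """Render an expression tree in prefix (lisp-style) notation."""
    if isinstance(node, ast.Expression):
        return to_lisp(node.body)
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        op = OPS[type(node.op)]
        return f"({op} {to_lisp(node.left)} {to_lisp(node.right)})"
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Constant):
        return repr(node.value)
    raise ValueError(f"unsupported node: {type(node).__name__}")


def to_perl(node):
    """Render the same tree in perl-style infix, with $ sigils on names."""
    if isinstance(node, ast.Expression):
        return to_perl(node.body)
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        op = OPS[type(node.op)]
        return f"({to_perl(node.left)} {op} {to_perl(node.right)})"
    if isinstance(node, ast.Name):
        return "$" + node.id
    if isinstance(node, ast.Constant):
        return repr(node.value)
    raise ValueError(f"unsupported node: {type(node).__name__}")


tree = ast.parse("a + b * 2", mode="eval")  # chunk up: text -> tree
print(to_lisp(tree))  # chunk down: (+ a (* b 2))
print(to_perl(tree))  # chunk down: ($a + ($b * 2))
```

The ValueError branch is the crude version of "asking for help": when
the tree contains something the back end has no word for, it stops
instead of guessing.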
No current computer language is expressive enough to account for all
the thought forms or experiences available in computer science.
Because they're turing complete, they *CAN* express any particular
operation, but often it's in the manner of someone taking ten
paragraphs to describe a single experience for which one's language
has no word. (Like me, right now! Imagine if there was a single word
that had the exact meaning of this entire message - including this
sentence!) But: all of these concepts could be described in a human
language, such as English, or even lojban.
So, what would lojban buy or cost us as an intermediary language for
translating source code? If someone were to actually implement a
translator like this, would perl be a sensible implementation
language? Why or why not? How might we approach the issue of "chunking
up" and recognizing patterns?
I've rambled enough. :)
------------
[*] Incidentally, I wrote a perl program that will take an RGB color
value and give you an English description. It's the example
program for AI::Fuzzy, and you can find it by grabbing "fuzco"
and AI-Fuzzy-*.tar.gz at http://www.sabren.com/code/perl/
[**] Yes, we can look up the Braves on the net, but the computer
doesn't have any clue who they are. It might not "understand"
a perl script, either, but it reacts as if it understands.
If humans spoke perl, the turing test would be a snap. :)
Cheers,
- Michal