Re: lojban newbie: an outsider looking in
- Subject: Re: lojban newbie: an outsider looking in
- From: "Michal Wallace (sabren)" <sabren@manifestation.com>
- Date: Fri, 29 Oct 1999 02:10:41 -0400 (EDT)
On Thu, 28 Oct 1999, Richard Curnow wrote:
> > I think I might be able to manage a crappy sentence with a lot
> > of parentheses on my own.. And probably do a lot better after
> > studying YACC.
>
> Given all the talk of parenthesis, I guess it's my jbofi'e program being
> discussed.
Well, *I* was talking about a hypothetical parser of my own.
I've seen your source, but haven't been able to get it to compile.
In reading about the ins and outs of parsing, I found a program
that does for Perl what yacc/bison do for C. So I ran that against
the LLG grammar and produced a parser.
Unfortunately, I don't have a lexer, and I'm not really sure how to
create one. I tried looking at your code, but the only lex file you
seem to have is for stripping out comments (???). (Without being
able to compile it, I can't run the Perl debugger to step through
it.) I know less about parsing than I do about Lojban, but don't
you have to have a lexer in there somewhere?
Also, did you create your own version of the grammar?
(I'm still trying to make sense of the 6 steps listed in the LLG's
grammar)
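On the lexer question above, here is a minimal sketch of what one might look like, written in Python rather than Perl. It only shows the general idea of turning words into the token types a yacc-style grammar expects. The `SELMAHO` table and the `tokenize` function are hypothetical names, the table holds only a handful of entries, and treating every unknown word as a brivla is an assumption that ignores cmene, compound cmavo, and the preprocessing steps in the LLG grammar.

```python
import re

# Hypothetical, tiny cmavo -> selma'o table. A real lexer would load the
# full cmavo list instead of hard-coding a few entries.
SELMAHO = {
    "mi": "KOhA",
    "do": "KOhA",
    "le": "LE",
    "cu": "CU",
    "se": "SE",
    "te": "SE",
}


def tokenize(text):
    """Split Lojban text into (token_type, word) pairs.

    Words found in SELMAHO get their selma'o as the token type; anything
    else is assumed to be a brivla (a deliberate simplification).
    """
    # Keep letters, apostrophes and commas; periods (pauses) are dropped.
    words = re.findall(r"[a-z',]+", text.lower())
    return [(SELMAHO.get(word, "BRIVLA"), word) for word in words]


print(tokenize("mi citka le plise"))
```

The parser would then consume the token types and keep the words around for the dictionary lookup later.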
> To set the record straight, my aims were not so ambitious as to include
> the production of reasonable English sentences as direct translations.
I hope you didn't think I was slamming your code! I was just imagining
what *my* parser might do, and I figured its output would look mostly like
English, but with parentheses around things like translated tanru. Hence,
crappy English with a bunch of parentheses. :)
> 1. To check syntax and subtle errors of meaning in Lojban I wrote myself.
> 2. To do the drudgery of looking up words in the dictionary when reading
> other people's Lojban posted to this list.
> Goal (2) often fails to be realised, because so much published Lojban does
> not in fact have the correct syntax :-(
Perhaps a validating parser ought to be incorporated into a training
program...
> 3. To mark the sumti with a reminder of their meaning within the
> definition of the main selbri within the sentence, to assist in the
> interpretation once the individual words have been translated.
I like this idea very much. Looking at the HTML output on your
site, I think it might be somewhat easier on the eyes if it could
be done in multiple lines (Lojban on one line, "engloj" on the
next).
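To make the two-line layout concrete, here is a rough Python sketch that pads each Lojban word and its gloss to the same column width. The `interlinear` function is a hypothetical helper, not anything from jbofi'e, and the glosses in the example are loose, illustrative ones.

```python
def interlinear(pairs, gap=2):
    """Return two aligned lines: Lojban words above, glosses below."""
    # Each column is as wide as the longer of the word and its gloss.
    widths = [max(len(word), len(gloss)) + gap for word, gloss in pairs]
    top = "".join(word.ljust(w) for (word, _), w in zip(pairs, widths))
    bottom = "".join(gloss.ljust(w) for (_, gloss), w in zip(pairs, widths))
    return top.rstrip() + "\n" + bottom.rstrip()


print(interlinear([("mi", "I"), ("citka", "eat"), ("le", "the"), ("plise", "apple")]))
```

For HTML output the same pairing could go into table cells instead of padded text.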
> Once you get used to the output format, it is a useful 'half-way house'
> between the raw Lojban and a smooth English translation.
That's pretty much my goal as well.
> I'd bet that if you try doing a Lojban -> English translator more akin to
> Babelfish, you'll find the front-end YACC/Bison stuff a piece of cake
> compared to the back-end task of producing reasonable English with the
> same meaning. OK, you can probably do quite well for a limited range of
> Lojban constructions, but I'm sure the task is very hard indeed in its
> full generality. One particularly dull aspect that I've jibbed at doing
> is re-formulating all the gismu definitions depending on which of the
> places has the 'focus' in the sentence (i.e. which of se, te etc precede
> the selbri).
> The parentheses in jbofi'e's output, incidentally, show how various
> constructions nest within each other. It's useful for seeing the nesting
> order of connectives, the binding order of the brivla within tanru etc.
> I agree they're a bit of a distraction as well, but they are useful for
> checking subtle details of meaning that differ depending on binding order.
That's the other reason I was envisioning a bunch of parentheses in my
version. :)
> > part of the reason I suggested a perl version was because the dictionary
> > looks like: "x1 eats x2 ..." and it would be fairly easy, once you know
> > which sumti are x1 and x2, to do a search and replace (at least with
> > perl's regular expressions)..
>
> Fine for 'citka', however you need to generate templates for
>
> se citka : x1 is eaten by x2
> ctigau : x1 feeds x2 with x3
> te se ctigau : x1 feeds x2 to x3
> citka jubme : dining table
> ctijbu : ditto
>
> and so on, ad infinitum. The construction of this template dictionary is
> the really dreary part of the project.
Yeah. But I think that's overkill at the moment. I'd be perfectly happy
with:
  se citka     : x2 eats x1
  ctigau       : x1 {to feed} x2   # since that's what's in the dictionary
  te se ctigau : ????              # I don't yet understand how "te" and "se" interact
  citka jubme  : [eat-table]       # since it's just a tanru
  ctijbu       : [eat-table]       # since it isn't in the lujvo dictionary
  ctijbu       : dining table      # if it were
[] might designate tanru.
{} might designate unconjugated English.
For past tense, I'd even be happy with: x1 {past: eats} x2
At some point in the future, someone who wasn't me :) could do
all the messy work of creating hashes or tables to conjugate
the English verbs, pluralize nouns, etc.
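The search-and-replace idea can be sketched roughly like this, in Python rather than Perl. `convert` renumbers the places in a dictionary template for se/te/ve/xe, and `fill` drops sumti into the result. Both function names are hypothetical. The handling of stacked conversions is an assumption on my part (the one nearest the selbri is applied first) and should be checked against the reference grammar.

```python
import re

# SE-series cmavo and the place each one swaps with x1.
SWAP_WITH_X1 = {"se": 2, "te": 3, "ve": 4, "xe": 5}


def convert(template, conversions):
    """Renumber the xN places of a template for a list of conversions.

    Assumption: with several conversions, e.g. ["te", "se"], the one
    nearest the selbri (the last in the list) is applied first.
    """
    # position[p] is the slot where original place p currently sits.
    position = {n: n for n in range(1, 6)}
    for cmavo in reversed(conversions):
        k = SWAP_WITH_X1[cmavo]
        for place, pos in position.items():
            if pos == 1:
                position[place] = k
            elif pos == k:
                position[place] = 1
    return re.sub(r"\bx([1-5])\b",
                  lambda m: "x%d" % position[int(m.group(1))],
                  template)


def fill(template, sumti):
    """Replace each xN with sumti[N]; unfilled places are left as xN."""
    return re.sub(r"\bx([1-5])\b",
                  lambda m: sumti.get(int(m.group(1)), m.group(0)),
                  template)


se_citka = convert("x1 eats x2", ["se"])
print(se_citka)
print(fill(se_citka, {1: "le plise", 2: "mi"}))
```

The sumti are left untranslated here; conjugation and pluralization stay out of scope, as described above.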
Cheers,
- Michal
-------------------------------------------------------------------------
http://www.manifestation.com/ http://www.linkwatcher.com/metalog/
-------------------------------------------------------------------------