[lojban] Re: morphology paper announced
On 2/22/07, Cyril Slobin <slobin@ice.ru> wrote:
> I think the idea behind it is that the recognition algorithm must be as
> easy as possible and must not require a complex analysis.
That would be wonderful. Unfortunately, the tosmabru and slinku'i tests
make this almost impossible. You can't recognize tosmabru and slinku'i
failure without some relatively complex analysis.
> You argue about {iglu}
> because you SEE that {glu} is not a brivla.
If you think of {.} as an ordinary consonant, {.iglu} is just like a gismu.
> Nobody mistakes a one-syllable
> piece of text for a word. What about {adjgadja}? Do you still SEE that
> {djgadja} is not a brivla?
Yes, immediately, because {djg} for me is not a valid initial cluster.
But let's consider {.asprapra} instead. This is harder to figure out, because
{spr} is a valid initial, but then {sprapra} fails slinku'i.
If you think of {.} as an ordinary consonant, {.asprapra} is just like a
lujvo {pasprapra}, and {.adjgadja} is just like a lujvo {padjgadja}.
Detecting that the {.a} in {.asprapra} doesn't break off is no harder than
detecting that the {pa} in {pasprapra} doesn't break off.
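
To make the comparison concrete, here is a minimal sketch that maps letters to a consonant/vowel shape while counting {.} as a consonant. The names cv_shape and GISMU_SHAPES are made up for this illustration and come from neither brkwords nor the morphology paper. It only compares shapes; it does not check initial clusters or the slinku'i test.

```python
VOWELS = set("aeiou")

# The two letter shapes a gismu can take.
GISMU_SHAPES = {"CVCCV", "CCVCV"}


def cv_shape(text, pause_is_consonant=True):
    """Map each letter to C or V, optionally counting the pause '.' as a consonant.

    Simplification: apostrophe and 'y' are not treated specially, since the
    examples discussed here do not contain them.
    """
    shape = []
    for ch in text:
        if ch in VOWELS:
            shape.append("V")
        elif ch == ".":
            if pause_is_consonant:
                shape.append("C")
        else:
            shape.append("C")
    return "".join(shape)


# {.iglu} has the same shape as a gismu when '.' counts as a consonant.
print(cv_shape(".iglu") in GISMU_SHAPES)
# {.asprapra} and {.adjgadja} have the same shape as the pa- forms.
print(cv_shape(".asprapra") == cv_shape("pasprapra"))
print(cv_shape(".adjgadja") == cv_shape("padjgadja"))
```

Matching shapes are only the starting point: whether the leading {.a} or {pa} breaks off still depends on whether the remainder is itself a brivla.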
> Your suggestion is to cut off an initial cmavo
> and then recursively apply the word recognition algorithm to the rest.
That's unavoidable. How else do you detect tosmabru failure?
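
A rough sketch of that check, under loud assumptions: breaks_off and toy_is_brivla are hypothetical names, the "recognizer" is just a lookup in a one-word set standing in for a real brivla test, and only CV-shaped cmavo are considered, with {.} allowed as the consonant.

```python
import re

# A CV-shaped prefix; '.' is allowed in the consonant slot (assumption
# following the "treat {.} as an ordinary consonant" idea).
CV_PREFIX = re.compile(r"[bcdfgjklmnprstvxz.][aeiou]")


def breaks_off(text, is_brivla):
    """Return True if a leading CV cmavo can be cut off, leaving a brivla.

    is_brivla is whatever word recognizer you have; it is applied to the
    remainder, which is where the recursion into full word recognition
    would happen in a real implementation.
    """
    m = CV_PREFIX.match(text)
    if m is None:
        return False
    return is_brivla(text[m.end():])


# Toy stand-in for a real recognizer: it accepts the gismu {klama} and
# nothing else, so {sprapra} is rejected, as in the discussion above.
def toy_is_brivla(word):
    return word in {"klama"}


print(breaks_off("paklama", toy_is_brivla))    # {pa} breaks off before {klama}
print(breaks_off("pasprapra", toy_is_brivla))  # {sprapra} is rejected
print(breaks_off(".asprapra", toy_is_brivla))  # same check, with '.' as the consonant
```

The point of the sketch is that the answer for the prefix depends entirely on running the recognizer on the remainder.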
> On the other hand, my approach is
> to apply a short sequence of simple patterns (specified in brkwords).
I haven't studied brkwords very carefully, but as I pointed out, brkwords
says the string {.iglu} contains a brivla, doesn't it?
mu'o mi'e xorxes