
[lojban] Re: Wheels in my Head



Unless someone else has a really bright idea, I'm going to abandon
this pursuit in favor of a more UTF-8-like approach: use one byte for
simple syllables, two bytes for normal syllables, and three bytes for
complex syllables, reserving a few bits in the first byte to tell how
many bytes follow.

First bits:
0   - 1 byte,  from 00 to 7F
10  - 2 bytes, from 8000 to BFFF
110 - 3 bytes, from C00000 to DFFFFF
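
As a rough sketch of the prefix scheme above (Python; the names encode
and decode are made up for illustration, and I'm assuming the three
ranges are numbered consecutively, so no syllable number has two
encodings):

```python
def encode(n: int) -> bytes:
    """Encode syllable number n in 1, 2, or 3 bytes (big-endian)."""
    if n < 0:
        raise ValueError("syllable number must be non-negative")
    if n < 0x80:                       # prefix 0, 7 payload bits
        return bytes([n])
    n -= 0x80                          # assumption: ranges are consecutive
    if n < 0x4000:                     # prefix 10, 14 payload bits
        return (0x8000 | n).to_bytes(2, "big")
    n -= 0x4000
    if n < 0x200000:                   # prefix 110, 21 payload bits
        return (0xC00000 | n).to_bytes(3, "big")
    raise ValueError("syllable number out of range")


def decode(data: bytes) -> tuple[int, int]:
    """Return (syllable number, bytes consumed) for the code at data[0:]."""
    first = data[0]
    if first < 0x80:
        size, mask, base = 1, 0x7F, 0
    elif first < 0xC0:
        size, mask, base = 2, 0x3FFF, 0x80
    elif first < 0xE0:
        size, mask, base = 3, 0x1FFFFF, 0x4080
    else:
        raise ValueError("first bytes E0 to FF are unassigned")
    if len(data) < size:
        raise ValueError("truncated code")
    return base + (int.from_bytes(data[:size], "big") & mask), size
```

Unlike real UTF-8, the trailing bytes carry no marker bits here, so a
decoder can't resynchronize from the middle of a stream; it has to
start at a syllable boundary.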

That leaves 0x204080 (0x80 + 0x4000 + 0x200000, a little over two
million) possible numbers to assign to syllables. Actually, because
fewer than 50,000 of them are valid, a 16-bit syllable table could be
formed directly by sorting the valid Lojban syllables according to
this numbering system and assigning each one a new number based on
its place in the order.
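
The renumbering could be sketched like this (Python again; build_table
is a hypothetical name, and the input list is toy data standing in for
the real set of valid syllable numbers, which I haven't enumerated):

```python
def build_table(valid_numbers):
    """Rank the valid sparse numbers; the rank is the dense 16-bit code."""
    ordered = sorted(set(valid_numbers))
    if len(ordered) > 0x10000:
        raise ValueError("more valid syllables than fit in 16 bits")
    to_dense = {sparse: rank for rank, sparse in enumerate(ordered)}
    # ordered[rank] gives back the sparse number, so it is the inverse table.
    return to_dense, ordered


# Toy data only: three made-up "valid" numbers, one listed twice.
to_dense, to_sparse = build_table([0x4080, 0x05, 0x91, 0x05])
```

With fewer than 50,000 entries both tables stay small, and the dense
codes keep the same relative order as the sparse ones.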

On 9/19/05, Jorge Llambías <jjllambias@gmail.com> wrote:
> On 9/19/05, Brandon Wirick <brandon@yrick.com> wrote:
> > Wait, what about {.uAcintyn.}? I don't see any convention for
> > diphthongs that start with {ibu} or {ubu}.
> 
> That's why I said for the most complex type of syllable,
> which are the most numerous. (Maybe complex is not
> the right word because they can be quite simple. The
> most common type, perhaps.)
> 
> That doesn't cover consonantal syllables, syllables
> with affricate onset (tca, tsen, djau, dzoi), syllables with
> semiconsonant onset (ua, bie, niai) and syllables with
> an apostrophe onset ('a, 'ik, 'ei, 'aub). These should be
> somehow squeezed in the holes left by the general
> system.
> 
> mu'o mi'e xorxes
> 
> >
> > On 9/19/05, Jorge Llambías <jjllambias@gmail.com> wrote:
> > > On 9/19/05, Brandon Wirick <brandon@yrick.com> wrote:
> > > > This is great! I had no idea such work existed. My task will be
> > > > difficult, however, to sensibly encode these syllables into sixteen
> > > > bits.
> > >
> > > Difficult, yes. With 17 bits, the most complex type could be
> > > encoded as:
> > >
> > > 1 bit: stressed, unstressed
> > > 1 bit: voiced onset, unvoiced onset
> > > 2 bits: -, c/j, s/z
> > > 3 bits: -, p/b, k/g, t/d, f/v, x, m, n
> > > 2 bits: -, l, r
> > > 4 bits: a, e, i, o, u, ai, au, ei, oi, y
> > > 4 bits: -, c/j, s/z, p/b, k/g, t/d, f/v, x, m, n, l, r
> > >
> > > (the voicedness of the coda is determined by the
> > > voicedness of the following syllable, not by that of the onset,
> > > obviously.)
> > >
> > > Then the other types of syllables, which are simpler, can
> > > be encoded in the holes left by this scheme. But 16 bits...
> > > it seems hard.
> > >
> > > mu'o mi'e xorxes
> > >
> > >
> > > To unsubscribe from this list, send mail to lojban-list-request@lojban.org
> > > with the subject unsubscribe, or go to http://www.lojban.org/lsg2/, or if
> > > you're really stuck, send mail to secretary@lojban.org for help.
> > >
> > >
> 
>
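
For reference, the 17-bit layout xorxes gives in the quoted message can
be packed field by field. This is only a sketch: the field names, the
most-significant-first order, and the index given to each option are my
own assumptions, not part of his proposal.

```python
# (name, width in bits, options in the order listed in the quoted message)
FIELDS = [
    ("stress",   1, ["stressed", "unstressed"]),
    ("voicing",  1, ["voiced", "unvoiced"]),
    ("sibilant", 2, ["-", "c/j", "s/z"]),
    ("onset",    3, ["-", "p/b", "k/g", "t/d", "f/v", "x", "m", "n"]),
    ("liquid",   2, ["-", "l", "r"]),
    ("nucleus",  4, ["a", "e", "i", "o", "u", "ai", "au", "ei", "oi", "y"]),
    ("coda",     4, ["-", "c/j", "s/z", "p/b", "k/g", "t/d", "f/v", "x",
                     "m", "n", "l", "r"]),
]


def pack(**choices):
    """Pack one option per field into a 17-bit integer, first field highest."""
    value = 0
    for name, width, options in FIELDS:
        value = (value << width) | options.index(choices[name])
    return value


def unpack(value):
    """Inverse of pack; raises IndexError when a field holds an unused code."""
    result = {}
    for name, width, options in reversed(FIELDS):
        result[name] = options[value & ((1 << width) - 1)]
        value >>= width
    return result


# The stressed syllable "kla" of {klama}: unvoiced k, then l, then a, no coda.
kla = dict(stress="stressed", voicing="unvoiced", sibilant="-",
           onset="k/g", liquid="l", nucleus="a", coda="-")
code = pack(**kla)
```

The unused codes in the 2-bit and 4-bit fields are the "holes" where
the other syllable types would have to go.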

