
Re: [lojban] Re: Lojban Text to Speech



On Fri, Jan 04, 2002 at 10:13:48PM -0000, buzzwyrd wrote:
> --- In lojban@y..., Pierre Abbat <phma@w...> wrote:
> 
> http://www-2.cs.cmu.edu/~lenzo/areas/papers/festvox/festvox_toc.html

I've looked at this now - it's pretty informative.

For those who don't have the time to read it, I'll summarize some of the
points I picked up.

* Text-to-speech is a process that has a whole lot of steps involving
  various levels of language. Fortunately, Lojban makes many of these
  steps unnecessary.
* Anyone who plans to record a diphone set is going to need some fairly
  good recording equipment, such as a head-mounted mic and a good sound
  card. Background noise will be quite a problem, so record in a quiet
  place and don't let the mic pick up the computer fan.
* Recording the diphones involves saying a list of nonsense words like
  'tababa', 'tacaca', 'tadada', etc. in a monotone voice, and then the
  appropriate diphones are picked out from the middle syllable of each
  word. (So 'tababa' would yield 'ba' and 'ab'.)
* Sorting out the diphones from the recorded words takes about 20
  hours, but the work is so repetitive that it can only realistically
  be done an hour at a time.
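
To make the nonsense-word idea concrete, here is a rough sketch in Python.
It assumes one letter per sound and uses '.' for the silence at each end,
as in Lojban spelling. The function names `diphones` and `carrier_word`
are made up for illustration and are not part of festvox.

```python
def diphones(word):
    """Return adjacent sound pairs in `word`, with '.' as silence at both ends."""
    padded = "." + word + "."
    return [padded[i:i + 2] for i in range(len(padded) - 1)]


def carrier_word(consonant, vowel="a"):
    """Build a nonsense word such as 'tababa' around one consonant."""
    return "ta" + (consonant + vowel) * 2


for c in "bcd":
    word = carrier_word(c)
    print(word, diphones(word))
```

For 'tababa' the pairs include 'ab' and 'ba' in the middle of the word,
which are the ones that would be kept; the pairs at the edges are
discarded.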

Given how they define 'diphone', I believe that, including the trivial
silent diphone they call 'pau-pau', there are exactly 500 diphones in
Lojban. (Quite a neat coincidence!) While other languages require
recording bizarre consonant combinations because they might come up
between words, we don't have to worry about that in Lojban.

So building a Lojban voice should be significantly easier than what they
describe for other languages, but it's still going to take some time.


Here are the 500 diphones, given in Lojban spelling:

..

.a ba ca da fa ga ja ka la ma na pa ra sa ta va xa za 'a
.e be ce de fe ge je ke le me ne pe re se te ve xe ze 'e
.i bi ci di fi gi ji ki li mi ni pi ri si ti vi xi zi 'i
.o bo co do fo go jo ko lo mo no po ro so to vo xo zo 'o
.u bu cu du fu gu ju ku lu mu nu pu ru su tu vu xu zu 'u
.y by cy dy fy gy jy ky ly my ny py ry sy ty vy xy zy 'y
   b. c. d. f. g. j. k. l. m. n. p. r. s. t. v. x. z.

a. ab ac ad af ag aj ak al am an ap ar as at av ax az a'
e. eb ec ed ef eg ej ek el em en ep er es et ev ex ez e'
i. ib ic id if ig ij ik il im in ip ir is it iv ix iz i'
o. ob oc od of og oj ok ol om on op or os ot ov ox oz o'
u. ub uc ud uf ug uj uk ul um un up ur us ut uv ux uz u'
y. yb yc yd yf yg yj yk yl ym yn yp yr ys yt yv yx yz y'
   .b .c .d .f .g .j .k .l .m .n .p .r .s .t .v .x .z

ai au ei ia ie ii io iu iy oi ua ue ui uo uu uy

a,e a,i a,o a,u a,y
e,a e,i e,o e,u e,y
i,a i,e i,o i,u i,y
o,a o,e o,i o,u o,y
u,a u,e u,i u,o u,y
y,a y,e y,i y,o y,u

bd bg bj bv bz
db dg dj dv dz
gb gd gj gv gz
jb jd jg jv
vb vd vg vj vz
zb zd zg zv

cf ck cp ct
fc fk fp fs ft fx
kc kf kp ks kt
pc pf pk ps pt px
sf sk sp st sx
tc tf tk tp ts tx
xf xp xs xt

bl cl dl fl gl jl kl ml nl pl rl sl tl vl xl zl
bm cm dm fm gm jm km lm nm pm rm sm tm vm xm zm
bn cn dn fn gn jn kn ln mn pn rn sn tn vn xn zn
br cr dr fr gr jr kr lr mr nr pr sr tr vr xr zr

lb lc ld lf lg lj lk lm ln lp lr ls lt lv lx lz
mb mc md mf mg mj mk ml mn mp mr ms mt mv mx
nb nc nd nf ng nj nk nl nm np nr ns nt nv nx nz
rb rc rd rf rg rj rk rl rm rn rp rs rt rv rx rz
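
As a cross-check, the list above can be rebuilt group by group and
tallied. This is only a sketch: the group names and the EXCLUDED set are
my reading of which pairs are absent from the list, not rules stated
anywhere in the festvox document.

```python
from itertools import permutations

VOWELS = "aeiouy"
CONSONANTS = "bcdfgjklmnprstvxz"
SONORANTS = "lmnr"
VOICED = "bdgjvz"
UNVOICED = "cfkpstx"
# Consonant pairs that do not appear in the list (assumption read off the list).
EXCLUDED = {"jz", "zj", "cs", "sc", "cx", "xc", "kx", "xk", "mz"}


def build_groups():
    """Rebuild each block of the list as a named group of diphones."""
    edges = CONSONANTS + ".'"  # consonants, pause, apostrophe
    groups = {
        "silence": [".."],
        "onset+vowel": [e + v for v in VOWELS for e in edges],
        "consonant+pause": [c + "." for c in CONSONANTS],
        "vowel+coda": [v + e for v in VOWELS for e in edges],
        "pause+consonant": ["." + c for c in CONSONANTS],
        "diphthongs": "ai au ei ia ie ii io iu iy oi ua ue ui uo uu uy".split(),
        "hiatus": [a + "," + b for a, b in permutations(VOWELS, 2)],
        "voiced": [a + b for a, b in permutations(VOICED, 2)],
        "unvoiced": [a + b for a, b in permutations(UNVOICED, 2)],
        "consonant+sonorant": [c + s for s in SONORANTS for c in CONSONANTS if c != s],
        "sonorant+consonant": [s + c for s in SONORANTS for c in CONSONANTS if c != s],
    }
    # Drop the pairs that the list leaves out.
    return {name: [d for d in items if d not in EXCLUDED]
            for name, items in groups.items()}


groups = build_groups()
entries = [d for items in groups.values() for d in items]
for name, items in groups.items():
    print(f"{name:20} {len(items)}")
print("entries:", len(entries), "distinct:", len(set(entries)))
```

The tally comes to 500 entries. Note, though, that the twelve pairs made
of two different sonorants (lm, ln, lr, ml, mn, mr, nl, nm, nr, rl, rm,
rn) fall into both of the last two blocks, so the number of distinct
diphones by this count is 488.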

-- 
Rob Speer