Acoustic Model


List of Acoustic Models

model          #states   #mixtures  gender
monophone      129       4, 8, 16   GD, GI
triphone 1000  1000      4, 8, 16   GD
triphone 2000  2000      4, 8, 16   GD, GI
triphone 3000  3000      4, 8, 16   GD
PTM triphone   3000/129  64         GD, GI

(GD: gender-dependent, GI: gender-independent)
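
The table can be read as Gaussian counts per model. The sketch below is only an illustration: it assumes that each state (or codebook) holds the listed number of mixture components, and that "3000/129" means 3000 tied states sharing 129 codebooks. Neither reading is stated explicitly above, and the function name is hypothetical.

```python
# Gaussian counts implied by the table (illustrative arithmetic only).
# Assumption: every codebook holds `mixtures` Gaussian components.
# Assumption: "3000/129" = 3000 tied states sharing 129 codebooks.

def total_gaussians(num_codebooks: int, mixtures: int) -> int:
    """Number of Gaussian components across all codebooks."""
    return num_codebooks * mixtures

models = {
    "monophone, 16 mix": total_gaussians(129, 16),
    "triphone 3000, 16 mix": total_gaussians(3000, 16),
    "PTM triphone, 64 mix": total_gaussians(129, 64),
}

for name, count in models.items():
    print(f"{name}: {count} Gaussians")
```

Under these assumptions the PTM model keeps far fewer Gaussians than the 3000-state triphone model while still distinguishing 3000 states through their mixture weights.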


List of Japanese Phones

a i u e o a: i: u: e: o: N w y
p py t k ky b by d dy g gy ts ch
m my n ny h hy f s sh z j r ry
q sp silB silE (pauses)
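
As a consistency check, the phone list can be counted against the monophone state count in the model table. The sketch assumes every phone, including the pause models, uses 3 states; that is an inference from the numbers, not something stated here.

```python
# Count the phones listed above and compare with the 129 monophone states.
# Assumption: all phones, pauses included, have 3 states each.
PHONES = """
a i u e o a: i: u: e: o: N w y
p py t k ky b by d dy g gy ts ch
m my n ny h hy f s sh z j r ry
q sp silB silE
""".split()

STATES_PER_PHONE = 3

num_phones = len(PHONES)
num_states = num_phones * STATES_PER_PHONE

print(num_phones)  # 43
print(num_states)  # 129
```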


Training data: ASJ (Acoustical Society of Japan) databases

20K sentences / 132 speakers for each gender


Acoustic Analysis

A/D          16 kHz, 16 bit
frame shift  10 ms
analysis     MFCC (12th order), LogPow
CMN          cepstral mean normalization over the whole utterance

pattern: MFCC + $\Delta$MFCC + $\Delta$LogPow (12 + 12 + 1 = 25 variables)
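
The assembly of the 25-variable pattern can be sketched as follows. This is a minimal illustration, not the toolkit's front end: MFCC and LogPow extraction are taken as given, the delta is a simple central difference with edge replication (the actual regression window is not specified here), and all function names are hypothetical.

```python
# Sketch: build the 25-dimensional pattern from 12 MFCCs and LogPow per frame.
# Assumption: deltas are central differences; the real delta window is unknown.
from typing import List

SAMPLE_RATE_HZ = 16000
FRAME_SHIFT_MS = 10
SHIFT_SAMPLES = SAMPLE_RATE_HZ * FRAME_SHIFT_MS // 1000  # samples per frame shift
N_MFCC = 12

Frames = List[List[float]]

def cmn(frames: Frames) -> Frames:
    """Subtract each coefficient's mean taken over the whole utterance."""
    n, dim = len(frames), len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    return [[f[d] - means[d] for d in range(dim)] for f in frames]

def delta(frames: Frames) -> Frames:
    """Central difference over neighbouring frames, replicating the edges."""
    n = len(frames)
    out = []
    for t in range(n):
        prev, nxt = frames[max(t - 1, 0)], frames[min(t + 1, n - 1)]
        out.append([(b - a) / 2.0 for a, b in zip(prev, nxt)])
    return out

def build_pattern(mfcc: Frames, logpow: List[float]) -> Frames:
    """Return per-frame vectors: MFCC (after CMN) + delta MFCC + delta LogPow."""
    mfcc_n = cmn(mfcc)
    d_mfcc = delta(mfcc_n)
    d_pow = delta([[p] for p in logpow])
    return [m + dm + dp for m, dm, dp in zip(mfcc_n, d_mfcc, d_pow)]

# Toy input: 5 frames of made-up values, only to show the shapes.
mfcc = [[float(t + k) for k in range(N_MFCC)] for t in range(5)]
logpow = [float(t) for t in range(5)]
pattern = build_pattern(mfcc, logpow)
print(len(pattern), len(pattern[0]))  # 5 25
```

LogPow itself is not part of the vector; only its delta is appended, which gives 12 + 12 + 1 = 25.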


HMM

left-to-right 3 states (excluding initial & final)
decision tree-based clustering:
    (21000 logical triphones) $\rightarrow$ (8000 physical triphones)
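
The left-to-right topology above can be illustrated with a transition matrix over 3 emitting states plus non-emitting initial and final states. The sketch assumes no skip transitions and uses a placeholder self-loop probability; trained values come from the models themselves, and the names here are hypothetical.

```python
# Sketch: transition matrix of a left-to-right HMM with 3 emitting states.
# State 0 is the initial state and state 4 the final state (both non-emitting).
# Assumption: no skip transitions. SELF_LOOP is a placeholder, not a trained value.
from typing import List

N_EMITTING = 3
N_STATES = N_EMITTING + 2
SELF_LOOP = 0.6

def left_to_right_transitions(self_loop: float = SELF_LOOP) -> List[List[float]]:
    a = [[0.0] * N_STATES for _ in range(N_STATES)]
    a[0][1] = 1.0  # the initial state always enters the first emitting state
    for s in range(1, N_EMITTING + 1):
        a[s][s] = self_loop            # stay in the same state
        a[s][s + 1] = 1.0 - self_loop  # advance to the next state
    return a                           # the final state has no outgoing transitions

A = left_to_right_transitions()
for row in A:
    print(" ".join(f"{p:.1f}" for p in row))
```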


PTM (Phonetic Tied-Mixture) model


Tatsuya Kawahara
5/31/2000