Acoustic Model

List of Acoustic Models

model            #states     #mixtures    gender
monophone        129         4, 8, 16     GD, GI
triphone 1000    1000        4, 8, 16     GD
triphone 2000    2000        4, 8, 16     GD, GI
triphone 3000    3000        4, 8, 16     GD
PTM triphone     3000/129    64           GD, GI

(GD: gender-dependent, GI: gender-independent)
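
To compare the relative sizes of the models in the list, the number of Gaussian components can be estimated as states times mixtures. The sketch below is only a rough illustration. It assumes every state carries the full number of mixtures, and for the PTM entry it assumes that "3000/129" means 3000 states sharing 129 codebooks of 64 Gaussians each, which is one reading of the list rather than something stated here. The names `MODELS` and `gaussian_count` are hypothetical.

```python
# Rough size estimate for the listed acoustic models (illustrative only).
# Assumption: every state (or codebook) holds exactly `mixtures` Gaussians.
# Assumption: for PTM, "3000/129" is read as 3000 states sharing 129
# codebooks, so Gaussians are counted per codebook, not per state.

MODELS = {
    # name: (number of Gaussian pools, mixture sizes listed for that model)
    "monophone": (129, (4, 8, 16)),
    "triphone 1000": (1000, (4, 8, 16)),
    "triphone 2000": (2000, (4, 8, 16)),
    "triphone 3000": (3000, (4, 8, 16)),
    "PTM triphone": (129, (64,)),
}


def gaussian_count(name, mixtures):
    """Return pools * mixtures for a listed model/mixture combination."""
    pools, sizes = MODELS[name]
    if mixtures not in sizes:
        raise ValueError(f"{name} is not listed with {mixtures} mixtures")
    return pools * mixtures


if __name__ == "__main__":
    for name, (_, sizes) in MODELS.items():
        for m in sizes:
            print(f"{name:14s} {m:3d} mixtures -> {gaussian_count(name, m):6d} Gaussians")
```

Under these assumptions the PTM model has far fewer Gaussians than a 16-mixture triphone model with the same number of states, because the Gaussians are shared across states.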

List of Japanese Phones

a i u e o a: i: u: e: o: N w y
p py t k ky b by d dy g gy ts ch
m my n ny h hy f s sh z j r ry
q sp silB silE (pauses)
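
As a quick consistency check, the phone list can be counted programmatically. The sketch below assumes that every entry, including the pause symbols, is modelled with three states. The source does not state this explicitly, but the resulting total matches the 129 states listed for the monophone model.

```python
# Phone inventory copied from the list above, split on whitespace.
PHONES = """
a i u e o a: i: u: e: o: N w y
p py t k ky b by d dy g gy ts ch
m my n ny h hy f s sh z j r ry
q sp silB silE
""".split()

# Assumption: three states per phone, pauses included.
STATES_PER_PHONE = 3

n_phones = len(PHONES)
n_states = n_phones * STATES_PER_PHONE

print(n_phones, n_states)  # prints "43 129"
```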

Training: ASJ (Acoustical Society of Japan) databases

20K sentences / 132 speakers for each gender

Acoustic Analysis

A/D            16 kHz, 16 bit
frame shift    10 ms
analysis       MFCC (12th order)
CMN            applied over the whole utterance
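
Cepstral mean normalization (CMN) over a whole utterance amounts to subtracting, for each cepstral coefficient, its mean over all frames of that utterance. The following is a minimal sketch in plain Python. The function name `cmn` and the toy two-dimensional input are hypothetical and are not taken from the toolkit.

```python
def cmn(frames):
    """Subtract the per-coefficient mean computed over the whole utterance.

    `frames` is a list of equal-length cepstral vectors, one per frame.
    """
    if not frames:
        return []
    n = len(frames)
    dim = len(frames[0])
    # Mean of each coefficient across all frames of the utterance.
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    return [[f[d] - mean[d] for d in range(dim)] for f in frames]


if __name__ == "__main__":
    utterance = [[1.0, 4.0], [2.0, 6.0], [3.0, 8.0]]  # toy 2-dim "cepstra"
    print(cmn(utterance))  # [[-1.0, -2.0], [0.0, 0.0], [1.0, 2.0]]
```

Because the mean is taken over the complete utterance, this form of CMN can only be applied once the whole utterance is available.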

pattern: MFCC (12) + $\Delta$MFCC (12) + $\Delta$LogPow (1) = 25 variables
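
The 25-variable pattern can be assembled per frame by concatenating the 12 MFCCs, their 12 deltas, and the delta of the log power. The sketch below uses a common regression formula for the deltas with a window of two frames on each side and replicated edge frames. The source does not specify how the deltas are computed, so the formula, the window size, and the names `deltas` and `build_features` are all assumptions.

```python
def deltas(seq, window=2):
    """Regression-based deltas of a sequence of vectors.

    Assumed formula: d_t = sum_k k * (c[t+k] - c[t-k]) / (2 * sum_k k^2),
    with the first and last frames replicated at the edges.
    """
    n = len(seq)
    dim = len(seq[0]) if n else 0
    denom = 2 * sum(k * k for k in range(1, window + 1))
    out = []
    for t in range(n):
        vec = []
        for d in range(dim):
            num = 0.0
            for k in range(1, window + 1):
                ahead = seq[min(t + k, n - 1)][d]
                behind = seq[max(t - k, 0)][d]
                num += k * (ahead - behind)
            vec.append(num / denom)
        out.append(vec)
    return out


def build_features(mfcc, log_pow, window=2):
    """Concatenate MFCC + delta MFCC + delta log power for every frame."""
    d_mfcc = deltas(mfcc, window)
    d_pow = deltas([[p] for p in log_pow], window)
    return [c + dc + dp for c, dc, dp in zip(mfcc, d_mfcc, d_pow)]


if __name__ == "__main__":
    # Toy input: 5 frames of 12 MFCCs that grow linearly with time.
    mfcc = [[float(t)] * 12 for t in range(5)]
    log_pow = [float(2 * t) for t in range(5)]
    feats = build_features(mfcc, log_pow)
    print(len(feats), len(feats[0]))  # prints "5 25"
```

Note that the log power itself is not part of the vector, only its delta, which is why the total is 25 rather than 26.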


left-to-right, 3 states (excluding initial and final states)
decision tree-based clustering:
    (21000 logical triphones) $\rightarrow$ (8000 physical triphones)
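
The left-to-right topology described above can be pictured as a transition matrix over five states: a non-emitting initial state, three emitting states, and a non-emitting final state. The sketch below builds such a matrix under two assumptions that are not stated in the source: there are no skip transitions, and every emitting state uses the same placeholder self-loop probability of 0.5 instead of a trained value. The function name is hypothetical.

```python
def left_to_right_transitions(n_emitting=3, self_loop=0.5):
    """Return an (n+2) x (n+2) transition matrix as nested lists.

    State 0 is the non-emitting initial state and state n+1 the
    non-emitting final state. Each emitting state either stays where
    it is or advances one state to the right (no skips assumed).
    """
    size = n_emitting + 2
    a = [[0.0] * size for _ in range(size)]
    a[0][1] = 1.0                      # initial -> first emitting state
    for s in range(1, n_emitting + 1):
        a[s][s] = self_loop            # stay in the same state
        a[s][s + 1] = 1.0 - self_loop  # advance to the next state
    return a


if __name__ == "__main__":
    for row in left_to_right_transitions():
        print(row)
```

All entries below the diagonal are zero, which is what makes the model left-to-right: once a state has been left, it cannot be re-entered.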

PTM (Phonetic Tied-Mixture) model

Tatsuya Kawahara