Acoustic Model

List of Acoustic Models

model            #states     #mixtures    gender
monophone        129         4, 8, 16     GD, GI
triphone 1000    1000        4, 8, 16     GD
triphone 2000    2000        4, 8, 16     GD, GI
triphone 3000    3000        4, 8, 16     GD
PTM triphone     3000/129    64           GD, GI

(GD: gender-dependent, GI: gender-independent)

List of Japanese Phones

a i u e o a: i: u: e: o: N w y
p py t k ky b by d dy g gy ts ch
m my n ny h hy f s sh z j r ry
q sp silB silE (pauses)
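
The phone list can be checked against the monophone entry in the model table. A minimal sketch, assuming every phone (pauses included) uses the 3-state topology; the names `PHONES` and `STATES_PER_PHONE` are illustrative and not taken from any toolkit:

```python
# Phone inventory copied from the list above (the "(pauses)" annotation is dropped).
PHONES = """
a i u e o a: i: u: e: o: N w y
p py t k ky b by d dy g gy ts ch
m my n ny h hy f s sh z j r ry
q sp silB silE
""".split()

# Assumption: each phone, pauses included, has 3 emitting states.
STATES_PER_PHONE = 3

num_phones = len(PHONES)
num_monophone_states = num_phones * STATES_PER_PHONE

print(num_phones, num_monophone_states)  # 43 129
```

Under this assumption, 43 phones x 3 states gives the 129 monophone states listed in the table.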

Training: ASJ (Acoustical Society of Japan) databases

20K sentences / 132 speakers for each gender

Acoustic Analysis

A/D            16 kHz, 16 bit
frame shift    10 ms
analysis       MFCC (12th order)
CMN            applied over the whole utterance
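
The framing arithmetic implied by these settings can be sketched as follows. The 25 ms frame length is an assumption (the slide gives only the frame shift), and `num_frames` is a hypothetical helper:

```python
SAMPLE_RATE_HZ = 16000   # from the slide: 16 kHz sampling
FRAME_SHIFT_MS = 10      # from the slide: 10 ms frame shift
FRAME_LENGTH_MS = 25     # assumption: not stated on the slide

# Integer arithmetic avoids floating-point rounding.
shift_samples = SAMPLE_RATE_HZ * FRAME_SHIFT_MS // 1000
length_samples = SAMPLE_RATE_HZ * FRAME_LENGTH_MS // 1000


def num_frames(num_samples: int) -> int:
    """Number of complete frames that fit into num_samples (no padding)."""
    if num_samples < length_samples:
        return 0
    return 1 + (num_samples - length_samples) // shift_samples


print(shift_samples)             # 160 samples per frame shift
print(num_frames(SAMPLE_RATE_HZ))  # 98 frames in one second of audio
```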

pattern: MFCC (12) + $\Delta$MFCC (12) + $\Delta$LogPow (1) = 25 variables
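
How the 25-dimensional pattern could be assembled, as a rough pure-Python sketch. The regression-style delta with a window of +/-2 frames and edge padding by repetition are assumptions (the slide does not specify how the deltas are computed), and `cmn`, `delta` and `build_pattern` are hypothetical helper names:

```python
def cmn(frames):
    """Cepstral mean normalization: subtract the whole-utterance mean per coefficient."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    return [[f[d] - mean[d] for d in range(dim)] for f in frames]


def delta(frames, window=2):
    """Regression delta over +/- window frames; edge frames are repeated as padding."""
    n = len(frames)
    dim = len(frames[0])
    denom = 2 * sum(k * k for k in range(1, window + 1))
    out = []
    for t in range(n):
        row = []
        for d in range(dim):
            acc = 0.0
            for k in range(1, window + 1):
                ahead = frames[min(t + k, n - 1)][d]
                behind = frames[max(t - k, 0)][d]
                acc += k * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def build_pattern(mfcc, log_pow):
    """mfcc: T frames x 12 coefficients, log_pow: T values -> T frames x 25 variables."""
    static = cmn(mfcc)                      # 12 static MFCCs after CMN
    d_mfcc = delta(static)                  # 12 delta MFCCs
    d_pow = delta([[p] for p in log_pow])   # 1 delta log power
    return [s + dm + dp for s, dm, dp in zip(static, d_mfcc, d_pow)]


# Toy input: linear ramps, so interior deltas equal the slope.
mfcc = [[float(t * (d + 1)) for d in range(12)] for t in range(10)]
log_pow = [float(t) for t in range(10)]
pattern = build_pattern(mfcc, log_pow)
print(len(pattern), len(pattern[0]))  # 10 25
```

Note that the static log power itself is not part of the pattern; only its delta is appended.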


left-to-right HMM, 3 states (excluding the initial and final states)
decision tree-based clustering:
    (logical triphone 21000) $\rightarrow$ (physical triphone 8000)

PTM (Phonetic Tied-Mixture) model

