2.3.2 Hidden Markov Models (HMMs)

The most flexible and successful approach to speech recognition so far has been Hidden Markov Models (HMMs). In this section we present the basic concepts of HMMs and describe the algorithms for training and using them. An HMM is a collection of states connected by transitions, as illustrated in Figure 2.8. It begins in a designated initial state. In each discrete time step, a transition is taken into a new state, and then one output symbol is generated in that state. The choices of transition and output symbol are both random, governed by probability distributions. The HMM can be thought of as a black box, where the sequence of output symbols generated over time is observable, but the sequence of states visited over time is hidden from view.
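This generative process can be sketched directly in code. The following is a minimal illustration, not a model from the text: the two states, two output symbols, and all probability values are hypothetical.

```python
import random

# Hypothetical two-state HMM with two output symbols (illustrative only).
states = ["S1", "S2"]
symbols = ["x", "y"]

# a[i][j]: probability of taking the transition from state i to state j.
a = {"S1": {"S1": 0.6, "S2": 0.4},
     "S2": {"S1": 0.3, "S2": 0.7}}

# b[j][k]: probability of generating symbol k while in state j.
b = {"S1": {"x": 0.9, "y": 0.1},
     "S2": {"x": 0.2, "y": 0.8}}

def generate(length, seed=0):
    """Run the HMM as a black box: the returned symbol sequence is
    observable, but the sequence of states visited stays hidden."""
    rng = random.Random(seed)
    state = "S1"  # designated initial state
    observed = []
    for _ in range(length):
        # Each time step: take a random transition, then emit a random symbol.
        state = rng.choices(states,
                            weights=[a[state][s] for s in states])[0]
        observed.append(rng.choices(symbols,
                                    weights=[b[state][k] for k in symbols])[0])
    return observed

print(generate(5))
```

Only the output of `generate` is visible to an observer; the local variable `state` plays the role of the hidden state sequence.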

Formally, an HMM consists of the following elements:

{s} = a set of states.
{a_ij} = a set of transition probabilities, where a_ij is the probability of taking the transition from state i to state j.
{b_i(u)} = a set of emission probabilities, where b_i(u) is the probability of emitting symbol u while in state i.

Since a and b are both probabilities, they must satisfy the following properties:

a_ij >= 0 and b_i(u) >= 0, for all i, j, u.
sum over j of a_ij = 1, for all i.
sum over u of b_i(u) = 1, for all i.
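These stochastic constraints are easy to verify mechanically. A small sketch, using hypothetical parameter values for a two-state, two-symbol HMM:

```python
# Hypothetical parameters for a two-state, two-symbol HMM.
a = [[0.6, 0.4],
     [0.3, 0.7]]   # a[i][j] = P(transition from state i to state j)
b = [[0.9, 0.1],
     [0.2, 0.8]]   # b[i][u] = P(emitting symbol u in state i)

def is_stochastic(rows, tol=1e-9):
    """Check the HMM constraints: every entry is a probability,
    and every row sums to 1 (within floating-point tolerance)."""
    return all(
        all(0.0 <= p <= 1.0 for p in row) and abs(sum(row) - 1.0) <= tol
        for row in rows
    )

assert is_stochastic(a) and is_stochastic(b)
```

Any parameter update produced during training must preserve these row-sum constraints, so a check like this is a useful sanity test.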

These constraints limit the number of trainable parameters and make the training and testing algorithms very efficient, rendering HMMs useful for speech recognition. There are three basic algorithms associated with Hidden Markov Models: the forward algorithm, the Viterbi algorithm, and the forward-backward algorithm, used for scoring, alignment, and training, respectively.
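As a preview of the first of these, the forward algorithm computes the probability that a model generated a given observation sequence by summing over all hidden state paths. A minimal sketch, using one common formulation in which a symbol is also emitted in the initial state, with hypothetical parameters (symbols encoded as integers 0 and 1):

```python
def forward(obs, pi, a, b):
    """P(obs | model), summing over all hidden state sequences.
    pi: initial state distribution; a: transition probabilities;
    b: emission probabilities, indexed by integer-coded symbols."""
    n = len(pi)
    # alpha[j] = P(observations so far, and currently in state j)
    alpha = [pi[j] * b[j][obs[0]] for j in range(n)]
    for t in range(1, len(obs)):
        alpha = [sum(alpha[i] * a[i][j] for i in range(n)) * b[j][obs[t]]
                 for j in range(n)]
    return sum(alpha)

# Hypothetical two-state model.
pi = [1.0, 0.0]                     # always start in state 0
a = [[0.6, 0.4], [0.3, 0.7]]        # transition probabilities
b = [[0.9, 0.1], [0.2, 0.8]]        # emission probabilities
print(forward([0, 1, 0], pi, a, b))
```

Because the recursion sums over state paths incrementally, the cost is linear in the sequence length rather than exponential in it, which is what makes scoring practical.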





Jo Chul-Ho
Wed Oct 13 17:59:27 JST 1999