Continuous Speech

Isolated speech recognition is relatively easy because word boundaries are detectable and words tend to be cleanly pronounced. Continuous speech is more difficult because word boundaries are unclear and the pronunciation of each word is distorted by coarticulation with its neighbors. The Nozomi decoder (M. Schuster at ATR, 1998) was tested on a Japanese Newspaper Dictation Task, which consists of continuous speech, with a 5,000-word vocabulary; it achieved more than 95% word accuracy in nearly real time on a 300 MHz Pentium II [11].
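For readers unfamiliar with the metric, word accuracy is commonly defined as (N - S - D - I) / N, where N is the number of words in the reference transcript and S, D, and I are the substitutions, deletions, and insertions in the recognizer output. The following is a minimal sketch of that common definition, assuming whitespace-separated words and unit edit costs. The function name `word_accuracy` is hypothetical, and the source does not state how the figure in [11] was scored.

```python
def word_accuracy(reference, hypothesis):
    """Return (N - S - D - I) / N for two lists of words.

    Assumption: S + D + I is taken as the minimum word-level edit
    distance (unit costs), which is the usual convention but is not
    confirmed by the source for the result cited above.
    """
    n, m = len(reference), len(hypothesis)
    if n == 0:
        raise ValueError("reference must contain at least one word")

    # dist[i][j] = minimum edits turning reference[:i] into hypothesis[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i  # delete all i reference words
    for j in range(1, m + 1):
        dist[0][j] = j  # insert all j hypothesis words

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            mismatch = reference[i - 1] != hypothesis[j - 1]
            dist[i][j] = min(
                dist[i - 1][j - 1] + mismatch,  # match or substitution
                dist[i - 1][j] + 1,             # deletion
                dist[i][j - 1] + 1,             # insertion
            )

    return (n - dist[n][m]) / n


# One substitution (sat -> sit) and one deletion (the) in six words.
ref = "the cat sat on the mat".split()
hyp = "the cat sit on mat".split()
print(word_accuracy(ref, hyp))
```

Because insertions are counted as errors, this measure can fall below zero when the recognizer emits many spurious words, which is one reason it is reported separately from the simpler percent-correct figure.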

Substantial progress has been made every year in the basic technology, steadily lowering the barriers to large vocabularies, speaker independence, and continuous speech.

Jo Chul-Ho
Wed Oct 13 17:59:27 JST 1999