Training Methods

The most primitive method is to display acoustic features such as the waveform, spectrogram, pitch, and power on the screen and present them as feedback (LEVEL 1). This technique is already in clinical use, but it has several substantial problems. First, learners with no knowledge of acoustics find it very difficult to read and interpret the visualized acoustic information. Second, a human supervisor is usually needed to analyze the learner's pronunciation and give advice. Third, there is no simple correspondence between acoustic features and speech production.
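To make the LEVEL 1 features concrete, the following is a minimal sketch of how two of them, power and pitch, might be computed frame by frame before being plotted. It is an illustration only, not the system described here: the function names, the frame sizes, the 75-400 Hz search range, and the use of a plain autocorrelation peak for pitch are all assumptions chosen for brevity.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a list of samples into overlapping frames of frame_len samples."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_time_power(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def autocorr_pitch(frame, sample_rate, fmin=75.0, fmax=400.0):
    """Estimate F0 (Hz) as the lag with the largest autocorrelation.

    fmin and fmax bound the search range; both are assumed values.
    Returns 0.0 when no positive autocorrelation peak is found.
    """
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_val = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        val = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return sample_rate / best_lag if best_lag else 0.0

# Synthetic test signal: a 200 Hz sine sampled at 8 kHz (hypothetical input).
sample_rate = 8000
samples = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(800)]
frames = frame_signal(samples, frame_len=400, hop=200)
contour = [(short_time_power(f), autocorr_pitch(f, sample_rate)) for f in frames]
```

Each entry of `contour` is a (power, pitch) pair for one frame. Plotting these values over time gives exactly the kind of display that, as noted above, a learner without acoustic training cannot easily interpret.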

A somewhat improved method is to perform automatic segmentation on the visualized acoustic features so that learners can easily compare their pronunciation with the native model (LEVEL 2). This method helps learners who have no knowledge of acoustics; however, they still cannot make corrections, because the display does not tell them ``what sound is wrong and what to do to correct it''.
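One way to line up a learner's utterance against a native model for such a comparison is dynamic time warping (DTW). The sketch below is a swapped-in illustration, not the segmentation method referred to here: it assumes each utterance has already been reduced to a one-dimensional feature contour (for example, pitch per frame), and the function name and sample data are hypothetical.

```python
def dtw_align(a, b):
    """Align 1-D sequences a and b by DTW.

    Returns (total_cost, path), where path is a list of (i, j) index
    pairs matching a[i] to b[j], from the first frames to the last.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances alone
                                 cost[i][j - 1],      # b advances alone
                                 cost[i - 1][j - 1])  # both advance
    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path

# Hypothetical contours: the learner holds two sounds longer than the model.
native = [0, 1, 2, 1, 0]
learner = [0, 0, 1, 2, 2, 1, 0]
total_cost, path = dtw_align(native, learner)
```

The path shows which learner frames correspond to which model frames, so the two displays can be shown side by side on a common time axis. It still reports only where the contours differ, not how to correct the difference, which is the limitation of LEVEL 2 noted above.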

An ideal method is to give instruction that approximates that of a human teacher (LEVEL 3). More specifically, the learners' pronunciation is evaluated in terms of sound quality, rhythm, accent, and intonation, and a visual representation of their articulatory behavior is then presented as feedback. This method is thought to be practical and helpful, but many technological problems that make its implementation difficult have been identified.



Jo Chul-Ho
Wed Oct 13 17:59:27 JST 1999