3.5 Pronunciation Error Detection using Statistical Threshold Scores

Comparison with native speakers is necessary to establish the norm ranges required for detecting a learner's errors. The scores were computed by automatic scoring against native acoustic models, and they represent the degree of match between the non-native speech and the native models: the higher the score, the better the speech fits the phoneme models and the better the perceived quality[23]. To detect pronunciation errors automatically, however, we must define what degree of match is acceptable. We therefore investigate several techniques for automatic pronunciation error detection. Such techniques are usually based on empirically derived thresholds on the native speakers' scores. Some common variations are possible here, depending on whether the norm is taken from a single native model speaker or from a group of native speakers. Some researchers have argued that an announcer, who is generally trained in the standard language, is the desirable reference for defining norm-range scores. In the experiments, we tried several possible methods for each task. The effectiveness of these techniques can be evaluated by their correlation with human judgement on the training speech of non-native learners.
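As an illustration only, a threshold of this kind could be sketched in Python as follows. The rule used here (mean of the native scores minus k standard deviations), the value k = 2, the function names, and all score and rating values are assumptions made for the example; they are not the thresholds or data used in this work.

```python
from statistics import mean, stdev


def norm_threshold(native_scores, k=2.0):
    """Lower bound of the norm range: mean - k * sample std of native scores.

    The mean-minus-k-sigma rule and the default k are assumptions.
    """
    return mean(native_scores) - k * stdev(native_scores)


def detect_errors(learner_scores, threshold):
    """Flag a segment as an error when its score falls below the threshold."""
    return [score < threshold for score in learner_scores]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical scores (e.g. log-likelihood-like values) from native speakers.
native_scores = [-4.0, -5.0, -6.0, -5.0, -5.0]
# Hypothetical scores for four segments of one learner.
learner_scores = [-5.2, -7.0, -6.0, -9.1]
# Hypothetical human ratings of the same segments (higher is better).
human_ratings = [4, 2, 3, 1]

threshold = norm_threshold(native_scores)
flags = detect_errors(learner_scores, threshold)
agreement = pearson(learner_scores, human_ratings)

print(f"threshold = {threshold:.3f}")
print(f"error flags = {flags}")
print(f"correlation with human ratings = {agreement:.3f}")
```

Taking the norm from a single model speaker instead of a group would only change which scores are passed to norm_threshold; the detection and correlation steps stay the same.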



Jo Chul-Ho
Wed Oct 13 17:59:27 JST 1999