One development goal of the CAPL system is to integrate resources and expertise ranging from speech recognition to pedagogical design in order to support automatic pronunciation learning. Another goal is to exploit state-of-the-art speech recognition technology for automatic pronunciation scoring of learners' speech, providing feedback that is validated by human expert raters. A schematic diagram of the CAPL system is shown in Figure 2.11.
Figure 2.11: Schematic diagram of the CAPL system
The system consists of several modules for training two aspects of pronunciation: the segmental and the suprasegmental. Our emphasis has been on the segmental rather than the suprasegmental domain. This does not mean that segmental problems are more important, only that they are theoretically better established, whereas the suprasegmental aspects still require extensive work. The system includes the traditional training methods discussed in section 1.1.3 and, in addition, offers novel methods that we discuss in detail in the next three chapters: automatic pronunciation scoring and error detection (chapter 3), articulation instruction (chapter 4), and speech rhythm measurement and error detection (chapter 5).
The non-native learners of Japanese who took part in the experiment are listed in Table 2.3. The learners were selected according to variables such as language background, age, and gender. Their speech was processed under the acoustic feature extraction conditions shown in Table 2.4.
Table 2.3: Non-native learners of Japanese involved in the experiment
| ID | M/F | AGE | NATIVE-BORN COUNTRY | JAPANESE |
|----|-----|-----|---------------------|----------|
| A | M | 24 | Korea | 1 YR. |
| B | M | 28 | China | 1 YR. |
| C | M | 26 | Taiwan | 1 YR. |
| D | M | 26 | France | 3 YRS. |
| E | M | 28 | Canada | 2 YRS. |
| F | M | 19 | Kazakhstan | 4 YRS. |
| G | M | 34 | Indonesia | 3 YRS. |
| H | M | 36 | Kenya | 3 YRS. |
(JAPANESE gives the length of time each learner has been studying Japanese as an L2.)
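Table 2.4: Acoustic feature extraction conditions
| PARAMETER | VALUE |
|-----------|-------|
| SAMPLING FREQUENCY, QUANTIZATION | 16 kHz, 16 bits |
| WINDOW LENGTH, TYPE | 25 ms, Hamming |
| FRAME PERIOD | 10 ms |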
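
To make the conditions in Table 2.4 concrete, the following is a minimal sketch of how a 16 kHz signal could be cut into 25 ms Hamming-windowed frames every 10 ms. It is an illustration only, not the CAPL implementation: the function names (`hamming`, `frame_signal`), the symmetric window formula, and the choice to drop trailing samples that do not fill a whole frame are assumptions, and the feature type computed from each frame is not specified in this section, so it is left out.

```python
import math

# Values taken from Table 2.4.
SAMPLE_RATE_HZ = 16000   # sampling frequency
WINDOW_MS = 25           # analysis window length
FRAME_PERIOD_MS = 10     # shift between successive frames

# Derived sizes in samples: 25 ms -> 400 samples, 10 ms -> 160 samples.
WINDOW_LEN = SAMPLE_RATE_HZ * WINDOW_MS // 1000
FRAME_SHIFT = SAMPLE_RATE_HZ * FRAME_PERIOD_MS // 1000


def hamming(n_points):
    """Return a symmetric Hamming window of n_points coefficients."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_points - 1))
            for n in range(n_points)]


def frame_signal(samples):
    """Split samples into overlapping, Hamming-windowed frames.

    Assumption: trailing samples that do not fill a complete
    window are discarded rather than zero-padded.
    """
    window = hamming(WINDOW_LEN)
    frames = []
    start = 0
    while start + WINDOW_LEN <= len(samples):
        frames.append([samples[start + i] * window[i]
                       for i in range(WINDOW_LEN)])
        start += FRAME_SHIFT
    return frames


if __name__ == "__main__":
    one_second = [1.0] * SAMPLE_RATE_HZ   # placeholder signal, not real speech
    frames = frame_signal(one_second)
    print(len(frames), "frames of", len(frames[0]), "samples each")
```

Under these assumptions, successive frames overlap by 15 ms (240 samples), so each stretch of speech contributes to more than one frame.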