Inter-Speaker Interaction in Speech Rhythm

Kuniko KAKITA

Department of Liberal Arts and Sciences, Faculty of Engineering,
Toyama Prefectural University
Kosugi-machi, Imizu-gun, Toyama 939-03, JAPAN
e-mail: kakita@pu-toyama.ac.jp

The ultimate goal of the present study is to see how one speaker's speech rhythm is affected by another speaker's speech rhythm.
Utterance texts consisting of five short sentences were read by five subjects in two different ways. In one condition (Session A), a series of five sentences was read through by a single subject; in the other condition (Session B), the subject took over from the preceding speaker after the third sentence and read the remaining sentences. The durations of the sentences and intersentence intervals following the takeover were compared with those at corresponding locations in single speaker readings. [For details of the experimental design and data recording, see, for example, Kakita 1994, Proc. of the ICSLP 94, 131-134.]
Preliminary results (Kakita, 1994, Juuten-Ryouiki 'Onsei Taiwa' Seika Houkoku, 67-73) showed that both the sentence duration and the interval duration following the takeover (the 'post-takeover' duration) deviated from one's 'preferred' duration obtained from single speaker readings (the 'inherent' duration). The results also revealed that, for all subjects, the deviation was mostly assimilative, and that for both sentences and intervals, the 'post-takeover' duration settled somewhere between the subject's 'inherent' duration and the preceding speaker's 'inherent' duration.
It is extremely interesting that all the subjects' 'post-takeover' durations became closer to the preceding speaker's values but not quite to the extent of completely assimilating to the preceding speaker. This observation leads one to hypothesize that, when the subjects are exposed to the preceding speaker's speech, they reorganize their speech according to two competing criteria, 'their own temporal criterion' and 'the preceding speaker's temporal criterion'. Inter-speaker interaction, then, may tentatively be defined as the reorganization of one's speech with reference to one's own speech production framework, on one hand, and to the speech production framework of another speaker, on the other.
In the present study, the durations of sentences and intervals were reexamined from two different perspectives on inter-speaker interaction. The first asks how close the subject's durational values came to the preceding speaker's value. This was called 'approximation', and was defined as the difference between the preceding speaker's 'inherent' duration and the subject's 'post-takeover' duration. The second asks how much the subject's 'post-takeover' value deviated from the subject's own 'inherent' value. This was called 'deviation', and was defined as the difference between the subject's 'inherent' duration and the subject's 'post-takeover' duration. It was argued that if the reorganization of sentence and interval duration under speaker interaction was governed primarily by 'one's own temporal criterion', one would more likely find invariance in 'deviation', whereas if it was governed primarily by 'the other speaker's criterion', one would more likely find invariance in 'approximation'.
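The two measures defined above, together with the 'inherent difference' used later in the analysis, can be sketched as follows. This is only an illustration: the function name, the field names, the sign conventions, and the example durations (in seconds) are assumptions, since the text defines each measure as a "difference" without fixing a sign.

```python
from dataclasses import dataclass


@dataclass
class TakeoverMeasures:
    approximation: float        # distance of the post-takeover value from the preceding speaker
    deviation: float            # distance of the post-takeover value from the subject's own value
    inherent_difference: float  # distance between the two speakers' inherent values


def takeover_measures(own_inherent: float,
                      preceding_inherent: float,
                      post_takeover: float) -> TakeoverMeasures:
    """Compute the three measures for one sentence or one interval.

    Sign conventions are assumed (not stated in the text):
      approximation       = post_takeover - preceding_inherent
      deviation           = own_inherent  - post_takeover
      inherent_difference = own_inherent  - preceding_inherent
    """
    return TakeoverMeasures(
        approximation=post_takeover - preceding_inherent,
        deviation=own_inherent - post_takeover,
        inherent_difference=own_inherent - preceding_inherent,
    )


# Hypothetical durations in seconds, not data from the study.
m = takeover_measures(own_inherent=2.0, preceding_inherent=1.5, post_takeover=1.75)
print(m.approximation, m.deviation, m.inherent_difference)
```

Under these assumed signs, 'approximation' and 'deviation' always sum to 'inherent difference', so a post-takeover duration lying between the two inherent durations splits the inherent difference into two parts of the same sign.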
The simple measures of 'approximation' and 'deviation', per se, did not provide any clues as to which factor was invariant in the reorganization of speech under interaction. However, when each of the two parameters, 'approximation' and 'deviation', was reanalyzed in relation to 'inherent difference' (the difference between the subject's 'inherent' duration and the preceding speaker's 'inherent' duration), the results showed that sentences and intervals differed characteristically in the way 'approximation' and 'deviation' were correlated with 'inherent difference'. For sentences, high positive correlation was observed both between 'approximation' and 'inherent difference' (r = 0.699*) and between 'deviation' and 'inherent difference' (r = 0.593*). For intervals, however, the correlation between 'approximation' and 'inherent difference' was very high (r = 0.702*) but the correlation between 'deviation' and 'inherent difference' was extremely low (r = 0.044). [*: Significant at the 0.01 level.]
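The kind of correlation analysis reported above can be sketched as follows. The text does not say which correlation coefficient was used, so Pearson's r is an assumption here. The function name and the example values are hypothetical and are not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-takeover values in seconds, one entry per subject/speaker pairing.
inherent_diff = [-0.4, -0.1, 0.2, 0.5]
approximation = [-0.3, -0.1, 0.1, 0.4]
deviation = [-0.1, 0.0, 0.1, 0.1]

r_approx = pearson_r(approximation, inherent_diff)
r_dev = pearson_r(deviation, inherent_diff)
print(f"approximation vs inherent difference: r = {r_approx:.3f}")
print(f"deviation vs inherent difference:     r = {r_dev:.3f}")
```

In such a sketch the coefficients would be computed separately for sentences and for intervals, which is the comparison the reported r values summarize.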
The fact that 'approximation' and 'deviation', for sentence duration, manifested equally high correlation with 'inherent difference' was interpreted to suggest that the reorganization of sentence duration was governed both by 'one's own temporal criterion' and by 'the preceding speaker's criterion'. The fact that, for interval duration, the correlation between 'approximation' and 'inherent difference' was high whereas that between 'deviation' and 'inherent difference' was low, was interpreted to suggest that the reorganization of interval duration was governed primarily by 'the preceding speaker's criterion' and that 'one's own criterion' did not play a significant role in the reorganization of interval duration.
It is also interesting that the correlation between 'deviation' and 'inherent difference' was high for sentences but low for intervals. This can be explained in terms of the difference in the production of sentences and intervals. The production of sentences (utterances) involves active movements of the articulators and is also accompanied by aural and tactile feedback. It would be relatively easy to establish and maintain 'one's own temporal criterion' in the presence of inter-speaker interaction. Hence the high correlation. In contrast, during intervals, there are no such 'active' movements or feedback. It would be more difficult to maintain 'one's own temporal criterion', or to put it differently, it would be easier for interval duration to 'deviate' from the 'inherent' value. Hence the low correlation.
Finally, the findings of the present study may be summarized as follows. The durations of the sentences and intervals were reexamined with respect to two different ways of looking at inter-speaker interaction, 'approximation' and 'deviation'. When each of the two parameters was reexamined in relation to 'inherent difference', the results showed that sentences and intervals differed characteristically in the way 'approximation' and 'deviation' were correlated with 'inherent difference'. For sentences, high positive correlation was observed both between 'approximation' and 'inherent difference' and between 'deviation' and 'inherent difference'. For intervals, however, the correlation between 'approximation' and 'inherent difference' was very high but the correlation between 'deviation' and 'inherent difference' was extremely low. These results were interpreted to suggest that the reorganization of sentence duration is governed both by 'one's own temporal criterion' and by 'the preceding speaker's criterion', whereas the reorganization of interval duration is governed primarily by 'the preceding speaker's criterion', with 'one's own criterion' playing no significant role in determining the degree of interaction in interval production.

Keywords: Inter-speaker interaction, utterance duration, pause duration, temporal framework, speech re-organization