Inter-Speaker Interaction in Speech Rhythm
Kuniko KAKITA
Department of Liberal Arts and Sciences, Faculty of Engineering,
Toyama Prefectural University
Kosugi-machi, Imizu-gun, Toyama 939-03, JAPAN
e-mail: kakita@pu-toyama.ac.jp
The ultimate goal of the present study is to see how one speaker's
speech rhythm is affected by another speaker's speech rhythm.
Utterance texts consisting of five short sentences were read by five
subjects in two different ways. In one condition (Session A), a series
of five sentences was read through by a single subject; in the other
condition (Session B), the subject took over from the preceding
speaker after the third sentence and read the remaining sentences. The
durations of the sentences and intersentence intervals following the
takeover were compared with those at corresponding locations in single
speaker readings. [For details of the experimental design and data
recording, see, for example, Kakita 1994, Proc. of the ICSLP 94,
131-134.]
Preliminary results (Kakita, 1994, Juuten-Ryouiki 'Onsei Taiwa' Seika
Houkoku, 67-73) showed that both the sentence duration and the
interval duration following the takeover (the 'post-takeover'
duration) deviated from one's 'preferred' duration obtained from
single speaker readings (the 'inherent' duration). The results also
revealed that, for all subjects, the deviation was mostly
assimilative, and that for both sentences and intervals, the
'post-takeover' duration settled somewhere between the subject's
'inherent' duration and the preceding speaker's 'inherent' duration.
Notably, all the subjects' 'post-takeover' durations moved closer to
the preceding speaker's values, but stopped short of assimilating
completely to the preceding speaker. This observation leads one to
hypothesize that, when the
subjects are exposed to the preceding speaker's speech, they
reorganize their speech according to two competing criteria, 'their
own temporal criterion' and 'the preceding speaker's temporal
criterion'. Inter-speaker interaction, then, may tentatively be
defined as the reorganization of one's speech with reference to one's
own speech production framework, on one hand, and to the speech
production framework of another speaker, on the other.
In the present study, the duration of sentences and intervals was
reexamined with respect to two different ways of looking at
inter-speaker interaction. One was to see how close the subject's
durational values came to the preceding speaker's values. This was
called 'approximation', and was defined as the difference between the
preceding speaker's 'inherent' duration and the subject's
'post-takeover' duration. The other way of looking at the result of
interaction was to see how much the subject's 'post-takeover' value
deviated from the subject's own 'inherent' value. This was called
'deviation', and was defined as the difference between the subject's
'inherent' duration and the subject's 'post-takeover' duration. It was
argued that if
reorganization of sentence and interval duration (under speaker
interaction) was governed primarily by 'one's own temporal criterion',
one would more likely find invariance in 'deviation', whereas if it
was governed primarily by 'the other speaker's criterion', one would
more likely find invariance in 'approximation'.
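The two measures, together with the 'inherent difference' used later
in the analysis, can be illustrated with a minimal Python sketch. The
class and function names, the use of milliseconds, the sample values,
and in particular the choice of absolute (unsigned) differences are
assumptions made for illustration; the text does not state whether
the differences were signed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Durations:
    """Durations (ms) for one sentence or interval; names are hypothetical."""
    subject_inherent: float    # subject's duration in single-speaker reading
    preceding_inherent: float  # preceding speaker's single-speaker duration
    post_takeover: float       # subject's duration right after the takeover


def approximation(d: Durations) -> float:
    # Distance from the preceding speaker's inherent duration.
    # Assumption: absolute difference.
    return abs(d.preceding_inherent - d.post_takeover)


def deviation(d: Durations) -> float:
    # Distance from the subject's own inherent duration.
    # Assumption: absolute difference.
    return abs(d.subject_inherent - d.post_takeover)


def inherent_difference(d: Durations) -> float:
    # Distance between the two speakers' inherent durations.
    return abs(d.subject_inherent - d.preceding_inherent)


def is_partial_assimilation(d: Durations) -> bool:
    # True when the post-takeover value lies strictly between the two
    # inherent values, i.e. assimilative but not complete.
    low, high = sorted((d.subject_inherent, d.preceding_inherent))
    return low < d.post_takeover < high


# Hypothetical values, not data from the study.
example = Durations(subject_inherent=2000, preceding_inherent=2600,
                    post_takeover=2200)
print(approximation(example), deviation(example),
      inherent_difference(example), is_partial_assimilation(example))
```

Under the absolute-difference assumption, whenever the 'post-takeover'
value lies between the two 'inherent' values, 'approximation' and
'deviation' sum to the 'inherent difference'.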
The simple measures of 'approximation' and 'deviation', per se, did
not provide any clues as to what was the invariant factor in the
reorganization of speech under interaction. However, when each of the
two parameters, 'approximation' and 'deviation', was reanalyzed in
relation to 'inherent difference' (the difference between the
subject's 'inherent' duration and the preceding speaker's 'inherent'
duration), the results showed that sentences and intervals differed
characteristically in the way 'approximation' and 'deviation' were
correlated with 'inherent difference'. For sentences, high positive
correlation was observed both between 'approximation' and 'inherent
difference' (r: 0.699*) and between 'deviation' and 'inherent
difference' (r: 0.593*). For intervals, however, the correlation
between 'approximation' and 'inherent difference' was very high (r:
0.702*) but the correlation between 'deviation' and 'inherent
difference' was extremely low (r: 0.044). [*: Significant at the 0.01
level].
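The correlation analysis described above can be sketched as follows.
This is only an illustration: the text does not name the correlation
coefficient used, so Pearson's r is an assumption, as are the absolute
differences, the function name pearson_r, and the trial values, which
are invented and are not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with >= 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical trials in ms:
# (subject inherent, preceding speaker inherent, subject post-takeover)
trials = [
    (2000, 2600, 2200),
    (2400, 2000, 2300),
    (2100, 2500, 2350),
    (2700, 2200, 2450),
]

# Assumption: all three measures are absolute differences.
inherent_diff = [abs(s - p) for s, p, _ in trials]
approx = [abs(p - t) for _, p, t in trials]
dev = [abs(s - t) for s, _, t in trials]

r_approx = pearson_r(approx, inherent_diff)
r_dev = pearson_r(dev, inherent_diff)
print(f"approximation vs inherent difference: r = {r_approx:.3f}")
print(f"deviation vs inherent difference:     r = {r_dev:.3f}")
```

Running the same computation separately on sentence durations and on
interval durations would give the two pairs of coefficients that the
analysis compares.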
The fact that, for sentence duration, both 'approximation' and
'deviation' correlated highly with 'inherent difference' was
interpreted to suggest that the reorganization of sentence duration
was governed both by 'one's own temporal criterion' and by 'the
preceding speaker's criterion'. The fact that, for interval duration,
the correlation between 'approximation' and 'inherent difference' was
high whereas that between 'deviation' and 'inherent difference' was
low, was interpreted to suggest that the reorganization of interval
duration was governed primarily by 'the preceding speaker's criterion'
and that 'one's own criterion' did not play a significant role in the
reorganization of interval duration.
It is also interesting that the correlation between 'deviation' and
'inherent difference' was high for sentences but low for intervals.
This can be explained in terms of the difference in the production of
sentences and intervals. The production of sentences (utterances)
involves active movements of the articulators and is also accompanied
by aural and tactile feedback. It would be relatively easy to
establish and maintain 'one's own temporal criterion' in the presence
of inter-speaker interaction. Hence the high correlation. In contrast,
during intervals, there are no such 'active' movements or feedback. It
would be more difficult to maintain 'one's own temporal criterion', or
to put it differently, it would be easier for interval duration to
'deviate' from the 'inherent' value. Hence the low correlation.
Finally, the findings of the present study may be summarized as
follows. The durations of the sentences and intervals were reexamined
with respect to two different ways of looking at inter-speaker
interaction, 'approximation' and 'deviation'. When each of the two
parameters was reexamined in relation to 'inherent difference', the
results showed that sentences and intervals differed
characteristically in the way 'approximation' and 'deviation' were
correlated with 'inherent difference'. For sentences, high positive
correlation was observed both between 'approximation' and 'inherent
difference' and between 'deviation' and 'inherent difference'. For
intervals, however, the correlation between 'approximation' and
'inherent difference' was very high but the correlation between
'deviation' and 'inherent difference' was extremely low. These results
were interpreted to suggest that the reorganization of sentence
duration is governed both by 'one's own temporal criterion' and by
'the preceding speaker's criterion', whereas the reorganization of
interval duration is governed primarily by 'the preceding speaker's
criterion', with 'one's own criterion' playing no significant role in
determining the degree of interaction in interval production.
Keywords: Inter-speaker interaction, utterance duration, pause duration, temporal framework, speech re-organization