In our system, a mora duration is measured relatively, as the ratio of
its length to the length of the word, rather than as an absolute
length in milliseconds. In related work, a mora duration is generally
measured statistically in milliseconds (msec)[36]. However, Dauer
(1983) claimed that people tend to perform rhythmic tasks within a
limited time at their own preferred rate[32]. It is believed
that mora rhythm is perceived through the relative duration of each
mora within a word, not its absolute duration. Accordingly, we
measured the ratio of mora length to word length as follows:

$$\text{relative mora duration} = \frac{\text{mora length}}{\text{word length}}$$
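As a minimal sketch of this relative measurement, the following Python snippet divides each mora duration by the duration of the whole word. The function name `relative_durations` and the sample durations are hypothetical, and the sketch assumes that the word length equals the sum of its mora durations (no pauses inside the word), which the text does not state explicitly.

```python
def relative_durations(mora_ms):
    """Return each mora's share of the word duration.

    Assumption: the word length is the sum of its mora durations.
    """
    total = sum(mora_ms)
    if total <= 0:
        raise ValueError("word duration must be positive")
    return [d / total for d in mora_ms]


# Hypothetical durations (msec) of the same three-mora word,
# spoken by a slower and a faster speaker.
slow = [120.0, 80.0, 200.0]
fast = [90.0, 60.0, 150.0]

print(relative_durations(slow))
print(relative_durations(fast))
```

In this made-up example the two speakers differ in every absolute duration, yet their relative durations coincide, which is the effect the relative measurement is meant to capture.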
Figure 5.9 illustrates this with an example: most native speakers
show the same V-shaped rhythm pattern even though their durations in
milliseconds differ slightly from one another (left), and these
absolute gaps are reduced by the relative measurement (right).
Furthermore, we investigated the effectiveness of the relative
measurement on the ATR database. We calculated the ratio of the
standard deviation to the mean for both the millisecond durations and
the relative ratios, based on the calculation:

$$\frac{\text{standard deviation}}{\text{mean}}$$
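The comparison can be sketched in Python as follows. This is only an illustration under stated assumptions: the data are invented rather than taken from the ATR database, the statistic is computed for a single mora position across speakers, and the population standard deviation (`statistics.pstdev`) is used, since the text does not say which variant was applied. The helper name `coefficient_of_variation` is likewise hypothetical.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Ratio of the standard deviation to the mean.

    Assumption: population standard deviation; the source does not
    specify population vs. sample.
    """
    m = mean(values)
    if m == 0:
        raise ValueError("mean must be non-zero")
    return pstdev(values) / m


# Hypothetical mora durations (msec) of one word from three speakers.
words_ms = [
    [120.0, 80.0, 200.0],
    [95.0, 60.0, 145.0],
    [150.0, 105.0, 245.0],
]

# First mora of each utterance, measured both ways.
first_mora_ms = [w[0] for w in words_ms]
first_mora_ratio = [w[0] / sum(w) for w in words_ms]

cv_ms = coefficient_of_variation(first_mora_ms)
cv_ratio = coefficient_of_variation(first_mora_ratio)
print(f"msec: {cv_ms:.3f}, ratio: {cv_ratio:.3f}")
```

A smaller value for the ratio-based measure than for the millisecond-based one would indicate that the relative measurement reduces speaker-dependent spread, as it does for this invented data.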
As shown in Figure 5.10, we found that the distribution of the ratio-based measure is more normalized than that of the millisecond-based measure, which supports our method.
Figure 5.9: Absolute measurement vs. relative measurement over segment durations
Figure 5.10: Comparison between relative measurement and absolute measurement