Speech researchers have relied heavily upon spectrum analysis
techniques since the late 1930s, when the sound spectrograph was
invented: a device that translates a sound into a visual
representation of its component frequencies. This reliance rests on
the assumption that each phoneme is distinguishable by its own unique
pattern in the spectrogram. For voiced phonemes, the signature involves
large concentrations of energy called formants, together with a
characteristic waxing and waning of energy across all frequencies,
which is the most salient characteristic of what we call the human voice.
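As a rough illustration of how a sound is translated into a representation of its component frequencies, the sketch below computes a spectrogram with a short-time Fourier transform. It is only a minimal sketch, not the method used by the original spectrograph: the function name `spectrogram`, the frame length, the hop size, and the Hann window are all assumptions chosen for the example, and the two steady tones are synthetic stand-ins for concentrations of energy, not real formants.

```python
import numpy as np

def spectrogram(signal, sample_rate, frame_len=256, hop=128):
    """Return (times, freqs, power) for a short-time power spectrum.

    frame_len and hop are illustrative defaults, not values from the text.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Cut the signal into overlapping frames and taper each one.
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # One spectrum per frame: rows are time, columns are frequency.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    # Time stamp of each frame, taken at the frame centre.
    times = (np.arange(n_frames) * hop + frame_len / 2) / sample_rate
    return times, freqs, power

# Synthetic input (an assumption for the example): one second of two tones.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 500 * t) + 0.5 * np.sin(2 * np.pi * 1500 * t)

times, freqs, power = spectrogram(signal, sample_rate)

# Frequency bin holding the most energy, averaged over all frames.
peak_freq = freqs[power.mean(axis=0).argmax()]
```

Plotting `power` on a logarithmic scale against `times` and `freqs` would show two horizontal bands, one per tone. In a spectrogram of real speech, the formants of a voiced phoneme appear as similar bands that shift over time.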
Below are the spectrographic characteristics of the six categories of
manner of articulation listed in Table 2.1.
- Plosive - involves an explosive burst of acoustic energy
following a short period of silence, because the vocal tract is
completely blocked just before the sound is produced.
- Nasal - has much less energy than any of the other voiced sounds because
the oral tract is completely blocked, and sound waves radiate
principally from the nose.
- Fricative - shows energy in high-frequency regions that is more
randomly distributed than the energy of voicing, although the voiced
fricatives may also show a weak voicing component.
- Affricate - appears as one or more thin bars to the left of the
large rectangle of frication.
- Tap or Flap - shows formants in low-frequency regions because
the vocal tract is blocked at the roof of the mouth.
- Approximant - has formants which are less pronounced than those
of vowels, because of a slight obstruction placed somewhere along the
vocal tract.
A spectrogram makes visible some differences that cannot be seen in
the waveform. Spectrograms also show a number of features that reflect
a speaker's individual speech habits and are not language dependent.
Jo Chul-Ho
Wed Oct 13 17:59:27 JST 1999