Next: 7 Spectral analysis: the Up: 6 Signal processing in Previous: Convolution: filtering

PSOLA: frequency and phase modulation

If someone speaks `monotonously', we have the impression that the voice has no recognisable `melody', that is, that the pitch of the voice does not change. In normal natural speech this is not the case: the perceived pitch of the voice (which relates closely to the fundamental frequency of the signal, F0) changes. Fundamental frequency changes (generally not perceivable as pitch changes) are also caused by properties of speech production: obstruents cause rapid (and sometimes very large) variations in F0 (sometimes known as pitch perturbation), and vowels have been shown to be associated with different relative pitch changes in careful speech. In addition, languages can be assigned to different typological categories, depending on the linguistic function of pitch in these languages:

  1. Pitch accent languages. Pitch accents are peak-shaped departures of pitch from a general melodic line, with a duration of approximately the length of one syllable, which occur regularly on specific syllables in words, and may distinguish between words (Swedish, Japanese) as `distinctive features'.
  2. Tone languages. Pitch patterns (pitch contours or sequences of pitch height values) may distinguish between words in a number of ways: lexically (as `distinctive features') or morphologically (with specific meanings in inflection, derivation and compounding).
  3. Intonation languages. Pitch patterns are associated with sequences of words (phrases, sentences, turns in dialogue, depending on speaking style), and function in structuring dialogue and conveying speaker intentions.

The perceived variation in speech melody or pitch relates to changes in the fundamental frequency of the signal, i.e. to frequency modulation of the signal by another signal. In the context of speech production, the source signal is modulated in frequency.

Frequency modulation (or phase modulation, which is closely related) is achieved by adding some proportion of the amplitude of the modulation signal to the frequency or the phase component of the carrier signal. In standard signal processing contexts, the carrier signal is a sinusoid; this is not the case with speech signals, however, which approximate to sawtooth waveforms. The modulated signal in Figure 20 was calculated with a phase modulation operation.

Figure: Sine wave (high frequency).

Figure: Sine wave (low frequency).

Figure 20: Frequency (or phase) modulated signal.
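
The phase modulation operation described above can be sketched in a few lines of Python. This is a minimal illustration and not the procedure used to produce the figures: the function name `phase_modulate`, the frequencies, the sampling rate and the modulation index are all assumptions chosen for the example. A high-frequency sine wave (the carrier) has a proportion of a low-frequency sine wave (the modulation signal) added to its phase, i.e. y(t) = sin(2 pi fc t + index * sin(2 pi fm t)).

```python
import math

def phase_modulate(carrier_hz, modulator_hz, index,
                   sample_rate=8000, duration=0.05):
    """Return samples of sin(2*pi*fc*t + index * sin(2*pi*fm*t)).

    All parameter names and default values are illustrative assumptions.
    """
    n_samples = round(sample_rate * duration)
    samples = []
    for i in range(n_samples):
        t = i / sample_rate
        # Low-frequency modulation signal, amplitude in [-1, 1].
        modulator = math.sin(2 * math.pi * modulator_hz * t)
        # A proportion (index) of the modulator is added to the carrier phase.
        phase = 2 * math.pi * carrier_hz * t + index * modulator
        samples.append(math.sin(phase))
    return samples

# Hypothetical values: 400 Hz carrier, 40 Hz modulation signal.
signal = phase_modulate(carrier_hz=400.0, modulator_hz=40.0, index=5.0)
```

With `index=0.0` the function returns the unmodulated carrier; larger values of `index` produce stronger variation in the instantaneous frequency.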

PSOLA (Pitch-Synchronous Overlap-Add) is a modulation operation which allows the signal to be re-modulated, with certain restrictions, by a different fundamental frequency.
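
The overlap-add idea behind PSOLA can be illustrated with a rough time-domain sketch. It makes strong simplifying assumptions that are not taken from the text: the pitch marks are already known (at least two of them), each grain is a Hann-windowed stretch of two pitch periods centred on a mark, and the new fundamental frequency is constant. The function name `psola_resynthesise` and the toy sawtooth input are hypothetical, and a practical implementation would also need pitch-mark detection and special handling of unvoiced stretches.

```python
import math

def psola_resynthesise(signal, pitch_marks, new_period):
    """Overlap-add Hann-windowed two-period grains at a new mark spacing.

    signal      : list of float samples
    pitch_marks : ascending sample indices, one per pitch period
                  (assumed to be known; at least two are required)
    new_period  : desired spacing of the synthesis marks, in samples
    """
    out = [0.0] * len(signal)
    weight = [0.0] * len(signal)      # summed window values, for normalisation
    target = pitch_marks[0]           # first synthesis mark
    while target < len(signal):
        # Use the analysis mark closest to the current synthesis mark.
        k = min(range(len(pitch_marks)),
                key=lambda j: abs(pitch_marks[j] - target))
        mark = pitch_marks[k]
        # Local period: distance to the next mark (or the previous one at the end).
        if k + 1 < len(pitch_marks):
            period = pitch_marks[k + 1] - mark
        else:
            period = mark - pitch_marks[k - 1]
        # Copy a two-period grain centred on the mark to the target position.
        for i in range(-period, period):
            src, dst = mark + i, target + i
            if 0 <= src < len(signal) and 0 <= dst < len(out):
                w = 0.5 + 0.5 * math.cos(math.pi * i / period)  # Hann window
                out[dst] += w * signal[src]
                weight[dst] += w
        target += new_period
    # Divide by the summed windows so that overlapping grains keep the amplitude.
    return [o / w if w > 1e-9 else 0.0 for o, w in zip(out, weight)]

# Toy input: a sawtooth with a period of 80 samples (100 Hz at 8000 Hz sampling).
period = 80
source = [(n % period) / 40 - 1 for n in range(800)]
marks = list(range(0, 800, period))

same_pitch = psola_resynthesise(source, marks, new_period=80)
octave_up = psola_resynthesise(source, marks, new_period=40)
```

When `new_period` equals the original period the grains are laid back where they came from and the input is reproduced; with `new_period=40` the output repeats every 40 samples away from the edges, i.e. the fundamental frequency is doubled. The "certain restrictions" show up even in this toy case: the grains must overlap sensibly, so the new period cannot differ arbitrarily from the original one.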



Dafydd Gibbon
Tue May 7 11:44:01 MET DST 1996