
Text-to-speech components

The term `Text-to-Speech (TTS)' applies to a speech synthesis system (speech synthesiser) whose input is a stream of characters constituting a text and whose output is a speech signal.

The term `TTS' is also used more narrowly for the first of the four main components of a TTS synthesiser:

  1. TTS component: Conversion of text and punctuation, with or without additional structural or functional markup, into a phonological representation, generally a string of phonemes associated with prosodic information about length and pitch and possibly also prosodic structure.
  2. Synthesis component: Conversion of the phonological representation into a speech signal representation, for example a digital waveform.
  3. Digital-analog converter (DAC): Conversion of the speech signal representation into an analog signal such as a continuously varying voltage.
  4. Acoustic transducer: Conversion of the analog signal into an acoustic signal via an amplifier and a loudspeaker, headphones, or a similar output device.

The term `TTS' will be used in the following discussion to refer to the TTS component, not to the complete TTS synthesiser.
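
The division of labour between the first two components can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the `Phone` record, the one-word `TOY_LEXICON`, and the function names are hypothetical, and the sine-tone generator merely stands in for a real synthesis method. Components 3 and 4 are hardware and are not modelled.

```python
import math
from dataclasses import dataclass


@dataclass
class Phone:
    """One phoneme plus prosodic information (hypothetical record layout)."""
    symbol: str        # phoneme label
    duration_ms: int   # prosodic length
    pitch_hz: float    # pitch target; 0.0 marks an unvoiced segment


# Hypothetical one-word lexicon; a real TTS component would use
# pronunciation dictionaries, letter-to-sound rules and prosody models.
TOY_LEXICON = {
    "hi": [Phone("h", 80, 0.0), Phone("aI", 200, 120.0)],
}


def tts_component(text):
    """Component 1: text and punctuation -> phonological representation."""
    phones = []
    for word in text.lower().split():
        word = word.strip(".,!?;:")
        # Words missing from the toy lexicon are silently skipped.
        phones.extend(TOY_LEXICON.get(word, []))
    return phones


def synthesis_component(phones, sample_rate=16000):
    """Component 2: phonological representation -> digital waveform.

    Stand-in only: each voiced phone becomes a sine tone at its pitch,
    each unvoiced phone becomes silence.
    """
    samples = []
    for phone in phones:
        n_samples = sample_rate * phone.duration_ms // 1000
        for i in range(n_samples):
            if phone.pitch_hz > 0:
                angle = 2 * math.pi * phone.pitch_hz * i / sample_rate
                samples.append(math.sin(angle))
            else:
                samples.append(0.0)
    return samples


phones = tts_component("Hi.")
waveform = synthesis_component(phones)
```

The list returned by `tts_component` is the interface between the two components: everything before it deals with text, everything after it deals with signals.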





Dafydd Gibbon, Sat Oct 17 18:27:56 CEST 1998