For further processing in the post-recording phase, the speech signal
was transferred from the DAT recorder to a PC (Win95 and Linux).
The original intention was to use a ZA2 digital input card, but since
this was not yet available, analog transfer was used.
Noise and distortion appeared to be minimal.
The signal was initially divided into one file per block,
and intervening material was zeroed and shortened.
The software used was CoolEdit 96 (registered full functionality version).
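The block division described above was done by hand in an audio editor. Purely as an illustration, the same idea can be sketched in Python. The function name `split_into_blocks`, the `pad` parameter and the representation of the signal as a list of integer samples are assumptions made for this example and are not part of the original procedure.

```python
from typing import List, Sequence, Tuple


def split_into_blocks(samples: Sequence[int],
                      blocks: List[Tuple[int, int]],
                      pad: int = 2) -> List[List[int]]:
    """Cut one long recording into one sample list per block.

    `blocks` holds (start, end) sample indices, end exclusive.
    Whatever lay between the blocks is dropped and replaced by
    `pad` zero samples on each side, i.e. the intervening material
    is zeroed and shortened to a fixed length (an assumption).
    """
    result = []
    for start, end in blocks:
        silence = [0] * pad
        result.append(silence + list(samples[start:end]) + silence)
    return result


if __name__ == "__main__":
    signal = [5, 1, 2, 9, 9, 3, 4, 7]
    for block in split_into_blocks(signal, [(1, 3), (5, 7)], pad=1):
        print(block)
```

Each returned list would then be written to its own file, one file per block.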
For diphone database construction, the post-recording phase
is divided into the following sub-phases:
- Transcription: since the texts were read aloud, the original material was converted with grapheme-phoneme conversion techniques (manual, semi-automatic).
- Segmentation: phone boundaries were marked.
- Labelling / annotation: phones were labelled with machine-readable SAMPA symbols.
- Diphone division, extraction, database construction: the phone 'centre' points are marked and the diphones extracted (max. length 500 msec per diphone) with a script designed to feed into the MBROLA acquisition format.
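The diphone division step can be illustrated with a minimal Python sketch. It is not the script mentioned above. It assumes that the phone 'centre' is the temporal midpoint of the labelled phone, that times are given in whole milliseconds, and that over-long diphones are simply skipped. The names `Phone`, `Diphone` and `cut_diphones` are hypothetical, and the output is not in the MBROLA acquisition format.

```python
from typing import List, NamedTuple

MAX_DIPHONE_MS = 500  # maximum diphone length given in the notes


class Phone(NamedTuple):
    label: str   # SAMPA symbol
    start: int   # start time in ms
    end: int     # end time in ms


class Diphone(NamedTuple):
    name: str      # e.g. "d-a"
    start: int     # centre of the left phone, ms
    boundary: int  # phone boundary inside the diphone, ms
    end: int       # centre of the right phone, ms


def centre(phone: Phone) -> int:
    # Assumption: the centre point is the temporal midpoint.
    return (phone.start + phone.end) // 2


def cut_diphones(phones: List[Phone],
                 max_ms: int = MAX_DIPHONE_MS) -> List[Diphone]:
    """Pair adjacent phones and cut from centre to centre."""
    result = []
    for left, right in zip(phones, phones[1:]):
        start, end = centre(left), centre(right)
        if end - start > max_ms:
            continue  # longer than the limit: skipped in this sketch
        result.append(Diphone(f"{left.label}-{right.label}",
                              start, left.end, end))
    return result


if __name__ == "__main__":
    labelled = [Phone("_", 0, 200), Phone("d", 200, 300),
                Phone("a", 300, 1500)]
    for d in cut_diphones(labelled):
        print(d)
```

In this example the pair "d-a" would span 650 msec from centre to centre and is therefore dropped. A real script might instead shorten the diphone or flag it for manual checking.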
Dafydd Gibbon, Mon Dec 21 10:23:16 CET 1998