Orthographic transcription is used for large-scale spoken language corpus work.
Computationally oriented projects pose the following requirements:
- Orthography is lexical; canonical lexical forms are preferred.
- Spelling and standard pronunciation are related (in pronunciation tables or by grapheme-phoneme rules).
- Non-standard vocabulary items (noises, hesitation phenomena, fragmentation) need extensions.
- Prosody needs extensions.
- Non-standard pronunciation needs extensions (e.g. comments).
- Uncertain identification needs extensions (e.g. comments, comment marks).
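The requirements above can be sketched in code. The following is a minimal illustration only, under stated assumptions: the markup (`<...>` for noises, a trailing `-` for fragments, `{...}` for comments, a trailing `?` for uncertain identification), the hesitation list and the tiny pronunciation table with SAMPA-style strings are all hypothetical and are not taken from any existing transcription convention.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical pronunciation table: canonical orthographic form -> standard
# pronunciation (SAMPA-style strings, for illustration only).
PRONUNCIATION_TABLE = {
    "i": "aI",
    "am": "{m",
    "going": "g@UIN",
    "to": "tu:",
    "leave": "li:v",
}

# Hypothetical closed list of hesitation forms.
HESITATIONS = {"uh", "um", "er"}


@dataclass
class Token:
    kind: str                             # "word", "noise", "hesitation" or "fragment"
    form: str                             # canonical (lower-cased) form or noise label
    canonical_pron: Optional[str] = None  # looked up in the table, words only
    comment: Optional[str] = None         # e.g. a remark on non-standard pronunciation
    uncertain: bool = False               # transcriber unsure of the identification


# Assumed markup: <noise>, fragment-, word{comment}, word?
TOKEN_RE = re.compile(
    r"<(?P<noise>[^>]+)>"
    r"|(?P<frag>[A-Za-z']+)-(?=\s|$)"
    r"|(?P<word>[A-Za-z']+)(?P<unc>\?)?(?:\{(?P<comment>[^}]*)\})?"
)


def parse(transcript: str) -> List[Token]:
    """Split an extended orthographic transcript into typed tokens."""
    tokens = []
    for m in TOKEN_RE.finditer(transcript):
        if m.group("noise"):
            tokens.append(Token("noise", m.group("noise")))
        elif m.group("frag"):
            tokens.append(Token("fragment", m.group("frag").lower()))
        else:
            form = m.group("word").lower()
            kind = "hesitation" if form in HESITATIONS else "word"
            tokens.append(
                Token(
                    kind,
                    form,
                    PRONUNCIATION_TABLE.get(form),
                    m.group("comment"),
                    bool(m.group("unc")),
                )
            )
    return tokens


if __name__ == "__main__":
    for tok in parse("I am <cough> uh lea- going{reduced} to leave?"):
        print(tok)
```

The word itself stays in canonical spelling, so it can still be looked up in the pronunciation table; anything non-canonical is carried in a separate field rather than in the spelling.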
For computation, 'modified orthography' is not suitable for representing non-canonical pronunciation (style, dialect, social class).
A formal mapping to computational conventions is needed for interpretative systems (CHILDES, HIAT, Selting ...).
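As a rough sketch of what such a formal mapping might look like, the example below converts a transcript with interpretative marks into explicit (tier, value) records. The marks `+`, `^` and `_` and the tier names are hypothetical; they are invented for this illustration and are not intended to reproduce the notation of CHILDES, HIAT or any other system.

```python
from typing import List, Tuple

# Hypothetical interpretative marks and their computational counterparts.
MARK_MAP = {
    "+": ("prosody", "pause"),
    "^": ("prosody", "rising_contour"),
    "_": ("prosody", "falling_contour"),
}


def to_records(transcript: str) -> List[Tuple[str, str]]:
    """Map whitespace-separated words and marks to (tier, value) records."""
    records = []
    for item in transcript.split():
        if item.isalpha():
            # Plain words go to the orthographic tier in lower case.
            records.append(("orthography", item.lower()))
        elif item in MARK_MAP:
            records.append(MARK_MAP[item])
        else:
            # Reject unknown marks so that the mapping stays explicit.
            raise ValueError(f"no mapping defined for mark: {item!r}")
    return records


if __name__ == "__main__":
    print(to_records("well + yes ^"))
```

Rejecting unmapped marks, rather than passing them through silently, is one possible design choice: it forces every interpretative symbol to receive a defined computational reading.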
Dafydd Gibbon
Wed May 22 10:39:25 MET DST 1996