Two epistemological domains:
- SIGNALS: recorded spoken data
- SYMBOLS: interpreted spoken data
The SIGNAL-SYMBOL barrier:
- HUMAN: categorial, interpretative perception, world knowledge
- MACHINE: stochastic segmentation, classification, top-down prediction
Pre-recording, recording and post-recording requirements:
- Corpus design: specification of speakers, scenario
- Corpus recording: studio quality
- Corpus processing: physical and linguistic characterisation:
  - Physical: signal properties, speaker characteristics
  - Linguistic: transcription, annotation, lexicon ...
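
The split between physical and linguistic characterisation could be pictured as a small data structure. The following is only a minimal sketch: the class names, field names and example values (`Recording`, `Annotation`, `rec001.wav`, `spk01`) are hypothetical and are not taken from any particular corpus standard.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Annotation:
    """A SYMBOL: an interpreted label aligned to a stretch of the signal."""
    start_s: float  # start time in seconds
    end_s: float    # end time in seconds
    label: str      # e.g. an orthographic word (hypothetical tier)


@dataclass
class Recording:
    """A SIGNAL together with its physical and linguistic characterisation."""
    signal_file: str     # physical: where the recorded signal is stored
    sample_rate_hz: int  # physical: signal property
    speaker_id: str      # physical: speaker characteristic (corpus design)
    scenario: str        # corpus design: recording scenario
    annotations: List[Annotation] = field(default_factory=list)  # linguistic

    def transcription(self) -> str:
        """Join the annotation labels in time order."""
        ordered = sorted(self.annotations, key=lambda a: a.start_s)
        return " ".join(a.label for a in ordered)


# Hypothetical example values, for illustration only.
rec = Recording("rec001.wav", 16000, "spk01", "read speech")
rec.annotations.append(Annotation(0.50, 0.90, "world"))
rec.annotations.append(Annotation(0.00, 0.45, "hello"))
print(rec.transcription())
```

Keeping the signal reference and the time-aligned labels in one record is one way to make the SIGNAL-SYMBOL link explicit; other layouts, such as separate annotation files per tier, are equally plausible.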
Dafydd Gibbon
Wed May 22 10:39:25 MET DST 1996