In addition to formal validation and ergonomic testing (Gibbon et al., 1997), which will take place at a later stage, the CoGesT annotation system ideally needs to be evaluated on two levels:
- inter-annotator consistency
- intra-annotator consistency
As for inter-annotator consistency, the annotations of different annotators can be compared automatically. Since segmentation is only approximately exact, a threshold defining the granularity of the comparison has to be specified.
The two variables involved are:
- segmentation agreement: two segmentations agree with regard to their number of segments if the number of segments is the same or deviates systematically. Based on a predefined threshold, two segments are considered to agree when the time stamps of their start and end points fall into the same interval;
- transcription agreement: two segments agree if their transcriptions match.
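The two agreement checks above can be sketched as follows. The segment representation as (start, end, transcription) tuples, the function names, and the pairwise alignment by position are illustrative assumptions, not part of the CoGesT specification:

```python
# Sketch of the inter-annotator agreement checks. The segment
# representation (start, end, transcription) is a hypothetical
# simplification of an annotation tier.

def segments_agree(seg_a, seg_b, threshold):
    """Two segments agree when the time stamps of their start and
    end points lie within the given threshold of each other."""
    (start_a, end_a, _), (start_b, end_b, _) = seg_a, seg_b
    return (abs(start_a - start_b) <= threshold
            and abs(end_a - end_b) <= threshold)

def transcriptions_agree(seg_a, seg_b):
    """Two segments agree on transcription if their transcriptions match."""
    return seg_a[2] == seg_b[2]

def compare_annotations(tier_a, tier_b, threshold):
    """Pair up the segments of two annotators position by position and
    count segmentation agreement and, within it, transcription agreement."""
    seg_matches = 0
    trans_matches = 0
    for a, b in zip(tier_a, tier_b):
        if segments_agree(a, b, threshold):
            seg_matches += 1
            if transcriptions_agree(a, b):
                trans_matches += 1
    return seg_matches, trans_matches
```

A fuller implementation would also handle segmentations of different length, e.g. by aligning segments on overlapping time spans rather than by position.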
Manual annotations tend to deviate inconsistently from a systematic annotation. To evaluate intra-annotator consistency (the consistency of annotations of the same set of gestures by a single annotator), the following experiment needs to be conducted:
- a number of single gesture files are generated;
- these single gesture files are presented to the annotator randomly;
- the annotator transcribes every presented gesture;
- the annotations of the same gesture are compared among each other.
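The experiment above can be sketched as a small harness. The function names, the `transcribe` stand-in for the human annotator, and the all-transcriptions-identical consistency measure are illustrative assumptions only:

```python
# Hypothetical sketch of the intra-annotator experiment: each gesture
# is presented several times in random order, and the repeated
# transcriptions of the same gesture are then compared.

import random
from collections import defaultdict

def run_experiment(gesture_ids, repetitions, transcribe):
    """Present every gesture `repetitions` times in random order;
    `transcribe` stands in for the human annotator."""
    trials = list(gesture_ids) * repetitions
    random.shuffle(trials)
    results = defaultdict(list)
    for gid in trials:
        results[gid].append(transcribe(gid))
    return results

def consistency(results):
    """Fraction of gestures whose repeated transcriptions all match."""
    consistent = sum(1 for t in results.values() if len(set(t)) == 1)
    return consistent / len(results)
```

In practice the comparison of repeated transcriptions would use the same threshold-based segmentation and transcription agreement as in the inter-annotator case, rather than exact string identity.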
So far, only a basic inter-annotator evaluation of the CoGesT system has been carried out. A complete evaluation as described above is in preparation and will be published in a separate document.
Thorsten Trippel
2003-08-12