The following sections present a preliminary analysis of the CoGesT 1.0 transcription system using the following criteria:
An encouraging result of preliminary evaluations of the CoGesT 1.0 conventions as applied in actual annotation work is that vector structure is in general used consistently. However, certain inconsistencies occur, indicating that optimisation for both ergonomic and computational reasons is required.
In particular, inconsistencies of length (due to optionalities in CoGesT 1.0) are found. Most conspicuously, the static 3-place vectors contrast with the dynamic 11-place vectors, and other lengths also occur in practice. As already noted, the vector position denoting gesture direction constitutes a 1-place, 2-place or 3-place subvector with the separator ``/''.
Two kinds of inconsistency are found in the practical use of this subvector:
In two cases, feature values are not concerned with properties of gestures but with whole gestures.
For example the complex gesture represented as the vector pair with 11 elements in each element of the pair
Without this iteration specification, therefore, the structure of the gesture vector can be simplified still further to
In two cases, default assumptions are used in order to shorten the vector:
The first of these cases, the distinction between static and dynamic gestures, is the main case. The static gesture vectors contain 3 positions:
Dynamic gesture vectors contain 11 positions:
Of these 11 positions, 2 are shared with static gestures (as source), and 8 are concerned with movement trajectory (6 positions) and target (2 positions), while the last is concerned with relations between paired gestures.
In the case of the static gesture, the only two positions required correspond to the source positions of the dynamic gesture. The third is the position concerned with relations between paired gestures.
However, the source position can be interpreted quite naturally as identical with the target position in the case of static holds or postures. Consequently, the trajectory properties can be specified as null. This analysis suggests that, in the interests of formal simplification, the static gesture vector can be spelled out into a vector of the same length as the dynamic vector.
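The spelling-out step can be illustrated with a minimal Python sketch. The position layout used here (2 source positions, then 6 trajectory positions, then 2 target positions, then the pair relation) is an assumption made for illustration, as is the function name `expand_static`; the actual ordering of positions in CoGesT 1.0 may differ.

```python
from typing import List, Optional

# Assumed layout of the 11-place dynamic vector (illustrative only):
# 2 source positions, 6 trajectory positions, 2 target positions,
# and 1 position for the relation between paired gestures.
SOURCE_LEN = 2
TRAJECTORY_LEN = 6


def expand_static(static_vector: List[str]) -> List[Optional[str]]:
    """Spell out a 3-place static vector as an 11-place vector."""
    if len(static_vector) != 3:
        raise ValueError("a static gesture vector has exactly 3 positions")
    source = list(static_vector[:SOURCE_LEN])
    pair_relation = static_vector[2]
    trajectory = [None] * TRAJECTORY_LEN  # no movement: trajectory is null
    target = list(source)                 # a hold ends where it starts
    return source + trajectory + target + [pair_relation]
```

With this expansion, static and dynamic gestures can be processed by the same code, since every vector has the same length.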
In the case of the variants of the subvector for trajectory direction, all variants in ordering and length can be spelled out into a full direction subvector of length 3, and a normalised order of elements can be defined.
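A normalisation of this kind might look as follows. This is only a sketch: the direction symbols (`up`, `do`, `le`, `ri`, `fo`, `ba`) and the axis order (vertical, lateral, sagittal) are hypothetical placeholders, not the CoGesT 1.0 inventory, and `normalise_direction` is an illustrative name.

```python
from typing import Optional, Tuple

# Hypothetical direction symbols, grouped by axis. The order of the
# groups defines the normalised order of the full subvector.
AXES = (
    ("up", "do"),  # vertical
    ("le", "ri"),  # lateral
    ("fo", "ba"),  # sagittal
)


def normalise_direction(subvector: str) -> Tuple[Optional[str], ...]:
    """Expand a 1-, 2- or 3-place '/'-separated direction subvector
    into a 3-place tuple in normalised axis order."""
    parts = [p for p in subvector.split("/") if p]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"expected 1 to 3 elements in {subvector!r}")
    slots = [None, None, None]
    for part in parts:
        for index, symbols in enumerate(AXES):
            if part in symbols:
                if slots[index] is not None:
                    raise ValueError(f"two values for one axis in {subvector!r}")
                slots[index] = part
                break
        else:
            raise ValueError(f"unknown direction symbol {part!r}")
    return tuple(slots)
```

Unspecified axes are filled with `None`, so that variants such as `ri/up` and `up/ri` map to the same normalised subvector.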
It is important to note that the trajectory subvector specifies movements, i.e. position translation functions, each of which effectively maps one value of a conventional 3-position spatial vector on to another (or possibly the same) value of another (or possibly the same) spatial vector.
Figure 1 shows the informal grouping proposal for a revised CoGesT 1.0 vector. The three-dot sequences signify ``further, possibly identical subtree sequences'', of arbitrary length in the case of Compound Gestures, of length 1 in the case of Gesture Pairs. The connecting lines in the tree signify ``part-whole relations'' between mother node and daughter nodes, as well as a left-right relation between daughter nodes. In the case of static gestures, the Route node is omitted. The trajectory subvector may be replaced by a subvector of iterated microgestures.
The flat CoGesT 1.0 11-place vector for dynamic gestures, or the 9-place
simplified vector proposed here, contains elements which are related to
different degrees and which can be grouped into a hierarchy of
subvectors according to these degrees of relatedness.
For example, the elements of the trajectory:
The grouping is visualised in Figure 1.
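As a rough illustration of such a grouping, the following Python sketch splits a flat 11-place vector into named subvectors. The slice boundaries assume the order source, trajectory, target, pair relation, and the names `GroupedGesture` and `group_flat_vector` are hypothetical; the node labels only loosely follow those mentioned for Figure 1.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class GroupedGesture:
    source: Tuple[Optional[str], ...]  # 2 positions
    route: Tuple[Optional[str], ...]   # 6 trajectory positions
    target: Tuple[Optional[str], ...]  # 2 positions
    pair_relation: Optional[str]       # relation between paired gestures


def group_flat_vector(flat: Sequence[Optional[str]]) -> GroupedGesture:
    """Group a flat 11-place vector into a hierarchy of subvectors."""
    if len(flat) != 11:
        raise ValueError("expected an 11-place vector")
    return GroupedGesture(
        source=tuple(flat[0:2]),
        route=tuple(flat[2:8]),
        target=tuple(flat[8:10]),
        pair_relation=flat[10],
    )
```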
The restructured vector for the first half is shown as follows:
One case of redundancy has already been dealt with: the right and left designations of objects in the CoGesT 1.0 vector are redundant if the gesture pair is always represented.
Likewise, the representation of both source and target coordinates and a direction vector is redundant, as the direction vector can be calculated from the source and target coordinates. This can be shown by regarding the direction coordinates as functions mapping the Source position to the Target position. In a 3-dimensional model, the Source and Target positions would each be modelled by a 3-place spatial coordinate vector. The relation between a position vector and a direction vector can then be described as follows:
Since the function can be reconstructed from the extensional specification in terms of Source and Target vectors, it is, strictly speaking, redundant. Alternatively, given the Source and Direction information, the Target information is redundant. Nevertheless, the redundancy may be desirable, and the smaller amount of absolute spatial detail provided by the Direction vector alone may be sufficient to characterise a gesture if a degree of fuzziness can be tolerated. There is currently one restriction here, however: the spatial model underlying CoGesT 1.0 is 2-dimensional, and front-back distance from the body in the sagittal plane is not yet included. Specifically, the source and target coordinates in CoGesT 1.0 do not contain values for the sagittal (front-back) dimension and therefore, strictly speaking, there is no redundancy in CoGesT in respect of this dimension.
However, the sagittal dimension is required on independent grounds, in order to be able to cope with a scale which ranges from touching to various distances from reference points on the torso. This distance dimension is, incidentally, also required for arbitrary limb pairs involving, for example, hand-clapping, hands resting on the lap, etc., versus non-contact gestures involving two limbs.
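The redundancy between Source, Target and Direction information discussed above can be made concrete with a small Python sketch. It assumes, purely for illustration, that positions are encoded as integer coordinate tuples; CoGesT 1.0 itself uses symbolic location values, and the function names are hypothetical.

```python
from typing import Tuple

Vector = Tuple[int, ...]


def direction(source: Vector, target: Vector) -> Vector:
    """Derive the direction vector from source and target positions."""
    if len(source) != len(target):
        raise ValueError("source and target must have the same dimension")
    return tuple(t - s for s, t in zip(source, target))


def apply_direction(source: Vector, direction_vector: Vector) -> Vector:
    """Derive the target position from source position and direction."""
    if len(source) != len(direction_vector):
        raise ValueError("source and direction must have the same dimension")
    return tuple(s + d for s, d in zip(source, direction_vector))
```

Under this encoding, any two of the three pieces of information determine the third, and a static hold corresponds to a zero direction vector. The functions work for any number of dimensions, so the same sketch covers both the 2-dimensional model and a model extended by the sagittal dimension.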
Thorsten Trippel 2003-06-30