2.2 Preliminary formal analysis of CoGesT 1.0

2.2.1 Criteria for formal reconstruction

The following sections present a preliminary analysis of the CoGesT 1.0 transcription system according to these criteria:

  1. consistency of vector structure,
  2. homogeneity of vector value types,
  3. spelling out defaults,
  4. grouping of related vector values into a hierarchical structure,
  5. analysis of redundancies,
  6. compositional simplicity.

2.2.2 Consistency of vector structure

An encouraging result of preliminary evaluations of the CoGesT 1.0 conventions as applied in actual annotation work is that vector structure is, in general, used consistently. However, certain inconsistencies occur, indicating that optimisation is required for both ergonomic and computational reasons.

In particular, inconsistencies of length (due to optionalities in CoGesT 1.0) are found. Most conspicuously, the static 3-place vectors contrast with the dynamic 11-place vectors, and other lengths also occur in practice. As already noted, the vector position denoting gesture direction constitutes a 1-place, 2-place or 3-place subvector with the separator ``/''.

Two kinds of inconsistency are found in the practical use of this subvector:

  1. Inconsistent length of the vector, with examples of lengths 1, 2 and 3: ri, up/le, ba/le/do.
  2. Inconsistent ordering of the vector elements: up/le, le/up.

2.2.3 Homogeneity of vector value types

In two cases, feature values are not concerned with properties of gestures but with whole gestures.

  1. The final position, with the values sy, pa, rp, lp, does not contain values of the same type as the other positions:

    1. The values rp, lp designate the body members which have the properties described by the other values in the vector.
    2. Since the ``;'' operator is also used to make this distinction, the symbols are in principle redundant (though redundancy may be heuristically useful, of course) and the simplex vector sizes can be reduced to 2 or 10 respectively, if the left and right hand gestures are represented explicitly with the ``;'' operator.
    3. The values sy, pa designate entire gesture vectors pertaining to the second limb of a pair where the trajectory has the same description but is mirrored (sy) or is the same but spatially offset (``parallel'') in some direction (pa).
    4. Effectively, the elements sy and pa are abbreviation conventions for concurrent gestures with left and right body member objects (generally hands), which is otherwise expressed by ``;'', in the special case in which there is a simple relationship between the gestures: pa expresses identity (except for local offset) of the two gestures, and sy expresses mirroring of the trajectory direction. Consequently, the symbols sy and pa can be omitted, and the two gestures are fully spelled out, separated by ``;''. For the purpose of human transcription and annotation, the symbols can therefore be regarded as ``macros'' which abbreviate the reduplicated and possibly mirrored gesture.

    For example, the complex gesture represented as a vector pair with 11 elements in each element of the pair


    [12rr,5A,do/ba,ar,5A,m,r(0),me,15m,5A,rp;14m,5A,do,li,5A,s,r(0),me,15m,5A,lp]

    can be simplified to


    [12rr,5A,do/ba,ar,5A,m,r(0),me,15m,5A;14m,5A,do,li,5A,s,r(0),me,15m,5A]

    with 10 elements in each element of the pair.

  2. The values r(0), r(1), r(2) etc. refer to repetitions (iterations) of smaller component microgestures which themselves -- presumably -- have the same structure as the simplex gestures of which they are parts. These values treat repetitions as modifying properties, but they have internal structure. Microgestures do not affect the status of a gesture as a simplex gesture in the sense introduced above.

    Without this iteration specification, therefore, the structure of the gesture vector can be simplified still further to


    [12rr,5A,do/ba,ar,5A,m,me,15m,5A;14m,5A,do,li,5A,s,me,15m,5A]

    However, this step necessitates distinguishing between two hierarchical gesture ranks, the microgesture and what was referred to above as a simplex gesture, and allowing for iterated microgestures to be embedded as whole gestures into the trajectory position of simplex gesture vectors.
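The two simplification steps above (dropping the limb designations rp, lp and dropping the iteration values) can be sketched as a purely textual transformation. The following is a minimal sketch, not a CoGesT tool: the function name `simplify_pair` and the two tag definitions are assumptions made for illustration, and the sketch only removes the iteration values; it does not model the embedding of microgestures discussed above, nor the expansion of sy and pa.

```python
import re

# Assumed tag definitions for this sketch (not part of the CoGesT 1.0 specification).
LIMB_TAGS = {"rp", "lp"}                 # final-position limb designations
ITERATION = re.compile(r"^r\(\d+\)$")    # iteration values such as r(0), r(1)


def simplify_pair(pair: str) -> str:
    """Drop limb tags and iteration values from a ';'-separated gesture pair."""
    members = pair.strip()[1:-1].split(";")      # strip the outer brackets
    simplified = []
    for member in members:
        values = member.split(",")
        if values and values[-1] in LIMB_TAGS:
            values = values[:-1]                 # 11 -> 10 positions
        # Remove the iteration position: 10 -> 9 positions.
        values = [v for v in values if not ITERATION.match(v)]
        simplified.append(",".join(values))
    return "[" + ";".join(simplified) + "]"


original = ("[12rr,5A,do/ba,ar,5A,m,r(0),me,15m,5A,rp;"
            "14m,5A,do,li,5A,s,r(0),me,15m,5A,lp]")
print(simplify_pair(original))
```

Applied to the 11-place example pair, the sketch yields the 9-place pair given above.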

2.2.4 Spelling out defaults

In two cases, default assumptions are used in order to shorten the vector:

  1. the distinction between static gestures (holds and postures), with 3 positions in CoGesT 1.0, and dynamic gestures (sometimes known as gestures proper), with 11 positions in CoGesT 1.0 (and combinations of these in gesture pairs yielding 3, 6, 11, 14 or 22 place vectors);
  2. the distinction between the trajectory direction subvectors with 1, 2 or 3 positions, and transcriber inconsistencies leading to arbitrary orderings of these.

The first of these cases, the distinction between static and dynamic gestures, is the main case. The static gesture vectors contain 3 positions:

  1. Location of Source
  2. Handshape at Source
  3. Macro for paired gesture

Dynamic gesture vectors contain 11 positions:

  1. Location of Source
  2. Handshape at Source
  3. Direction
  4. Shape of trajectory
  5. Handshape during trajectory
  6. Size of trajectory
  7. Repetitions during trajectory
  8. Speed of trajectory
  9. Target location
  10. Target handshape
  11. Macro for paired gesture

Of these 11 positions, 2 are shared with static gestures (as source), and 8 are concerned with movement trajectory (6 positions) and target (2 positions), while the last is concerned with relations between paired gestures.
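The positional layout of the dynamic vector can be made explicit with named fields. The sketch below is illustrative only: the class name `DynamicGesture`, the field names and the function `parse_dynamic` are labels chosen here, not CoGesT 1.0 terminology.

```python
from typing import NamedTuple


class DynamicGesture(NamedTuple):
    """The 11 positions of a dynamic gesture vector (field names are assumptions)."""
    source_location: str
    source_handshape: str
    direction: str
    trajectory_shape: str
    trajectory_handshape: str
    trajectory_size: str
    repetitions: str
    speed: str
    target_location: str
    target_handshape: str
    paired_macro: str


def parse_dynamic(vector: str) -> DynamicGesture:
    """Split a comma-separated 11-place vector into named positions."""
    values = vector.split(",")
    if len(values) != 11:
        raise ValueError(f"expected 11 positions, got {len(values)}")
    return DynamicGesture(*values)


g = parse_dynamic("12rr,5A,do/ba,ar,5A,m,r(0),me,15m,5A,rp")
source, trajectory, target, macro = g[0:2], g[2:8], g[8:10], g[10]
print(len(source), len(trajectory), len(target), macro)
```

The slices reproduce the count given above: 2 source positions, 6 trajectory positions, 2 target positions and 1 position for the paired-gesture macro.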

In the case of the static gesture, the only two positions required correspond to the source positions of the dynamic gesture. The third is the position concerned with relations between paired gestures.

However, the source position can be interpreted quite naturally as identical with the target position in the case of static holds or postures. Consequently, the trajectory properties can be specified as null. This analysis suggests that, in the interests of formal simplification, the static gesture vector can be spelled out into a vector of the same length as the dynamic vector.
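Spelling out a static vector in this way might look as follows. This is a sketch under stated assumptions: CoGesT 1.0 does not define a null symbol, so the marker `-` is hypothetical, as are the function name `expand_static` and the static example vector, whose values are borrowed from the dynamic example.

```python
from __future__ import annotations

NULL = "-"  # hypothetical null marker; not defined by CoGesT 1.0


def expand_static(static: list[str]) -> list[str]:
    """Spell out a 3-place static vector as an 11-place vector."""
    if len(static) != 3:
        raise ValueError(f"expected 3 positions, got {len(static)}")
    location, handshape, macro = static
    trajectory = [NULL] * 6          # the six trajectory positions are null
    # The target is identical with the source for holds and postures.
    return [location, handshape, *trajectory, location, handshape, macro]


expanded = expand_static(["12rr", "5A", "rp"])   # illustrative values only
print(",".join(expanded))
```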

In the case of the variants of the subvector for trajectory direction, all variants in ordering and length can be spelled out into a full direction subvector of length 3, and a normalised order of elements can be defined.
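A normalisation of this kind can be sketched as follows. Several points are assumptions made for illustration: the slot order (lateral, sagittal, vertical), the null marker `-`, the function name `normalise_direction`, and the label `fo` for forward movement, which is not attested in the examples above; the labels le, ri, ba, up and do are taken from the examples.

```python
# Axis index for each direction label (0 = lateral, 1 = sagittal, 2 = vertical).
AXIS_OF = {
    "le": 0, "ri": 0,
    "ba": 1, "fo": 1,   # "fo" is a hypothetical label for forward movement
    "up": 2, "do": 2,
}
NULL = "-"              # hypothetical null marker


def normalise_direction(subvector: str) -> str:
    """Spell out a 1-, 2- or 3-place direction subvector in a fixed order."""
    slots = [NULL, NULL, NULL]
    for label in subvector.split("/"):
        axis = AXIS_OF[label]            # raises KeyError for unknown labels
        if slots[axis] != NULL:
            raise ValueError(f"two values on the same axis in {subvector!r}")
        slots[axis] = label
    return "/".join(slots)


for example in ["ri", "up/le", "le/up", "ba/le/do"]:
    print(example, "->", normalise_direction(example))
```

With this normalisation the variants up/le and le/up receive the same representation, so that both the length and the ordering inconsistencies disappear.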

It is important to note that the trajectory subvector specifies movements, i.e. position translation functions, each of which effectively maps one value of a conventional 3-position spatial vector on to another (or possibly the same) value of another (or possibly the same) spatial vector.

Figure 1: Informal grouping proposal for revised CoGesT 1.0 vector.

Figure 1 shows the informal grouping proposal for a revised CoGesT 1.0 vector. The three-dot sequences signify ``further, possibly identical subtree sequences'', of arbitrary length in the case of Compound Gestures, of length 1 in the case of Gesture Pairs. The connecting lines in the tree signify ``part-whole relations'' between mother node and daughter nodes, as well as a left-right relation between daughter nodes. In the case of static gestures, the Route node is omitted. The trajectory subvector may be replaced by a subvector of iterated microgestures.

2.2.5 Grouping of related vector values into a hierarchical structure

The flat CoGesT 1.0 11-place vector for dynamic gestures, or the 9-place simplified vector proposed here, contains elements which are related to different degrees and which can be grouped into a hierarchy of subvectors according to these degrees of relatedness. For example, the elements of the vector pair:

[12rr,5A,do/ba,ar,5A,m,me,15m,5A;14m,5A,do,li,5A,s,me,15m,5A]

can be grouped informally into a tentative hierarchy as follows:

Source: 12rr, 5A
Dynamic: (optional)
Trajectory:
  Directionality: do/ba
  Modification: ar, 5A, m, me
Target: 15m, 5A

The grouping is visualised in Figure 1. The restructured vector for the first half is shown as follows:

[[12rr,5A],[[do,ba],ar,5A,m,me],[15m,5A]]
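The regrouping of a flat 9-place vector into this nested form can be sketched with nested lists. The function name `group_simplex` and the local variable names are assumptions made for illustration.

```python
def group_simplex(flat):
    """Group a flat 9-place vector into [Source, Trajectory, Target]."""
    if len(flat) != 9:
        raise ValueError(f"expected 9 positions, got {len(flat)}")
    (src_loc, src_hs, direction, shape, hs,
     size, speed, tgt_loc, tgt_hs) = flat
    source = [src_loc, src_hs]
    # Directionality becomes its own subvector; the rest is Modification.
    trajectory = [direction.split("/"), shape, hs, size, speed]
    target = [tgt_loc, tgt_hs]
    return [source, trajectory, target]


first = group_simplex("12rr,5A,do/ba,ar,5A,m,me,15m,5A".split(","))
second = group_simplex("14m,5A,do,li,5A,s,me,15m,5A".split(","))
print(first)
print(second)
```

The first call reproduces the restructured vector shown above; the second applies the same grouping to the other half of the pair.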

2.2.6 Analysis of redundancies and gaps

One case of redundancy has already been dealt with: the right and left designations of objects in the CoGesT 1.0 vector are redundant if the gesture pair is always represented.

Likewise, the representation of both source and target coordinates and a direction vector is redundant, as the direction vector can be calculated from the source and target coordinates. This can be shown by regarding the direction coordinates as functions mapping the Source position to the Target position. In a 3-dimensional model, the Source and Target positions would each be modelled by a 3-place spatial coordinate vector. The relation between a position vector and a direction vector can then be described as follows:

  1. The first element of the Source vector is mapped to the first element of the Target vector by the first element of the Direction vector (e.g. left-right, horizontal-lateral).
  2. The second element of the Source vector is mapped to the second element of the Target vector by the second element of the Direction vector (e.g. front-back, horizontal-sagittal).
  3. The third element of the Source vector is mapped to the third element of the Target vector by the third element of the Direction vector (e.g. up-down, vertical).
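The three mappings above can be illustrated numerically. The sketch assumes numeric coordinates, which CoGesT 1.0 does not use (its locations are symbolic), and the sign conventions, the null marker `-`, the label `fo` and the function name `derive_direction` are likewise hypothetical.

```python
# (negative, positive) labels per axis, in the order lateral, sagittal, vertical.
# The sign conventions and the label "fo" are assumptions of this sketch.
AXIS_LABELS = (("le", "ri"), ("ba", "fo"), ("do", "up"))
NULL = "-"  # hypothetical marker for "no movement on this axis"


def derive_direction(source, target):
    """Reconstruct a 3-place direction vector from source and target coordinates."""
    direction = []
    for s, t, (negative, positive) in zip(source, target, AXIS_LABELS):
        delta = t - s
        if delta > 0:
            direction.append(positive)
        elif delta < 0:
            direction.append(negative)
        else:
            direction.append(NULL)
    return tuple(direction)


# Hypothetical coordinates: movement to the right and downwards.
print(derive_direction((0, 0, 2), (1, 0, 0)))
```

Only the sign of each coordinate difference is retained, which corresponds to the reduced spatial detail carried by a direction vector.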

Since the function can be reconstructed from the extensional specification in terms of Source and Target vectors, it is, strictly speaking, redundant. Alternatively, given the Source and Direction information, the Target information is redundant. Nevertheless, the redundancy may be desirable, and the smaller amount of absolute spatial detail provided by the Direction vector alone may be sufficient to characterise a gesture if a degree of fuzziness can be tolerated. There is currently one restriction here, however: the spatial model underlying CoGesT 1.0 is 2-dimensional, and front-back distance from the body in the sagittal plane is not yet included. Specifically, the source and target coordinates in CoGesT 1.0 do not contain values for the sagittal (front-back) dimension and therefore, strictly speaking, there is no redundancy in CoGesT in respect of this dimension.

However, the sagittal dimension is required on independent grounds, in order to be able to cope with a scale which ranges from touching to various distances from reference points on the torso. This distance dimension is, incidentally, also required for arbitrary limb pairs involving, for example, hand-clapping, hands resting on the lap, etc., versus non-contact gestures involving two limbs.

Thorsten Trippel 2003-06-30