Tillmann extended the three-component model to a five-component model (cf. []), which runs from the surface behaviour of the articulatory organs, through the acoustic configuration, the acoustic speech signal and acoustic events in the ear, to surface reactions of the auditory organs.
Figure 2: Tillmann's `signal phonetic tape'
The `signal phonetic tape' model can easily be extended to cover physiological events in the muscles, the nerves and the brain. However, the level of idealisation in this model needs to be reduced still further in order to account for the complex parallel events which characterise speech signals at all phases. In particular, a third dimension is required: each of the signal types in the signal phonetic tape is a stream of values in time, related to the others, point for point, by a system. The system is represented in the Figure by a downward diagonal arrow.
Table 1: The signal phonetic tape model projected on to the time dimension.
This diagram is perhaps better interpreted as a cascade of systems, in which the signal at each stage occurs with a slight time delay with respect to the preceding stage. Each system generates a stream of values; the set of parallel systems therefore generates a stream of vectors of parallel values.
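The cascade reading can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not part of the model itself: the names `delayed` and `cascade`, the simple gain functions standing in for the systems, and the one-sample delay per stage. Each stage transforms the signal of the preceding stage and outputs it with a lag; reading across all signals at one time point gives one vector of parallel values.

```python
from typing import Callable, List, Sequence, Tuple

Signal = List[float]
System = Callable[[Sequence[float]], Signal]


def delayed(transform: Callable[[float], float], delay: int, fill: float = 0.0) -> System:
    """Wrap a point-wise transform so that its output lags its input by `delay` samples."""
    def run(signal: Sequence[float]) -> Signal:
        out = [transform(x) for x in signal]
        # Pad the start with `fill` and truncate so all signals stay the same length.
        return ([fill] * delay + out)[:len(signal)]
    return run


def cascade(source: Sequence[float], stages: Sequence[System]) -> List[Tuple[float, ...]]:
    """Feed `source` through the stages in order and return one vector per time point."""
    signals: List[Signal] = [list(source)]
    for stage in stages:
        signals.append(stage(signals[-1]))
    # Transpose: a list of parallel signals becomes a stream of vectors.
    return [tuple(values) for values in zip(*signals)]


# Hypothetical two-stage cascade; the gains 2.0 and 0.5 are arbitrary.
source = [1.0, 2.0, 3.0, 4.0]
stages = [delayed(lambda x: 2.0 * x, 1), delayed(lambda x: 0.5 * x, 1)]
vectors = cascade(source, stages)
for t, vector in enumerate(vectors):
    print(t, vector)
```

In this toy run the third component only becomes non-zero at the third time point, which mirrors the accumulated delay along the cascade.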
This model provides a functional or operational explanation of the structure of speech signals.
Within the AA system, Fant distinguished between two major system types, the Source Systems and Filter Systems, in a model known as the Source-Filter Model of sound production. Further distinctions need to be made (for instance, the Source needs to be supplemented by a Frequency Modulator).
Neighbouring signals are related by a system, which transforms an input into an output.
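As a rough discrete-time illustration of the source-filter idea, the sketch below passes a periodic impulse train (standing in for the source) through a single two-pole resonator (standing in for the filter). This is a textbook simplification and not Fant's own formulation: the function names, the sample rate, the pulse period, and the resonance frequency and bandwidth are all assumed values, and no frequency modulation of the source is modelled.

```python
import math
from typing import List, Sequence


def impulse_train(n_samples: int, period: int) -> List[float]:
    """Source: a unit pulse every `period` samples, zero elsewhere."""
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]


def resonator(signal: Sequence[float], freq_hz: float,
              bandwidth_hz: float, sample_rate: float) -> List[float]:
    """Filter: two-pole resonator y[n] = x[n] + a1*y[n-1] + a2*y[n-2]."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius, below 1
    a1 = 2.0 * r * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    a2 = -r * r
    y1 = y2 = 0.0  # previous two output samples
    out: List[float] = []
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


SAMPLE_RATE = 8000.0                        # assumed sampling rate in Hz
source = impulse_train(160, period=80)      # pulses 80 samples apart: 100 Hz
output = resonator(source, freq_hz=500.0, bandwidth_hz=80.0,
                   sample_rate=SAMPLE_RATE)
print(output[:5])
```

The source fixes the repetition rate of the pulses, while the filter shapes each pulse into a decaying oscillation at the resonance frequency; changing one leaves the other untouched, which is the separation the model relies on.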
Figure 3: Inputs and outputs in a signal-system cascade.