The traditional word-level dyadic sign model
is no longer adequate for the range of dictionaries
required today in natural language processing and speech technologies,
or in the use of paper or electronic lexica.
The inadequacy rests on both descriptive and theoretical considerations.
A bilingual dictionary or translation dictionary
does not fit this model at all, since
in general word forms in one language
are matched with word forms in another,
and the issue of meaning is not directly addressed.
Specialised dictionaries, such as pronouncing dictionaries, also only
cover the form-form relation,
but between the two modalities of written and spoken language transmission
within one language, rather than between two languages.
Further, it is unclear how the relation between word forms
in classical word-based dictionaries, and phrasal forms
in idiom dictionaries is to
be described, and how the distinctions between literal meanings,
figurative meanings and frozen meanings are to be captured for lexical
units of different ranks (stems, words, phrases, texts, dialogues).
Perhaps most importantly, the classical type of dictionary (referred to
indiscriminately by the man in the street as
`the dictionary') contains many different types of lexical information,
from orthography and pronunciation through grammatical word class and
internal morphological structure, to canonical meaning, special uses,
synonyms and antonyms, and etymology (word history), the majority of
which are not covered by the dyadic sign model.
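The gap between the dyadic model and a classical dictionary entry can be illustrated with a minimal sketch. The field names, the example entry, and the choice of orthography as the `form' side of the dyadic sign are assumptions made for illustration only; they are not taken from any particular dictionary format.

```python
from dataclasses import dataclass, fields
from typing import Optional, Tuple


@dataclass
class ClassicalEntry:
    """Hypothetical record for the information types listed above."""
    orthography: str
    pronunciation: Optional[str] = None   # here in SAMPA notation
    word_class: Optional[str] = None
    morphology: Optional[str] = None
    meaning: Optional[str] = None
    special_uses: Tuple[str, ...] = ()
    synonyms: Tuple[str, ...] = ()
    antonyms: Tuple[str, ...] = ()
    etymology: Optional[str] = None


# Assumption: the dyadic sign pairs one form (orthography) with one meaning.
DYADIC_FIELDS = frozenset({"orthography", "meaning"})


def uncovered_by_dyadic_model(entry):
    """Return the filled-in fields that a form-meaning pair cannot hold."""
    return [
        f.name
        for f in fields(entry)
        if f.name not in DYADIC_FIELDS and getattr(entry, f.name)
    ]


entry = ClassicalEntry(
    orthography="dog",
    pronunciation="dQg",
    word_class="noun",
    meaning="domesticated canine",
    synonyms=("hound",),
    etymology="Old English docga",
)
uncovered = uncovered_by_dyadic_model(entry)
```

For this entry, four of the six filled-in fields fall outside the form-meaning pair, which is the descriptive inadequacy noted above.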
Current work on lexicalist theories in general and computational linguistics (cf. [Bouma, van Eynde & Flickinger (this volume)]) is based on more complex sign models. These models describe lexical items, their properties, and a wide range of compositional and interpretative relations between them. However, even these models do not capture the notion of non-word-level lexical signs (e.g. morphemes or stems below the word level, phrasal signs above the word level), and do not integrate the many levels of phonological interpretation (from phonemic to word, phrasal and discourse properties).
The `integrated lexicalist' (ILEX) sign model on which the present overview is based is a more ambitious generic approach, and relates lexical and non-lexical signs at different compositional ranks, each with their own surface and semantic interpretation. The compositional and interpretative dimensions of the ILEX model are outlined in Figure 1; the other lexicalisation and generalisation dimensions are discussed in the text. The model will be taken up for detailed discussion in later sections.

Figure 1: Sign model with compositional and interpretative dimensions.
In this generic model, a sign is embedded in the more-or-less well-defined world in which it is used, indicated by the surrounding dotted line. Both non-lexical and lexical signs have two kinds of interpretation with respect to this world; signs are in the general case complex, and interpretation is compositional, based on the two main structural properties of signs, category (and subcategory etc.), and parts:
There are two main kinds of composition: on the one hand, parts are ordered in a hierarchy of rank, and on the other, each rank has its own hierarchical constituent structuring principles.
In the present context, attention will be limited to the word rank; however, lexical signs of other ranks (e.g. morphemes, phrasal idioms, ritual texts, routine dialogues) also exist.
Figure 1 shows the main types of information in the ILEX sign model, i.e. information about the interpretation and composition of signs. Further types of lexical information may be represented in practical dictionaries; in etymological dictionaries, for example, meta-information reconstructing the historical development of the basic types of information is represented.
However, the ILEX model contains two important further dimensions of information about signs which need to be added to those outlined in Figure 1:
Lexicography is, first, concerned with lexicalised signs of all ranks and constituent types, not just words, and, second, these lexicalised signs are not simply bundles of idiosyncratic word properties but can be grouped into classes on the basis of generalisations about their similarities and differences. It is the four dimensions of composition, interpretation, lexicalisation, and generalisation which, taken together, characterise the Integrated Lexicalist (ILEX) model of language signs.
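As a rough illustration of how the four dimensions might sit together in a single data structure, the following sketch represents a sign as a record. It is not the formalisation used in this overview: the class name `Sign`, the rank inventory `RANKS`, the use of `category` as the hook for generalisation, and the example idiom are all assumptions.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumed rank inventory, loosely following the ranks mentioned in the text.
RANKS = ("morpheme", "stem", "word", "phrase", "text", "dialogue")


@dataclass(frozen=True)
class Sign:
    category: str                     # generalisation: class of the sign
    rank: str                         # composition: position in the rank hierarchy
    surface: str                      # interpretation: surface side
    semantics: str                    # interpretation: semantic side
    parts: Tuple["Sign", ...] = ()    # composition: constituent signs
    lexicalised: bool = False         # lexicalisation: stored as a lexical unit?

    def __post_init__(self):
        if self.rank not in RANKS:
            raise ValueError(f"unknown rank: {self.rank!r}")


def lexical_signs(sign):
    """Collect the lexicalised signs at any rank, depth-first."""
    found = [sign] if sign.lexicalised else []
    for part in sign.parts:
        found.extend(lexical_signs(part))
    return found


kick = Sign("verb", "word", "kick", "strike with the foot", lexicalised=True)
the = Sign("determiner", "word", "the", "definite", lexicalised=True)
bucket = Sign("noun", "word", "bucket", "container", lexicalised=True)
ball = Sign("noun", "word", "ball", "round object", lexicalised=True)

# A phrasal idiom is lexicalised at the phrase rank, with a frozen meaning.
idiom = Sign("verb phrase", "phrase", "kick the bucket", "die",
             parts=(kick, the, bucket), lexicalised=True)
# A freely composed phrase is a non-lexical sign built from lexical words.
free_phrase = Sign("verb phrase", "phrase", "kick the ball",
                   "strike the ball with the foot", parts=(kick, the, ball))

idiom_units = [s.surface for s in lexical_signs(idiom)]
free_units = [s.surface for s in lexical_signs(free_phrase)]
```

In this sketch the idiom contributes a lexical unit at the phrase rank in addition to its component words, whereas the freely composed phrase contributes only its words.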