Computerised lexicographic databases reflect many facets of traditional lexicography rather closely. The most common kinds of Database Management System (DBMS) are based on a relational model [Draxler (this volume)], and in their simplest form can be visualised as a matrix or table, in which the rows constitute the lexical entries, and the columns define the lexical microstructure for each entry; an example will be given later. A full relational database consists of a set of interlinked tables (relations) of this kind, often modelled by a so-called `entity-relationship diagram'. Neither paradigmatic nor syntagmatic generalisations are captured well by this kind of structure; any generalisations must be searched for on demand using classification algorithms. Object-oriented DBMS, with inheritance mechanisms related to those of DATR, and hybrid object-oriented/relational databases, are likely to supersede relational databases in time.
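The table view described above can be sketched as follows. This is only an illustration under stated assumptions: the table name `lexicon`, the four columns, and the three sample entries (with invented SAMPA-like transcriptions) are hypothetical and are not taken from any particular lexicographic database.

```python
import sqlite3

# Hypothetical microstructure: table name, columns and entries are
# illustrative assumptions, not a real lexicon.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lexicon ("
    " orthography TEXT PRIMARY KEY,"   # one row per lexical entry
    " pronunciation TEXT,"             # each column is one type of lexical information
    " pos TEXT,"
    " gloss TEXT)"
)
entries = [
    ("cat", "k{t", "NOUN", "small domesticated feline"),
    ("dog", "dQg", "NOUN", "domesticated canine"),
    ("run", "rVn", "VERB", "move quickly on foot"),
]
conn.executemany("INSERT INTO lexicon VALUES (?, ?, ?, ?)", entries)

# A generalisation such as "all nouns" is not stored in the table;
# it has to be computed on demand by a query.
nouns = [
    row[0]
    for row in conn.execute(
        "SELECT orthography FROM lexicon WHERE pos = ? ORDER BY orthography",
        ("NOUN",),
    )
]
print(nouns)
```

The query at the end shows the point made in the text: the class of nouns exists only as the result of a search over the rows, not as an object in the database.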
In practice, the best-known and most widely accepted modern form of lexicographic database representation is lexical text markup, often in SGML (Standard Generalised Markup Language), in which labelled bracketings indicate the microstructure and certain aspects of the macrostructure. For further information about these aspects of current lexicography see [Leech, Myers & Thomas 1995]. It must be said, however, that SGML suffers from the same deficiencies as traditional lexica and basic lexicographic databases:
For many reasons, its relative simplicity among them, the use of SGML as a representation language for traditionally structured lexicons is increasing. An important factor favouring the spread of SGML (and of derivatives such as XML) is that it is the specification language for the document types used on the World Wide Web, particularly HTML.
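As a rough illustration of labelled bracketing, the sketch below marks up a single entry and reads its microstructure back out. It uses XML rather than full SGML, because the Python standard library provides an XML parser; the element names (`entry`, `orth`, `pron`, `pos`, `def`) are hypothetical and do not follow any particular document type definition.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: element names are illustrative assumptions,
# not those of a published DTD.
ENTRY = """
<entry>
  <orth>cat</orth>
  <pron>k{t</pron>
  <pos>NOUN</pos>
  <def>small domesticated feline</def>
</entry>
"""

def microstructure(xml_text):
    """Map each labelled bracket (tag) of an entry to its text content."""
    root = ET.fromstring(xml_text.strip())
    return {child.tag: (child.text or "").strip() for child in root}

record = microstructure(ENTRY)
print(record)
```

Each tag plays the role of a field label in the microstructure, so the parsed entry corresponds to one row of a flat lexical table.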
There are four major (and many minor) prerequisites to the design of any lexicographic database:
The prerequisites are given in order of importance from the lexicographic point of view. In practice, of course, lower-order practical constraints such as price, availability, and the databases or computing platforms already in use may force higher-order choices. For example, the choice of a Database Management System (DBMS) may be determined by the availability of a proprietary system such as Access, Paradox or Oracle, or of the Shoebox basic lexicographic database system distributed by the Summer Institute of Linguistics (SIL). DBMS specification is the implementation-level analogue of macrostructure specification: the choice is between a flat DBMS (though perhaps with hierarchical records, like Shoebox), a relational DBMS with a main relation and sub-relations, an object-oriented DBMS, a hybrid relational-object-oriented DBMS, or a hypertext document.
The choice of DBMS aside, the first main decision is the definition of the appropriate macrostructure and its mapping onto the record structure of the DBMS, with specifications such as the following: semasiological (orthographic list vs. pronunciation lexicon ...) vs. onomasiological (synonym list vs. hierarchical thesaurus ...) vs. multilingual lexicon ... The macrostructure specification thus determines the basic unit represented by the database record.
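The effect of the macrostructure on the basic record unit can be sketched as follows. The sample forms, the concept labels and the function name `to_onomasiological` are invented for illustration only. In the semasiological view each record is keyed by a word form; in the onomasiological view each record is keyed by a concept.

```python
from collections import defaultdict

# Semasiological view: one record per word form, listing its concepts.
# Forms and concept labels are hypothetical examples.
semasiological = {
    "couch": ["SEAT_FOR_SEVERAL"],
    "sofa": ["SEAT_FOR_SEVERAL"],
    "bank": ["FINANCIAL_INSTITUTION", "RIVER_EDGE"],
}

def to_onomasiological(form_to_concepts):
    """Invert a form-keyed lexicon into a concept-keyed synonym list."""
    concept_to_forms = defaultdict(list)
    for form, concepts in form_to_concepts.items():
        for concept in concepts:
            concept_to_forms[concept].append(form)
    # Sort the forms so that each record has a stable order.
    return {concept: sorted(forms) for concept, forms in concept_to_forms.items()}

onomasiological = to_onomasiological(semasiological)
print(onomasiological["SEAT_FOR_SEVERAL"])
```

The same lexical information yields three records in the first view and three differently keyed records in the second, which is why the macrostructure has to be fixed before the record structure is designed.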
The linguistic specification phase is of primary importance in the present context. At the very least, the linguistic content of the database must be known; ideally, a comprehensive specification of the lexical organisation and of the types of information should be available.
The microstructure definition completes the linguistic specification. It is the most difficult part of the procedure, since it involves detailed linguistic analysis. Typical questions to be resolved include morphological paradigm definition (e.g. standard inflectional categories); lemmatisation (i.e. extraction of a canonical reference form from morphological variants); syntactic analysis (definition of a part of speech set, with carefully chosen granularity of subcategories such as VERB, VERB_TRANSITIVE, VERB_DITRANSITIVE...); semantic analysis (semantic components, relations, fields, frames etc.); and pragmatic analysis (functional, dialectal, sociolinguistic usage). A microstructure corresponds to what is traditionally known as `types of lexical information', and may vary from simple glossary or spelling-pronunciation tables to vectors of theoretically well-founded categories, as in the following selection:
Classical theoretical lexicology as represented by the work of Fillmore (modified from [Fillmore 1971], p. 370):
Contemporary formal sign-based lexical microstructure as in HPSG [Pollard & Sag 1987], p. 108; the boxed indices denote shared substructures (Figure 5).

Figure 5: Attribute-value structure for HPSG 1987.
In a lexicographic database, only the vector of most deeply embedded values would be used; the hierarchical structure would not be directly represented but `squashed' into a flat value vector. Complex objects could then be represented as sub-relations for the purpose of describing cross-referencing (re-entrancy, structure sharing).
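The `squashing' step can be sketched as a recursive flattening of a nested attribute-value structure. The structure below is a much simplified, hypothetical sign and is not a reproduction of Figure 5; the function name `squash` and the use of `|` as a path separator are assumptions made for this illustration. Structure sharing (re-entrancy) is not handled here; as noted above, it would need sub-relations.

```python
def squash(avm, prefix=()):
    """Flatten a nested attribute-value structure into path/value pairs."""
    flat = {}
    for attr, value in avm.items():
        path = prefix + (attr,)
        if isinstance(value, dict):
            # Descend into embedded structures; only leaf values are kept.
            flat.update(squash(value, path))
        else:
            flat["|".join(path)] = value
    return flat

# Hypothetical, simplified sign; attribute names are illustrative only.
sign = {
    "PHON": "walks",
    "SYN": {"LOC": {"HEAD": {"MAJ": "V", "VFORM": "FIN"}}},
    "SEM": {"CONT": {"RELN": "walk"}},
}

flat = squash(sign)
value_vector = list(flat.values())
print(flat)
print(value_vector)
```

The path names correspond to the columns of a flat relation, and `value_vector` is the vector of most deeply embedded values that would be stored in one record.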
The later version of HPSG [Pollard & Sag 1994], p. 82, simplifies the outer levels of this structure (Figure 6).

Figure 6: Attribute-value structure for HPSG 1994.
Lexical semantic microstructure, as in Pustejovsky's Generative Lexicon Theory (for feature structure details, see [Pustejovsky 1995]).
The following is an example of a Generative Lexicon microstructure (p. 82), which uses essentially the same formalism as HPSG (Figure 7):

Figure 7: Attribute-value matrix for Generative Lexicon Theory.