The source document is specified according to the general document model, i.e. according to the minimal set of three categories already discussed:
Assuming that documents are to be produced in various forms and in various media, each with a different subset of the available information, a source document is needed in which the required information is stored according to the abstract structure of the document and with an appropriate storage layout.
The source document will then be used as the input to a set of filters, each of which converts the document into a different target document with a different Information Content selection, a different substructure of the Abstract Structure, and a different Layout Format, perhaps designed for a different medium.
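The filter idea can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the sample entries, the field names, the `Filter` class and the two renderers are hypothetical and are not taken from any particular documentation toolkit. Each filter bundles the three choices named above: a selection of Information Content (fields), a substructure of the Abstract Structure (here simply an ordering of the records), and a Layout Format (a rendering function).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Entry = Dict[str, str]

# Hypothetical source document: one record per lexical entry.
# Pronunciations are given in SAMPA notation (an assumption of this sketch).
SOURCE: List[Entry] = [
    {"headword": "run", "pos": "verb", "gloss": "move fast on foot", "pron": "rVn"},
    {"headword": "cat", "pos": "noun", "gloss": "feline animal", "pron": "k{t"},
]


def render_dictionary(rows: List[Entry]) -> str:
    """Layout Format for a printed wordlist: one entry per line."""
    return "\n".join(f"{r['headword']} ({r['pos']}): {r['gloss']}" for r in rows)


def render_tsv(rows: List[Entry]) -> str:
    """Layout Format for data exchange: tab-separated values with a header."""
    if not rows:
        return ""
    header = "\t".join(rows[0].keys())
    body = ["\t".join(r.values()) for r in rows]
    return "\n".join([header] + body)


@dataclass
class Filter:
    fields: Tuple[str, ...]                # Information Content selection
    sort_key: str                          # substructure: ordering of records
    render: Callable[[List[Entry]], str]   # Layout Format

    def apply(self, source: List[Entry]) -> str:
        # Keep only the selected fields, order the records, then lay them out.
        selected = [{f: entry[f] for f in self.fields} for entry in source]
        selected.sort(key=lambda entry: entry[self.sort_key])
        return self.render(selected)


# Two target documents derived from the same source document.
print_filter = Filter(("headword", "pos", "gloss"), "headword", render_dictionary)
tsv_filter = Filter(("headword", "pron"), "headword", render_tsv)

printed = print_filter.apply(SOURCE)
exchange = tsv_filter.apply(SOURCE)
```

The source document itself is never modified. Adding a further target document, for another medium or another information selection, would only require defining another filter.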
Linguists documenting languages have traditionally used many different kinds of format, ranging from the very informal to the highly formalised. Examples of different kinds of source document, roughly in order of systematicity, are:
For example, the Abstract Structure of a lexical database will typically be a table or matrix.
An appropriate Layout Format for a tabular Abstract Structure could be a relation table in a relational database (in practice, the structure varies in detail in order to accommodate polysemy, pronunciation variants, etc.). Such a tabular format would represent the Abstract Structure of a lexicon very faithfully at the Layout Format level.
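As a rough illustration of a relational Layout Format, the following sketch uses Python's standard `sqlite3` module. The table names, column names and sample entry are assumptions made for this example only. It shows one common way of handling the structural variation mentioned above: pronunciation variants are moved into a second table linked to the main lexicon table, so that one entry can have several pronunciations (given here in SAMPA notation).

```python
import sqlite3


def build_lexicon() -> sqlite3.Connection:
    """Create a small in-memory lexicon database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    # Main table: one row per lexical entry, one column per information type.
    conn.execute(
        """CREATE TABLE lexicon (
               id INTEGER PRIMARY KEY,
               headword TEXT NOT NULL,
               pos TEXT NOT NULL,
               gloss TEXT NOT NULL
           )"""
    )
    # Separate table for pronunciation variants (one entry, many variants).
    conn.execute(
        """CREATE TABLE pronunciation (
               entry_id INTEGER NOT NULL REFERENCES lexicon(id),
               sampa TEXT NOT NULL
           )"""
    )
    cur = conn.execute(
        "INSERT INTO lexicon (headword, pos, gloss) VALUES (?, ?, ?)",
        ("either", "determiner", "one or the other of two"),
    )
    entry_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO pronunciation (entry_id, sampa) VALUES (?, ?)",
        [(entry_id, "aID@"), (entry_id, "i:D@")],
    )
    conn.commit()
    return conn


def variants(conn: sqlite3.Connection, headword: str) -> list:
    """Return all pronunciation variants recorded for a headword."""
    rows = conn.execute(
        """SELECT p.sampa
           FROM lexicon AS l
           JOIN pronunciation AS p ON p.entry_id = l.id
           WHERE l.headword = ?
           ORDER BY p.sampa""",
        (headword,),
    ).fetchall()
    return [sampa for (sampa,) in rows]


conn = build_lexicon()
either_variants = variants(conn, "either")
```

Polysemy could be treated in the same way, with a separate table of senses linked to the entry. The single-table view of the lexicon then remains available as a join over the linked tables.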