
DATR syntax

The syntax of DATR expresses three kinds of hierarchical structure:

  1. Syntagmatic:
    1. Nested attribute-value structures (used here to represent ID relations between signs and the assignment of properties to signs),
    2. Hierarchies of sequences, with property percolation through the hierarchy expressed by `local inheritance', and lexical insertion expressed by `global inheritance'.
  2. Paradigmatic: class inclusion (or implication) hierarchies expressed by local inheritance.

In DATR, nested AVMs are represented as nodes paired with conjunctions of equations. The left-hand side of each equation is an attribute path with attributes represented as atoms:

<struc parts modi int surf>

The right-hand side is a sequence of value expressions, each of which is either an atom or an inheritance descriptor. There are two main kinds of inheritance descriptor, those which denote local inheritance and those which denote global inheritance. In each case there are three subtypes of descriptor, which constrain inheritance from different positions in the inheritance hierarchy: by specification of a node-path pair, a node alone, or a path alone. This gives seven cases in all (atomic value expressions plus the three types each of local and global inheritance), with one inference rule for each case, i.e. seven inference rules in total.
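The seven cases can be told apart from the surface notation alone. The following is a minimal sketch, not part of DATR itself: the function name `classify` is hypothetical, it assumes that node names begin with a capital letter, that double quotes mark global inheritance, and it handles only flat paths (no nested descriptors inside a path).

```python
import re

# Assumed conventions: nodes start with a capital letter, paths are <...>.
NODE = r"[A-Z][A-Za-z0-9_]*"
PATH = r"<[^<>]*>"          # flat paths only; nested descriptors are not handled

def classify(descriptor: str) -> str:
    """Return which of the seven right-hand-side cases a descriptor belongs to."""
    text = descriptor.strip()
    scope = "local"
    # Double quotes mark global inheritance; single quotes merely quote an atom.
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        scope, text = "global", text[1:-1].strip()
    if re.fullmatch(rf"{NODE}:{PATH}", text):
        return f"{scope} node:path"
    if re.fullmatch(NODE, text):
        return f"{scope} node"
    if re.fullmatch(PATH, text):
        return f"{scope} path"
    if scope == "global":
        raise ValueError(f"not a recognised global descriptor: {descriptor!r}")
    return "atom"

for example in ["salix", "A:<b c d>", "A", "<b c d>",
                '"A:<b c d>"', '"A"', '"<b c d>"']:
    print(f"{example:14} {classify(example)}")
```

Under these assumptions an unquoted capitalised word would be read as a node, which is one reason for quoting an atom such as ' RESEMBLE '.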

An important feature of DATR is that paths on the right-hand side are evaluable, that is, they have exactly the same formal structure as an entire right-hand side sequence, and may thus contain any value expressions, not just atoms. In particular, they may include other paths, which may in turn include nested value expressions, and so on.
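As a rough illustration of evaluable paths, the sketch below models a path as a Python list whose items are either atoms (strings) or nested paths (lists). Nested paths are evaluated innermost first and their values are spliced into the enclosing path. The table `VALUES`, the function `resolve` and the attributes `level` and `form` are hypothetical, and the lookup is a plain dictionary rather than real DATR inheritance.

```python
# Hypothetical value table standing in for a DATR theory: path -> value sequence.
VALUES = {
    ("level",): ["surf"],
    ("form",): ["orth"],
    ("int", "surf", "orth"): ["willow"],
}

def resolve(path):
    """Evaluate a possibly nested path and return its value sequence."""
    flat = []
    for item in path:
        if isinstance(item, list):
            # A nested path is itself evaluated; its value becomes part of the path.
            flat.extend(resolve(item))
        else:
            flat.append(item)
    return VALUES[tuple(flat)]

# The two inner paths evaluate to `surf` and `orth`, so the outer path
# becomes <int surf orth> before it is looked up.
print(resolve(["int", ["level"], ["form"]]))
```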

A selective version of the initial example pussy_willow, incorporating local (paradigmatic) and global (syntagmatic) inheritance, can be rendered in DATR as follows, with the IPA transcription characters rendered in a slightly modified version of the SAMPA ASCII coding of Wells (cf. [Wells 1989]), in which `/' is used to denote lexical stress:

% Query definitions (node-path pairs):
% All nodes except those declared under `hide',
% combined with all paths declared under `show':
# hide Noun Compound_noun .
# show <int mean> <int surf> .

% Lexical entry ranks (simplex and compound nouns):

Willow:
  <>                     == Noun
  <int mean qualia reln> == salix
  <int surf phon>        == 'w/Il@U'
  <int surf orth>        == willow.

Pussy:
  <>                     == Noun
  <int mean qualia reln> == felis
  <int surf phon>        == 'pUsI'
  <int surf orth>        == pussy.

Pussy_willow:
  <>                     == Compound_noun
  <struc parts head>     == "Willow:<>"
  <struc parts modi>     == "Pussy:<>"
  <int mean qualia reln> == ' RESEMBLE '
  <int surf reln orth>   == '-'.

% Paradigmatic inheritance hierarchy (<int surf reln> has default null value):

Compound_noun:
  <>               == Noun
  <int surf reln > == 
  <int mean>       == "<int mean qualia reln>" '('
                      "<struc parts head int mean qualia reln>" ,
                      "<struc parts modi int mean qualia reln>" ')'
  <int surf>       == "<struc parts modi int surf>"
                      "<int surf reln>"
                      "<struc parts head int surf>".

Noun:
  <>               ==
  <int mean>       == "<int mean qualia reln>".

The empty path, which appears as a left-hand side under each node, is the path with no attributes specified. It is the most general path, and its equation indicates the inheritance path to the next more general node or class. Any value explicitly specified in an equation associated with the current class overrides the value specified for the same attributes at a higher (more general) node. In this case the INT values are exhaustively specified, so only information about the category itself is locally inherited.
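The override behaviour described above can be sketched as a longest-prefix match: the most specific left-hand side that is a prefix of the queried path wins, and the empty path matches everything as a last resort. This is a minimal sketch under the assumption that paths are flat tuples of atoms; the names `WILLOW` and `longest_prefix` are hypothetical.

```python
# The equations of the node Willow, with paths written as tuples of atoms.
WILLOW = {
    (): "Noun",
    ("int", "mean", "qualia", "reln"): "salix",
    ("int", "surf", "phon"): "w/Il@U",
    ("int", "surf", "orth"): "willow",
}

def longest_prefix(equations, query):
    """Return (matching left-hand side, unmatched rest of the query), or None."""
    matches = [lhs for lhs in equations if query[:len(lhs)] == lhs]
    if not matches:
        return None
    best = max(matches, key=len)
    return best, query[len(best):]

# Explicitly specified: the full path matches and nothing is left over.
print(longest_prefix(WILLOW, ("int", "surf", "orth")))
# Not specified at Willow: only the empty path matches, so the whole
# query is passed on to the more general node named on its right-hand side.
print(longest_prefix(WILLOW, ("int", "mean")))
```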

Information about the parts is globally inherited from each part lemma, the head Willow and the modifier Pussy. In HPSG terms, the HEAD features are inherited from the head or HEAD-DTR, and the COMP features are inherited from the modifier or COMP-DTRS.

Global inheritance means that the parts concerned are treated quite independently of each other and of the larger unit, ensuring compositionality (which can be modified if necessary for descriptive reasons).

Among the DATR equations that can be derived from the theory are the following:

Pussy:< int mean > = felis .
Pussy:< int surf phon > = pUsI .
Pussy:< int surf orth > = pussy .
Willow:< int mean > = salix .
Willow:< int surf phon > = w/Il@U .
Willow:< int surf orth > = willow .
Pussy_willow:< int mean > =  RESEMBLE  ( salix , felis ) .
Pussy_willow:< int surf phon > = pUsI w/Il@U .
Pussy_willow:< int surf orth > = pussy - willow .
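
These derivations can be reproduced with a toy interpreter. The sketch below is not a DATR implementation: it assumes flat paths (no evaluable paths), omits the `hide`/`show` declarations, has no cycle detection, and its treatment of node-only global descriptors is an assumption. The helper names `p`, `L`, `G` and `evaluate` are hypothetical.

```python
def p(text):
    """Turn 'int surf orth' into the path tuple ('int', 'surf', 'orth')."""
    return tuple(text.split())

def L(node=None, path=None):          # local inheritance descriptor
    return ("local", node, None if path is None else p(path))

def G(node=None, path=None):          # global (double-quoted) descriptor
    return ("global", node, None if path is None else p(path))

# The theory from the text; a plain string on a right-hand side is an atom.
THEORY = {
    "Willow": {p(""): [L("Noun")],
               p("int mean qualia reln"): ["salix"],
               p("int surf phon"): ["w/Il@U"],
               p("int surf orth"): ["willow"]},
    "Pussy": {p(""): [L("Noun")],
              p("int mean qualia reln"): ["felis"],
              p("int surf phon"): ["pUsI"],
              p("int surf orth"): ["pussy"]},
    "Pussy_willow": {p(""): [L("Compound_noun")],
                     p("struc parts head"): [G("Willow", "")],
                     p("struc parts modi"): [G("Pussy", "")],
                     p("int mean qualia reln"): [" RESEMBLE "],
                     p("int surf reln orth"): ["-"]},
    "Compound_noun": {
        p(""): [L("Noun")],
        p("int surf reln"): [],
        p("int mean"): [G(path="int mean qualia reln"), "(",
                        G(path="struc parts head int mean qualia reln"), ",",
                        G(path="struc parts modi int mean qualia reln"), ")"],
        p("int surf"): [G(path="struc parts modi int surf"),
                        G(path="int surf reln"),
                        G(path="struc parts head int surf")]},
    "Noun": {p(""): [],
             p("int mean"): [G(path="int mean qualia reln")]},
}

def evaluate(node, path, gnode=None, gpath=None):
    """Return the value sequence of node:<path> as a list of atoms."""
    if gnode is None:
        gnode, gpath = node, path     # a query sets the initial global context
    equations = THEORY[node]
    matches = [lhs for lhs in equations if path[:len(lhs)] == lhs]
    if not matches:
        raise KeyError(f"{node}:<{' '.join(path)}> is undefined")
    lhs = max(matches, key=len)       # the most specific left-hand side wins
    ext = path[len(lhs):]             # unmatched rest, appended to inherited paths
    result = []
    for item in equations[lhs]:
        if isinstance(item, str):
            result.append(item)
            continue
        kind, inode, ipath = item
        if kind == "local":           # global context is passed on unchanged
            new_path = (lhs if ipath is None else ipath) + ext
            result += evaluate(inode or node, new_path, gnode, gpath)
        else:                         # restart from the global context
            new_node = inode or gnode
            # Assumption: a node-only global descriptor reuses the global path.
            new_path = gpath if ipath is None else ipath + ext
            result += evaluate(new_node, new_path, new_node, new_path)
    return result

for node in ["Pussy", "Willow", "Pussy_willow"]:
    for path in ["int mean", "int surf phon", "int surf orth"]:
        print(f"{node}:<{path}> = {' '.join(evaluate(node, p(path)))} .")
```

Apart from spacing, the printed lines should correspond to the nine equations listed above.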

 

DATR operation                 DATR notation   AVM notation
Local node:path inheritance    A:<b c d>       [formula lost in conversion]
Local node inheritance         A               A special case of local node:path inheritance
Local path inheritance         <b c d>         [formula lost in conversion] (also a special case of local node:path inheritance)
Global node:path inheritance   "A:<b c d>"     [formula lost in conversion]
Global node inheritance        "A"             Rarely used.
Global path inheritance        "<b c d>"       [formula lost in conversion]

Table 3: Inheritance operations.

The DATR inheritance rules were the starting point for the definition of the paradigmatic and syntagmatic inheritance relations used in the AVM-based theory introduced in the preceding sections. For this reason, there is a simple mapping between the inheritance and compositionality operators used in the AVM theory, and the six inheritance operations defined for DATR, though not all the DATR possibilities are exhausted in the AVM theory (see Table 3). Atomic values are basically the same in each formalism.



Dafydd Gibbon
Fri Mar 21 14:01:22 MET 1997