next up previous contents
Next: Specification of the Token Up: File Specification Previous: File Specification

Specification of the DATR theory file

 

The input file to zdatrtok should satisfy the format described in BNF:

  <theory>      ::= <atomexeceptions> <theory> |
                    <nodeexeceptions> <theory> |
                    <variablelist> <theory> |
                    <sentence> <theory> |
                    <sentence>
 
  <atomexeceptions>::= # atom <itemlist> .
  <nodeexeceptions>::= # node <itemlist> .
  <itemlist>    ::= <item> | <item> <itemlist>
  <item>        ::= <char_string> | ' <any_string> '
 
  <variablelist>::= #vars <variabledefinition> .
  <variabledefinition> ::= <variablename> : <list_of_atoms>
  <list_of_atoms>::= atom <list_of_atoms> | atom	
 
  <sentence>    ::=  <node> : <eqseq> .
  <eqseq>       ::=  <equation> | <equation> <eqseq>
  <equation>    ::=  <lhs> <rhs>

  <lhs>         ::=  < <atomseq> >
  <rhs>         ::=  <extrhs> | <defrhs>

  <extrhs>      ::=  = <atomval>
  <atomval>     ::=  <atomseq> | ( <atomseq> )
  <atomseq>     ::=  <epsilon> | <atom> <atomseq>

  <defrhs>      ::=  == <descval>
  <descval>     ::=  <descseq> | ( <descseq> )
  <descseq>     ::=  <epsilon> | <desc> <descseq>

  <desc>        ::=  <atom> | <pointer>
  <pointer>     ::=  " <spec> " | <spec>
  <spec>        ::=  <node> : <descpath> | <node> | <descpath>
  <descpath>    ::=  < <descseq> >

  <node>        ::=  <upper_case> <char_string>
  <atom>        ::=  <not_upper_case> <char_string> | ' <any_string> '
  <variablename>::=  $ <char> <charstring>

  <char_string> ::=  <epsilon> | <char> <char_string>
  <any_string>  ::=  <epsilon> | <any> <any_string>

  <res_char>    ::=  [!:<>"()='.%] 
  <char>        ::=  ! [0...256] | [ASCII(33)...ASCII(127)]-[<reschar>]
  <any>         ::=  [ASCII(9)...ASCII(127)]-[']
  <upper_case>  ::=  [A-Z]
  <not_upper_case> ::= [<char>] - [<upper_case>]
Comments are allowed, everything between the comment marker '%' and the next linefeed will be ignored. To create control characters for formated output, an exclamation mark '!' sequence can be used:

Sequence Translation
!t represents a tabulator.
!n represents a newline.
!! represents a ! (same as !33, but do not mix them).
!X Xtex2html_wrap_inline505{1...255}, ASCII(X)
!7 e.g. ASCII(7), the Bell
Please note that the definition of variables as defined by Evans & Gazdar is not yet fully implemented: the '-' operator and other set arithmetic operators in variable declarations is not supported in the current version.

Atoms that do not satisfy the standard definition of atoms can be defined with the # atom tag. Non-standard nodes can be defined with the # node tag.

Any number of WHITESPACE characters may appear between the # and the three tags, but note that a declaration continues to the next non-quoted dot. Dots within quoted symbols are ignored.

Although the tokenizer can recognize many errors and shows them together with filenumber, description and context, it is possible to produce misleading error messages due to the way the top down parser processes the input. Nevertheless, the line number yields the correct location of the error.

Sample theory and demo files are included in this distribution in the directory zdatr/example/.



Dafydd Gibbon
Fri Mar 21 17:58:24 MET 1997