
Database structure

The lexicon database currently defines a flat relation over unique orthographic lexical keys; these keys are used to join the separate databases submitted by projects A3, B1, C1 and D1.

The databases submitted by these projects first undergo format-normalising transformations, including the following:

  1. Some redundant information is removed.
  2. The interdependent values for one project are joined into a single vector value, with components linked by commas.
  3. Multiple occurrences of an orthographic key are reduced to a single entry by compressing their entries into a disjunction of value vectors linked by semicolons (the project-oriented attributes are currently independent, so no information is lost).
  4. The vector disjunctions are joined into a single large database file, with attributes separated by a single space (however, tab and space sequences are reserved).
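
Steps 2 to 4 above can be illustrated with a minimal Python sketch. It is not the project's actual tooling: the function names `compress_project` and `join_projects`, the sample entries, and the `-` placeholder for a key that a project does not cover are all assumptions made for illustration. The sketch also assumes that values contain no spaces, commas or semicolons.

```python
def compress_project(rows):
    """Steps 2 and 3 for one project (hypothetical helper).

    rows: iterable of (key, values) pairs, where values is a list of
    strings holding the interdependent values of one entry.
    Returns a dict mapping each key to its vector disjunction.
    """
    merged = {}
    for key, values in rows:
        vector = ",".join(values)                  # step 2: comma-linked vector
        merged.setdefault(key, []).append(vector)
    # step 3: one entry per key, vectors linked by semicolons
    return {key: ";".join(vectors) for key, vectors in merged.items()}


def join_projects(projects, missing="-"):
    """Step 4: join per-project dicts on the orthographic key.

    projects: list of dicts from compress_project, in a fixed order.
    missing: placeholder for absent keys (an assumption; the source
    does not say how gaps are represented).
    Returns one line per key, attributes separated by a single space.
    """
    keys = sorted(set().union(*projects))
    return [" ".join([key] + [p.get(key, missing) for p in projects])
            for key in keys]


# Invented sample data standing in for two project databases.
a3 = compress_project([("Haus", ["N", "neut"]), ("Haus", ["N", "pl"])])
b1 = compress_project([("Haus", ["haUs"]), ("gehen", ["ge.hen"])])

for line in join_projects([a3, b1]):
    print(line)
```

Because each project contributes exactly one space-separated field per line, the position of a field identifies its project, which is what makes the resulting relation flat.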


Dafydd Gibbon
Sat Mar 23 23:18:08 MET 1996