
A conventional database record structure

Computational lexicography tends to concentrate, often without much reflection, on the major European languages and on other languages which promise heavy funding. In many ways, however, the need is greater for other languages of the world, as is the linguistic and applied interest (for example, in prosthetic devices for the handicapped). The following lexicographic database extract for Anyi, a language of the Kwa/Tano family spoken in the eastern Ivory Coast, is taken from a comparative lexicographic database of Anyi dialects which the author is preparing in cooperation with the Université de Cocody, Abidjan, Ivory Coast.

The example record is formatted in Shoebox notation: all characters are ASCII, and each \xx tag is an attribute, field or column name, followed by the value of that field (here: lexeme, canonical form, English gloss, French gloss, German gloss, phonological representation, part of speech, date). A record always begins with the \lx attribute.

    \lx AKO
    \lc [...]
    \ge chicken
    \gf poulet
    \gd Huhn
    \ph ak[...] HH
    \ps N
    \dt 09/03/1998
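
The record layout can be illustrated with a minimal sketch of a reader for this notation. It assumes that each marker is a backslash followed by a two-letter field code, that the value runs to the end of the line, and that \lx opens a new record, as stated above. The names `FIELD_ORDER`, `parse_records` and `to_row` are hypothetical and are not part of Shoebox or of the author's database; the sample leaves out the \lc and \ph fields.

```python
# Field codes in the order used by the example record (hypothetical constant).
FIELD_ORDER = ["lx", "lc", "ge", "gf", "gd", "ph", "ps", "dt"]


def parse_records(text):
    """Split Shoebox-style text into a list of {marker: value} dicts."""
    records = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line.startswith("\\"):
            continue  # skip blank lines and anything without a marker
        marker, _, value = line[1:].partition(" ")
        if marker == "lx":
            records.append({})  # assumption: \lx always opens a new record
        if records:
            records[-1][marker] = value.strip()
    return records


def to_row(record):
    """Flatten one record into a fixed-order row; missing fields become ''."""
    return [record.get(marker, "") for marker in FIELD_ORDER]


SAMPLE = r"""
\lx AKO
\ge chicken
\gf poulet
\gd Huhn
\ps N
\dt 09/03/1998
"""

records = parse_records(SAMPLE)
print(to_row(records[0]))
```

Keeping each record as a dictionary and flattening it only on output means that a record with missing fields still yields a row of constant width, which is what a column-oriented table of records requires.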

 

Field no.    1     2      3        4       5      6            7    8
Field name   \lx   \lc    \ge      \gf     \gd    \ph          \ps  \dt
Record 1     AKO   [...]  chicken  poulet  Huhn   ak[...] HH   N    09/03/1998
Record 2     ...   ...    ...      ...     ...    ...          ...  ...
Table 2: Database record structure (microstructure)

 

French gloss   Anyi base form
aimer          k`ul´o
allumer
ami            mija g´
Table 3: Anyi base forms with French gloss.

 

Develarisation
    Base: [kp]
    kp → p (Indenié)
    kp → p / V _ V (not Indenié)
Bona Rhotacisation, Sanwi Derhotacisation
    Base = Indenié (l/r arbitrary non-contrastive alternation)
    Indenié phonotactic licensing: C (V) l/r V
    Bona phonotactic licensing: C (V) r V
        Rule: l → r
    Sanwi phonotactic licensing: C (V) l V
        Rule: r → l
Final palatal nasal deletion (not Indenié)
    Base = Indenié
    Rule: ɲ → 0 / _ #
Table 4: Output filter for Anyi comparative dialect database.

 

No.  [No.]  French   Anyi base form  Indenié  Bona     Moronou  Tiassalé  Sanwi
9.   [116]  aimer    k`ul´o          k`ul´o   k`ur´o   k`ul´o   k`ul´o    k`ul´o
10.  [109]  allumer
11.  [31]   ami      mija g´         mija g´  mija g´  mija g´  mija g´   mija g´
Table 5: Extract from full Anyi comparative database.

The record structure, and a specimen record, are shown in Table 2.

The kind of microstructural information required in a comparative dialect lexicon, which is one variety of multilingual database, is shown in Table 3. This is in fact a highly generalised database: it contains only the base forms which are common to all Anyi dialects.

The database of common base forms is supplemented, as any database can be, by output filters; in the present case, these are dialect realisation rules, of which a selection is given in Table 4 (the names are of Anyi dialects).

Using the base form database and the dialect rules implemented with UNIX tools (see Section 4.1), the more conventional dialect database excerpted in Table 5 was generated automatically.
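
The generation step can be sketched as follows. This is an illustration only: the author's implementation uses UNIX tools, whereas Python is substituted here, and the names `DIALECT_RULES`, `realise` and `expand` are hypothetical. The sketch covers only the two l/r rules of Table 4 and applies them without checking the phonotactic context C (V) _ V; develarisation and final palatal nasal deletion are left out. It also assumes that Indenié, Moronou and Tiassalé realise the base form unchanged as far as these two rules are concerned.

```python
import re

# Ordered (pattern, replacement) pairs per dialect.
# Assumption: only the Bona and Sanwi l/r rules are modelled, and they are
# applied unconditionally rather than in the licensing context C (V) _ V.
DIALECT_RULES = {
    "Indenié": [],
    "Bona": [(re.compile("l"), "r")],   # rhotacisation: l -> r
    "Moronou": [],
    "Tiassalé": [],
    "Sanwi": [(re.compile("r"), "l")],  # derhotacisation: r -> l
}


def realise(base_form, dialect):
    """Apply the output filter of one dialect to a base form."""
    form = base_form
    for pattern, replacement in DIALECT_RULES[dialect]:
        form = pattern.sub(replacement, form)
    return form


def expand(base_form):
    """Generate one row of dialect forms from a single base form."""
    return {dialect: realise(base_form, dialect) for dialect in DIALECT_RULES}


row = expand("k`ul´o")
print(row["Bona"])  # prints k`ur´o
```

Storing the rules as data rather than as code keeps the base form database independent of the filters, so that a further dialect can be added by extending the rule table alone.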



Dafydd Gibbon
Thu Nov 19 10:12:05 MET 1998