
Encoding of morphological information

The English verb can appear in an absolute maximum of eight distinct forms:

root  be
form1 am
form2 are
form3 is
form4 was
form5 were
form6 been
form7 being

Regular English verbs only appear in four distinct forms, all of them predictable from the root:

root  stamp
form1 stamp
form2 stamp
form3 stamps
form4 stamped
form5 stamped
form6 stamped
form7 stamping

Macro mor_regV:
    <mor form1 stem> = <mor root>
    <mor form1 suffix> = E
    <mor form2 stem> = <mor root>
    <mor form2 suffix> = E
    <mor form3 stem> = <mor root>
    <mor form3 suffix> = s
    <mor form4 stem> = <mor root>
    <mor form4 suffix> = ed
    <mor form5 stem> = <mor root>
    <mor form5 suffix> = ed
    <mor form6 stem> = <mor root>
    <mor form6 suffix> = ed
    <mor form7 stem> = <mor root>
    <mor form7 suffix> = ing.
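
The effect of the macro can be sketched in Python. This is only an illustration, not part of the lexicon formalism: the names `mor_reg_v` and `spell` are hypothetical, the suffix written E is assumed to stand for the empty suffix, and spelling is assumed to be plain concatenation of stem and suffix.

```python
# Hypothetical Python rendering of the mor_regV macro.
# Assumption: the suffix written E in the notes is the empty suffix.
EMPTY = ""

REGULAR_SUFFIXES = {
    "form1": EMPTY,
    "form2": EMPTY,
    "form3": "s",
    "form4": "ed",
    "form5": "ed",
    "form6": "ed",
    "form7": "ing",
}


def mor_reg_v(root):
    """Return the <mor ...> information that mor_regV adds for a given root."""
    mor = {"root": root}
    for form, suffix in REGULAR_SUFFIXES.items():
        # Every stem is identified with the root; only the suffix varies.
        mor[form] = {"stem": root, "suffix": suffix}
    return mor


def spell(mor, form):
    """Naive spelling: concatenate stem and suffix without any adjustment."""
    entry = mor[form]
    return entry["stem"] + entry["suffix"]


stamp = mor_reg_v("stamp")
forms = [spell(stamp, f"form{i}") for i in range(1, 8)]
print(forms)
# ['stamp', 'stamp', 'stamps', 'stamped', 'stamped', 'stamped', 'stamping']
```

The seven forms collapse to the four distinct spellings of a regular verb.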

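(We still have to implement the spelling rule that makes form 7 of "love" come out as "loving" and not "loveing"; the same applies to forms 4 to 6, where plain concatenation gives "loveed".)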
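
One possible way to handle this is a spelling rule applied when stem and suffix are joined. The sketch below is an assumption, not the treatment given in these notes: the function `join` is hypothetical, and it simply drops a stem-final "e" before a suffix that begins with a vowel.

```python
VOWELS = "aeiou"


def join(stem, suffix):
    """Join stem and suffix, dropping a stem-final 'e' before a vowel-initial suffix.

    Assumed rule for illustration only; it is not a complete account of
    English spelling.
    """
    if suffix and stem.endswith("e") and suffix[0] in VOWELS:
        return stem[:-1] + suffix
    return stem + suffix


print(join("love", "ing"))   # loving
print(join("love", "ed"))    # loved
print(join("love", "s"))     # loves
print(join("stamp", "ing"))  # stamping
```

The rule is deliberately crude: it would also turn "die" + "ing" into "diing" and "see" + "ing" into "seing", so further spelling rules would be needed for such verbs.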

Lexeme love:
    <mor root> = love
    mor_regV
    syn_tV
    <sem> = love2a.

The line encoding the root is redundant whenever the root is identical to the name of the lexical entry. We therefore adopt the convention that

Lexeme xxx:
    yyy
    ...
    zzz.

is an abbreviation for:

Lexeme xxx:
    <mor root> = xxx
    yyy
    ...
    zzz.

This leads to:

Lexeme love:
    mor_regV
    syn_tV
    <sem> = love2a.

This lexical entry contains information about all distinct morphological forms of the verb "love", the syntactic items it combines with or subcategorizes for, as well as semantic information.

Written out in full detail, the entry reads:

Lexeme love:
    <mor root> = love
    <mor form1 stem> = love
    <mor form1 suffix> = E
    <mor form2 stem> = love
    <mor form2 suffix> = E
    <mor form3 stem> = love
    <mor form3 suffix> = s
    <mor form4 stem> = love
    <mor form4 suffix> = ed
    <mor form5 stem> = love
    <mor form5 suffix> = ed
    <mor form6 stem> = love
    <mor form6 suffix> = ed
    <mor form7 stem> = love
    <mor form7 suffix> = ing
    <syn cat> = V
    <syn arg0 cat> = NP
    <syn arg0 case> = nom
    <syn arg1 cat> = NP
    <syn arg1 case> = acc
    <sem> = love2a.
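
The step from the abbreviated entry to the full one can be sketched as a small expansion procedure. The Python below is a hypothetical illustration: the function names are invented, the equations for syn_tV are copied from the expanded entry above, and only references to `<mor root>` are resolved.

```python
SUFFIXES = ["E", "E", "s", "ed", "ed", "ed", "ing"]


def mor_reg_v():
    """Equations contributed by mor_regV, in the order used in the notes."""
    eqs = []
    for i, suffix in enumerate(SUFFIXES, start=1):
        eqs.append((f"mor form{i} stem", "<mor root>"))
        eqs.append((f"mor form{i} suffix", suffix))
    return eqs


def syn_t_v():
    """Equations for syn_tV, copied from the fully expanded entry."""
    return [
        ("syn cat", "V"),
        ("syn arg0 cat", "NP"),
        ("syn arg0 case", "nom"),
        ("syn arg1 cat", "NP"),
        ("syn arg1 case", "acc"),
    ]


MACROS = {"mor_regV": mor_reg_v, "syn_tV": syn_t_v}


def expand(lexeme, body):
    """Expand an abbreviated entry into a flat list of (path, value) pairs."""
    eqs = []
    # Convention: the root defaults to the lexeme name unless given explicitly.
    if not any(isinstance(item, tuple) and item[0] == "mor root" for item in body):
        eqs.append(("mor root", lexeme))
    for item in body:
        if isinstance(item, tuple):
            eqs.append(item)            # an explicit equation
        else:
            eqs.extend(MACROS[item]())  # a macro call
    root = dict(eqs)["mor root"]
    # Replace every reference to <mor root> by the root itself.
    return [(path, root if value == "<mor root>" else value) for path, value in eqs]


def show(lexeme, eqs):
    """Print-ready form of an expanded entry, ending with a full stop."""
    lines = [f"Lexeme {lexeme}:"]
    lines += [f"    <{path}> = {value}" for path, value in eqs]
    return "\n".join(lines) + "."


love = expand("love", ["mor_regV", "syn_tV", ("sem", "love2a")])
print(show("love", love))
```

Under these assumptions the printed entry has the same 21 equations as the full entry for "love" above.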

Next we encode information about irregular verbs of the class mor_presV, for example

   eat, give
     -> regular present tense and present participle
     -> wholly idiosyncratic past tense forms
     -> past participles with "-en" suffixes

Macro mor_presV:
    <mor form1 stem> = <mor root>
    <mor form1 suffix> = E
    <mor form2 stem> = <mor root>
    <mor form2 suffix> = E
    <mor form3 stem> = <mor root>
    <mor form3 suffix> = s
    <mor form4 stem> = <mor form5 stem>
    <mor form4 suffix> = E
    <mor form5 suffix> = E
    <mor form6 stem> = <mor root>
    <mor form6 suffix> = en
    <mor form7 stem> = <mor root>
    <mor form7 suffix> = ing.
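
Note that the macro gives no value for the stems of form 4 and form 5; it only states that they are identical, so a lexical entry that supplies one of them fixes both. The Python sketch below illustrates this under stated assumptions: `solve` is a hypothetical, much simplified stand-in for unification (it does no conflict checking), and E is assumed to be the empty suffix.

```python
MOR_PRES_V = [
    ("mor form1 stem", "<mor root>"), ("mor form1 suffix", "E"),
    ("mor form2 stem", "<mor root>"), ("mor form2 suffix", "E"),
    ("mor form3 stem", "<mor root>"), ("mor form3 suffix", "s"),
    ("mor form4 stem", "<mor form5 stem>"), ("mor form4 suffix", "E"),
    ("mor form5 suffix", "E"),
    ("mor form6 stem", "<mor root>"), ("mor form6 suffix", "en"),
    ("mor form7 stem", "<mor root>"), ("mor form7 suffix", "ing"),
]


def solve(equations):
    """Resolve (path, value) equations; a value in <...> is another path.

    Path-to-path equations are read symmetrically: whichever side is known
    first supplies the value of the other.
    """
    known = {}
    pending = list(equations)
    progress = True
    while pending and progress:
        progress = False
        remaining = []
        for left, right in pending:
            if not right.startswith("<"):
                known[left] = right            # atomic value
                progress = True
            elif right[1:-1] in known:
                known[left] = known[right[1:-1]]
                progress = True
            elif left in known:
                known[right[1:-1]] = known[left]
                progress = True
            else:
                remaining.append((left, right))  # try again on the next pass
        pending = remaining
    return known


def spell(known, i):
    """Concatenate stem and suffix of form i, reading E as the empty suffix."""
    suffix = known[f"mor form{i} suffix"]
    return known[f"mor form{i} stem"] + ("" if suffix == "E" else suffix)


eat = solve([("mor root", "eat")] + MOR_PRES_V + [("mor form4 stem", "ate")])
eat_forms = [spell(eat, i) for i in range(1, 8)]
print(eat_forms)
# ['eat', 'eat', 'eats', 'ate', 'ate', 'eaten', 'eating']
```

The entry only states the stem of form 4, yet form 5 comes out as "ate" as well, because the two paths are equated in the macro.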

Now our lexicon looks like this:

Lexeme die:
    mor_regV
    syn_iV
    <sem> = die1a.

Lexeme elapse:
    mor_regV
    syn_iV
    <sem> = elapse1a.

Lexeme eat:
    mor_presV
    <mor form4 stem> = ate
    syn_iV
    <sem> = eat1a.

Lexeme eat:
    mor_presV
    <mor form4 stem> = ate
    syn_tV
    <sem> = eat2a.

Lexeme give:
    mor_presV
    <mor form4 stem> = gave
    syn_tV
    <sem> = give2a.

Lexeme give:
    mor_presV
    <mor form4 stem> = gave
    syn_dtV
    <sem> = give3a.

Lexeme give:  
    mor_presV
    <mor form4 stem> = gave
    syn_datV
    <sem> = give3b.

Lexeme hand: 
    mor_regV
    syn_dtV
    <sem> = hand3a.

Lexeme hand:
    mor_regV 
    syn_datV
    <sem> = hand3b.

Lexeme love:
    mor_regV
    syn_tV
    <sem> = love2a.
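
Since one verb may have several entries that differ only in syntax and semantics, the lexicon can be viewed as a list of records. The sketch below is a hypothetical Python encoding of the entries above; the tuple layout and the function `frames` are assumptions made for illustration.

```python
# (lexeme, morphology macro, explicit <mor form4 stem> or None, syntax macro, <sem>)
LEXICON = [
    ("die",    "mor_regV",  None,   "syn_iV",   "die1a"),
    ("elapse", "mor_regV",  None,   "syn_iV",   "elapse1a"),
    ("eat",    "mor_presV", "ate",  "syn_iV",   "eat1a"),
    ("eat",    "mor_presV", "ate",  "syn_tV",   "eat2a"),
    ("give",   "mor_presV", "gave", "syn_tV",   "give2a"),
    ("give",   "mor_presV", "gave", "syn_dtV",  "give3a"),
    ("give",   "mor_presV", "gave", "syn_datV", "give3b"),
    ("hand",   "mor_regV",  None,   "syn_dtV",  "hand3a"),
    ("hand",   "mor_regV",  None,   "syn_datV", "hand3b"),
    ("love",   "mor_regV",  None,   "syn_tV",   "love2a"),
]


def frames(lexeme):
    """Return the (syntax macro, semantics) pairs listed for a lexeme."""
    return [(syn, sem) for name, _, _, syn, sem in LEXICON if name == lexeme]


print(frames("give"))
# [('syn_tV', 'give2a'), ('syn_dtV', 'give3a'), ('syn_datV', 'give3b')]
```

The morphological information is the same in every entry for a given verb; only the syntactic macro and the semantic value change.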



Dafydd Gibbon
Thu Feb 12 11:04:00 MET 1998