Simplex annotations and complex annotations
DG, 2004-04-27, internal lexicography workshop

Description of a speech or language corpus
  General term: annotation
  Term used in speech technology: labelling
  Term used in text technology: markup

  A simplex annotation (e.g. a label) is an event, e.g. an occurrence of a word, phoneme, syllable or feature at a specific interval in a corpus.
  A complex annotation consists of one or more tiers of events.
  A tier is a sequence of events of the same type.

Allen relations
  Calculus of intervals: 13 relations

Event logic
  Axiomatic relational logic
  Event =def pair of a property and an interval
  Johan van Benthem (Amsterdam)
  Applied to phonology, in order to explicate autosegmental phonology, by Steven Bird and Ewan Klein (1989): event phonology

  Event: <property, interval>
         <<attribute, value>, interval>
         <<attribute, value>, <t_start, t_end>>

Examples of annotation
  xwaves (ESPS/waves+): <property, t_end>
    e.g. <table, 1030>
    Problem: the beginning is only implicit and has to be inferred by the user, or added in ad hoc fashion (implicit, partial interval definition).
  SAM: <property, t_start, t_end>
    e.g. <table, 1030, 1659>, corresponding to <<orth, table>, <1030, 1659>>
  Praat: same as SAM, but with its own notation
  TASX: same as SAM, but with XML notation

How does this relate to the lexicon?

Lexical acquisition
  List of lexical items, e.g. a wordlist
    Problem: what is a word?
    Make the list by:
      converting the text to a list of words
      sorting the list of words
      removing duplicates
  Extract corpus properties of the list items, i.e. microstructure elements which can be inferred from corpus relations, by:
      frequency count: absolute or relative (percent)
      rank ordering

Lexical representation
  Macrostructure: overall structure of the dictionary
  Mesostructure: generalisations over microstructures
    definitions of grammar, pronunciation
    cross-references, e.g. semantic relations
    references to the corpus, e.g. concordance examples
  Microstructure: types of lexical information ("DATCATs", data categories), e.g.
    Structural properties (can be extracted from the corpus)
      external context, e.g. collocations
      internal structure, e.g. derived and compound words, idioms
    Interpretative properties
      meaning: semantic, pragmatic
      form: phonetic, orthographic
    Metadata properties
      local housekeeping properties: lexicographer, source, dates of creation and modification
      Note: there are global metadata properties which apply to the whole lexicon, e.g. language, corpus used, publication details
  Note: macrostructure contains mesostructure, which contains microstructure.

Lexical access
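Looking back at the annotation formats and the wordlist procedure in these notes, the two can be connected in a short Python sketch. It is only an illustration under stated assumptions, not the workshop's own tooling: the names `Event`, `infer_intervals`, `wordlist` and `frequency_table` are hypothetical, the tier is assumed to be contiguous (each event starts where the previous one ends), a "word" is simply taken to be a lower-cased label, and the sample labels and times are invented.

```python
from collections import Counter
from typing import NamedTuple


class Event(NamedTuple):
    """An event as a pair of a property and an interval (t_start, t_end)."""
    prop: str
    t_start: int
    t_end: int


def infer_intervals(end_labels, tier_start=0):
    """Turn end-only labels (property, t_end) into full events.

    Assumption: the tier is contiguous, so each event starts where the
    previous one ended; the first event starts at tier_start.
    """
    events = []
    previous_end = tier_start
    for prop, t_end in end_labels:
        events.append(Event(prop, previous_end, t_end))
        previous_end = t_end
    return events


def wordlist(events):
    """Sorted list of distinct words on a tier (duplicates removed)."""
    return sorted({e.prop.lower() for e in events})


def frequency_table(events):
    """Rows of (rank, word, absolute count, relative frequency in percent)."""
    counts = Counter(e.prop.lower() for e in events)
    total = sum(counts.values())
    # Descending count, then alphabetical, so the ordering is deterministic.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [
        (rank, word, count, 100.0 * count / total)
        for rank, (word, count) in enumerate(ordered, start=1)
    ]


# Invented toy tier of (word, t_end) labels; the times are illustrative only.
labels = [("the", 300), ("table", 1030), ("and", 1200),
          ("the", 1450), ("chair", 2100)]
tier = infer_intervals(labels)
for row in frequency_table(tier):
    print(row)
```

The rank here is a plain ordinal position, so words with equal counts still receive different ranks; a lexicon that needs tied ranks would have to handle that separately. The contiguity assumption is exactly the ad hoc inference that the end-only format forces on the user, and it fails wherever the tier contains unlabelled gaps.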