HyprLex: VM Lexicographic Database
HyprLex: an experimental interactive lexicon tool
U Bielefeld Lexicon Group, August 1995
Notes on the PhonSim generator
D. Gibbon, U Bielefeld
30 Aug 1995
1. The phonological similarity of words
- Words as phonological units are defined in terms of unique combinations of characteristic properties, such as distinctive features and phonetic similarity, segmental sequence structure, syllable structure, and accentual properties.
- Classes of phonologically similar items are defined in terms of smaller subsuming combinations of properties, i.e. generalisations, which may be understood operationally as the relaxation of distinctive constraints on the phonological structure of words. A similarity class based on phonetic similarity is known as a natural class.
- Phonological similarity is model dependent; one given model (e.g. phonemics) may define different properties from others (e.g. metrical or autosegmental phonologies).
- Phonological similarity is also combinatorially explosive: for an inventory of n properties, generally grouped into binary contrasts (i.e. two-valued, traditionally Boolean, attributes), there are 2^n possible combinations of relaxations of these properties. Not all are phonetically and cognitively plausible, however, and some are dependent (via phonological redundancy rules) on others.
- The PhonSim generator permits experimentation with arbitrary combinations of the properties which are most characteristic of confusions in speech (on which there is a large literature), and selection of the most appropriate empirical grouping for a given task. The SELECT mechanism also has two pre-defined relaxation configurations; the combination labelled "Best" is on phonological grounds perhaps the most plausible combination.
- The PhonSim generator defines a partitioning of the vocabulary. It may be necessary to introduce overlapping criteria, however; the simplest solution for this is to define separate partitions based on the required criteria, and to form the set union of the two (or more) distinct sets of similarity classes.
- A classical, and rather different, definition of phonetic similarity (or its inverse, "distance") is based on the phoneme model and a string comparison function. Each position in one phoneme string is compared with each position in the other in order to calculate the minimum number of insertions, deletions and substitutions required to match the strings. This number (the absolute distance) may be divided by the length of the string being measured (the relative distance). Further, the phonemes may be weighted to capture the notion of greater or lesser similarity. The approach defines degrees of phonemic similarity pairwise between words, but it is not clear how this model relates to cognitive models of perception, or to more expressive hierarchical feature-based phonological models.
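The string comparison measure described in the last item can be sketched as follows. This is a minimal illustration of the standard edit-distance recurrence, not part of PhonSim; the function names, the optional `sub_cost` parameter, the `voicing_weight` table and the SAMPA-like transcriptions are all assumptions made for the example.

```python
def edit_distance(a, b, sub_cost=None):
    """Absolute distance: minimal total cost of the insertions, deletions
    and substitutions needed to turn phoneme sequence a into b."""
    prev = list(range(len(b) + 1))          # distances from the empty prefix of a
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            if sub_cost is not None:
                cost = sub_cost(pa, pb)     # weighted substitution (assumed interface)
            else:
                cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,            # delete pa
                            curr[j - 1] + 1,        # insert pb
                            prev[j - 1] + cost))    # substitute or match
        prev = curr
    return prev[-1]


def relative_distance(a, b, sub_cost=None):
    """Absolute distance divided by the length of the string being measured (a)."""
    if not a:
        raise ValueError("relative distance is undefined for an empty string")
    return edit_distance(a, b, sub_cost) / len(a)


def voicing_weight(x, y):
    """Hypothetical weighting: a pure voicing difference costs half a substitution."""
    pairs = {frozenset("bp"), frozenset("dt"), frozenset("gk")}
    if x == y:
        return 0.0
    return 0.5 if frozenset((x, y)) in pairs else 1.0


# Hypothetical SAMPA-like transcriptions, one list element per phoneme.
bein, pein = ["b", "aI", "n"], ["p", "aI", "n"]
band, bank = ["b", "a", "n", "t"], ["b", "a", "N", "k"]

print(edit_distance(bein, pein))                 # one substitution
print(relative_distance(band, bank))             # two substitutions over four phonemes
print(edit_distance(bein, pein, voicing_weight)) # weighted variant
```

Representing each word as a list of phoneme symbols rather than a character string keeps multi-character symbols such as `aI` as single units in the comparison.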
2. The attributes
Morphoprosodic relaxation:
- Inflexions deleted: Morphological constraints relate indirectly to pronunciation.
- Accent ignored: Accent may be lexically unpredictable.
- Word to syll boundaries: Word/syllable boundary distinction may be blurred.
- Syll boundaries ignored: Syllable boundaries may be blurred.
- Schwa sylls indistinct: Weak syllables are not easily recognised.
- Weak sylls deleted: Same, but more so.
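Some of the morphoprosodic relaxations can be pictured as simple rewriting operations on a transcription. The sketch below is only an illustration under stated assumptions: the notation (`'` for accent, `.` for syllable boundary, `#` for word boundary, `@` for schwa), the function name `relax_prosody` and its flags are hypothetical, and inflexion deletion is omitted because it needs morphological information.

```python
def relax_prosody(transcription, accent=False, weak_sylls_deleted=False,
                  word_to_syll=False, syll_boundaries=False):
    """Apply the selected relaxations to one transcription string."""
    t = transcription
    if accent:
        # Accent ignored: drop the accent marks.
        t = t.replace("'", "")
    if weak_sylls_deleted:
        # Weak sylls deleted: drop every syllable that contains a schwa.
        words = []
        for word in t.split("#"):
            strong = [syll for syll in word.split(".") if "@" not in syll]
            words.append(".".join(strong))
        t = "#".join(words)
    if word_to_syll:
        # Word to syll boundaries: treat word boundaries as syllable boundaries.
        t = t.replace("#", ".")
    if syll_boundaries:
        # Syll boundaries ignored: drop the syllable boundaries.
        t = t.replace(".", "")
    return t


phrase = "'gu:.t@n#'mOr.g@n"   # hypothetical transcription of "guten Morgen"
print(relax_prosody(phrase, accent=True))
print(relax_prosody(phrase, accent=True, word_to_syll=True, syll_boundaries=True))
print(relax_prosody(phrase, weak_sylls_deleted=True))
```

Two words fall into the same similarity class under a given configuration when their relaxed transcriptions are identical.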
Consonant relaxation:
- No obstruent voicing: In various positions, the voice contrast may be hard to detect.
- Sibilant distinction ignored: It may be advisable to keep this distinction.
- Nasals/laterals merged: These are relatively easily confused.
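The way a set of consonant relaxations induces a partitioning of the vocabulary can be sketched as follows. The mapping tables, the five-word lexicon and its SAMPA-like transcriptions are hypothetical and are not the tables used by PhonSim; in particular, merging all nasals and /l/ into one class is an assumption about what "Nasals/laterals merged" covers.

```python
from collections import defaultdict

# Hypothetical relaxation tables: each maps a phoneme to its class representative.
DEVOICE = {"b": "p", "d": "t", "g": "k", "v": "f", "z": "s", "Z": "S"}
SIBILANT = {"S": "s", "Z": "z"}
NASAL_LATERAL = {"m": "n", "N": "n", "l": "n"}

# Hypothetical lexicon: orthographic word -> phoneme sequence.
LEXICON = {
    "Bein":   ["b", "aI", "n"],
    "Pein":   ["p", "aI", "n"],
    "Beil":   ["b", "aI", "l"],
    "sein":   ["z", "aI", "n"],
    "Schein": ["S", "aI", "n"],
}


def relax(phonemes, relaxations):
    """Rewrite each phoneme through every selected table, in order."""
    out = []
    for p in phonemes:
        for table in relaxations:
            p = table.get(p, p)
        out.append(p)
    return tuple(out)


def partition(lexicon, relaxations):
    """Group words whose relaxed transcriptions coincide."""
    classes = defaultdict(list)
    for word, phonemes in lexicon.items():
        classes[relax(phonemes, relaxations)].append(word)
    return sorted(sorted(words) for words in classes.values())


for config in ([], [DEVOICE], [DEVOICE, SIBILANT],
               [DEVOICE, SIBILANT, NASAL_LATERAL]):
    print(len(config), "relaxation(s):", partition(LEXICON, config))
```

Because every word receives exactly one relaxed key, the classes are disjoint and cover the whole lexicon, which is what makes the result a partition. Overlapping classes would have to be obtained by combining the outputs of two or more such runs.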
Vowel relaxation:
- Vowel prosodies
  - No syllable-initial prevocalic glottal stop: German speakers vary greatly in this respect.
  - Vowel length ignored: Vowel lengths can be hard to distinguish.
- Vowel qualities
  - Front vowels indistinct: Front vowels merge in "fast speech".
  - Round vowels indistinct: Round (including back) vowels also merge in "fast speech".
  - /a/ indistinct: Low vowels may also merge.
  - Note: When the last three constraints are all relaxed, no vowel distinctions are made.
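The interaction of the three vowel quality relaxations, and the 2^n growth in configurations, can be illustrated with a small enumeration. This is a sketch under assumptions: the vowel inventory (SAMPA-like monophthongs, without schwa or diphthongs), the assignment of vowels to the three sets, and the reading of "indistinct" as "rewritten to a generic vowel `V`" are all choices made for the example, not documented PhonSim behaviour.

```python
from itertools import product

# Hypothetical vowel sets; the front rounded vowels belong to both FRONT and ROUND.
FRONT = {"i:", "I", "e:", "E", "y:", "Y", "2:", "9"}
ROUND = {"y:", "Y", "2:", "9", "u:", "U", "o:", "O"}
LOW = {"a", "a:"}
VOWELS = FRONT | ROUND | LOW


def relax_vowel(v, front=False, round_=False, low=False):
    """Map a vowel to the generic symbol 'V' if a selected relaxation covers it."""
    if (front and v in FRONT) or (round_ and v in ROUND) or (low and v in LOW):
        return "V"
    return v


def count_classes(front, round_, low):
    """Number of vowel classes that remain distinct under a configuration."""
    return len({relax_vowel(v, front, round_, low) for v in VOWELS})


# Three binary switches give 2**3 = 8 configurations.
for front, round_, low in product([False, True], repeat=3):
    print(f"front={front!s:5} round={round_!s:5} low={low!s:5} "
          f"-> {count_classes(front, round_, low)} classes")
```

With no relaxation all 14 vowels of this assumed inventory remain distinct; with all three relaxed a single class remains, in line with the note above. If each of the fourteen attributes listed in this section is treated as an independent switch, the same enumeration would run over 2^14 = 16384 configurations.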
Caveat: This is an experimental service which uses HTML Level 3 and Netscape features which may not be appreciated by some browser software. Documents are designed in "hypr", a default inheritance approach to modelling hypertext as a semantic network. The "hypr" compiler is written in DATR.
Dafydd Gibbon - 27.08.95