Computers & Texts No. 12
July 1996
Mirrored from Computers & Texts for temporary offline use. DG, 2.2.1999

Review: Chaucer, Johnson, and Shakespeare on CD-ROM

Michael Fraser
CTI

...

A Dictionary of the English Language

I knew that the work in which I engaged is generally considered as drudgery for the blind, as the proper toil of artless industry; a task that requires neither the light of learning, nor the activity of genius, but may be successfully performed without any higher quality than that of bearing burdens with dull patience, and beating the track of the alphabet with sluggish resolution.
So Samuel Johnson commenced his letter to the Earl of Chesterfield on The Plan of an English Dictionary (1747). Some (who know no better) might be tempted to suggest that, nearly 250 years later, this description might be better applied to those who spend their time encoding electronic texts so that the rest might easily navigate and search their contents. But, of course, as the Wife of Bath's Prologue amply demonstrates, this is most certainly not the case. The encoding of an electronic edition so that the structure is made apparent, the content easily searchable, and the whole attractively presented is not a task for the light of learning, even if a considerable portion is fairly harmless drudgery.

Two editions of Samuel Johnson's Dictionary of the English Language have been published on CD-ROM by Cambridge University Press: the first, produced by Johnson in 1755, and the fourth, revised and published by Johnson in 1773. Entries from both editions can be viewed simultaneously on the screen. The electronic edition, like the Wife of Bath's Prologue, is encoded in TEI-SGML and presented with DynaText. This gives the CD-ROM a similar appearance to the Wife of Bath's Prologue, and indeed it is only necessary to have installed one DynaText reader, together with the specific fonts, in order to view any one of the three CD-ROMs reviewed here.

The content of Johnson's dictionary falls into two parts: the transcriptions and the digitized images of each page of each edition. Although the dictionary can be navigated through the transcription, moving, for example, from the letter A to ABE... to Abecedary (Belonging to the Alphabet), it is more useful to locate words using the search forms provided.

The value of a work must be estimated by its use; it is not enough that a dictionary delights the critick, unless, at the same time, it instructs the learner; as it is to little purpose that an engine amuses the philosopher by the subtilty of its mechanism, if it requires so much knowledge in its application as to be of no advantage to the common workman.
The subtilty of the underlying encoding system might well amuse the inclined philosopher. However, the common academic is not required to understand more than the basics in order to make good use of it. Readers who have been duly impressed by the search capabilities of the Oxford English Dictionary on CD-ROM will be pleased to know that similar searches can be carried out on the OED's illustrious predecessor. Such searches are only possible because the editor, Anne McDermott, included the encoding of many of the elements identified by the TEI's Guidelines for print dictionaries (headword, part of speech, etymology, usage, sense, definition etc.).

[Screen Shot]

Johnson's Dictionary: Entry, transcription, and digitized image from the first edition.

The forms interface gives the option of searching the complete dictionary for a keyword or limiting the search to within the headword, definition, quotation, first or fourth edition, quoted author or title. If that is not sufficient, more complex searches can be entered using the underlying markup. This is particularly useful for proximity or Boolean-type searching, but also for giving access to the additional features encoded in the dictionary.

Barbarous, or impure, words and expressions, may be branded with some note of infamy, as they are carefully to be eradicated wherever they are found; and they occur too frequently, even in the best writers.
One of the pleasures afforded this common workman in the review of Johnson's Dictionary was attempting to reveal the voice of Johnson beneath the dull (as, to make dictionaries is dull work) defining of everyday words. Often cited, before even inspecting the electronic edition, are Johnson's definitions of lexicographer (a harmless drudge), oats (a grain, which in England is generally given to horses, but in Scotland supports the people), or to worm (to deprive a dog of something, nobody knows what, under his tongue, which is said to prevent him, nobody knows why, from running mad).

One of Johnson's primary concerns in compiling his dictionary was for the purity of the English language. A substantial number of 'barbarous' words are to be found in both editions of the dictionary. They are there not, one suspects, because his dictionary was intended to be a snapshot of eighteenth-century English usage, but rather because such words, being offensive to Johnson's ideal of purity through etymology, were included to indicate to the common workman precisely which words he should not be using. In total 49 words are described by Johnson as 'barbarous'. A search specified in the form '<entryfree> cont (<note> with type=usg cont barbarous) and (<author> cont shakespeare)' will find, amongst others, those occurrences where Shakespeare himself employed such words (vastidity, worser).
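
For readers unfamiliar with structured queries, the logic of that search can be imitated in a few lines of Python. This is only a sketch of what the query expresses, not of how DynaText evaluates it. The element names follow the query quoted above, but the sample markup, the '<form>' element, and the non-matching entry are invented for illustration and should not be taken as the encoding used on the CD-ROM.

```python
import xml.etree.ElementTree as ET

# Invented sample markup; only the headwords 'vastidity' and 'worser'
# come from the review. The real encoding on the CD-ROM may differ.
SAMPLE = """
<dictionary>
  <entryfree><form>vastidity</form>
    <note type="usg">A barbarous word.</note>
    <author>Shakespeare</author></entryfree>
  <entryfree><form>exampleword</form>
    <note type="usg">A low word.</note>
    <author>Shakespeare</author></entryfree>
  <entryfree><form>worser</form>
    <note type="usg">A barbarous word.</note>
    <author>Shakespeare</author></entryfree>
</dictionary>
"""


def contains(elem, text):
    """Case-insensitive test: does the element's text content contain `text`?"""
    return text.lower() in "".join(elem.itertext()).lower()


def barbarous_by_shakespeare(root):
    """Return headwords of entries whose usage note contains 'barbarous'
    and which have an author element containing 'shakespeare'."""
    hits = []
    for entry in root.iter("entryfree"):
        usage = any(
            contains(note, "barbarous")
            for note in entry.iter("note")
            if note.get("type") == "usg"
        )
        author = any(contains(a, "shakespeare") for a in entry.iter("author"))
        if usage and author:
            hits.append(entry.findtext("form"))
    return hits


if __name__ == "__main__":
    print(barbarous_by_shakespeare(ET.fromstring(SAMPLE)))
```

The point of the sketch is that both conditions are tested within a single entry, which is what distinguishes a markup-aware search from a plain keyword search for 'barbarous' and 'shakespeare' anywhere in the text.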

Far more common are instances of 'low' (258) or 'cant' (154) words. Cant is defined by Johnson as 'a corrupt dialect used by beggars and vagabonds', 'barbarous jargon' or 'a whining pretension to goodness, in formal and affected terms'. Examples of cant include 'black-guard' (a cant word amongst the vulgar), 'confounded' (hateful; detestable; odious as in 'He was a most confounded Tory' -Swift), 'mundungus' (stinking tobacco) and 'slim' (slender, thin of shape; a cant word as it seems, and therefore not to be used). The latter is an example of Johnson's attempt to educate by proscription. Johnson's aim to purge the English language of cant or spurious words peaks in the few instances where he presents the headword, then the definition, followed by the comment that 'in this sense it is not used'. One can perhaps understand this where 'not used' is an addendum to a word in the fourth edition previously defined without comment in the first edition (Calmy: calm; peaceful or Preach: noun, a discourse, religious oration). This is not the case with the first definition given for 'snuff' (Snot. In this sense it is not used), which appears in both the first and the fourth editions. On finding this one immediately desires to consult the Oxford English Dictionary, which has duly taken note of Johnson's claim and not included 'snot' among the definitions given for 'snuff'. Unfortunately, it is nearly impossible to search for all occurrences of 'not used' in the dictionary because 'not' has been designated a stopword and is thus ignored in all searches. As one might expect in a work of this nature, the form 'used' occurs with great frequency. One thus tends to stumble on Johnson's proclamations quite by accident. The work of purifying the English language, however, continues even if, on occasions, literature can impede its progress (Primal: First. A word not in use, but very commodious for poetry).
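
The stopword problem described above is easy to reproduce. The following sketch assumes a simple keyword search in which stopwords are stripped from both the text and the query; the stopword list and the function names are my own, and the search engine on the CD-ROM may well behave differently in detail. The three definitions are those quoted in this review.

```python
import re

# Assumed stopword list; the review only tells us that 'not' is on it.
STOPWORDS = {"not", "the", "in", "it", "is", "this", "a", "as", "and", "to", "be", "by"}

# Definitions as quoted in the review.
DEFINITIONS = {
    "snuff": "Snot. In this sense it is not used",
    "slim": "slender, thin of shape; a cant word as it seems, and therefore not to be used",
    "cant": "a corrupt dialect used by beggars and vagabonds",
}


def index_terms(text):
    """Lower-case the text, split it into words and drop the stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def search(query, documents):
    """Return the headwords whose definition contains every query term
    that survives stopword removal."""
    terms = index_terms(query)
    return [
        head
        for head, text in documents.items()
        if all(term in index_terms(text) for term in terms)
    ]


if __name__ == "__main__":
    print(search("not used", DEFINITIONS))
    print(search("used", DEFINITIONS))
```

Under these assumptions a query for 'not used' is reduced to 'used', so it also retrieves the definition of 'cant', in which nothing is proscribed at all.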

So that in search of the progenitors of our speech, we may wander from the tropick to the frozen zone, and find some in the valleys of Palestine, and some upon the rocks of Norway.
Picking out the Norwegian, the Indian, the Icelandic, the Irish and the Saxon, the Greek and the Hebrew words is, at first sight, easy enough. Searching with the '<etym>' tag containing some specified language shows 4131 words with some reference to Saxon etymology, 9763 from Latin, 5655 from French, and only 433 with reference to German. However, one cannot be sure of the accuracy of this method of searching. Greek words serve to demonstrate the point. One can search for '<etym> cont Greek' and retrieve a paltry 42 words. A browse through the dictionary shows that Johnson was not consistent in his specification of etymology. He also uses the abbreviation Gr. or, on most occasions, leaves it to the reader to recognise Greek on sight. The advantage, however, with Greek words in the electronic edition of the Dictionary is that a separate character set is required. Thus, there is an extra tag within '<etym>' which specifies Greek (<lang="gk">). Searching for all instances of Greek within the etymology retrieves a far more realistic 4307 words. It is unfortunate that a similar tag was not used for all languages, thereby regularizing the etymological entries. One could then ensure that a search for 'latin' would also pick up lat. (a total of 19035 entries) and instances where neither form is used. I was rather disappointed to discover that '<lang>' had not been used for Hebrew words. I assume this is because no separate character set was defined and the publishers decided instead to insert graphic images of each Hebrew word.
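
The difference between searching for a literal label and searching a regularized etymology can be shown with a small sketch. Everything here is an assumption made for illustration: the sample etymology strings are invented, the patterns cover only the forms mentioned above (Greek, Gr., Latin, lat.), and detection of Greek by script merely stands in for the language tag described in this review.

```python
import re
from collections import Counter

# Invented etymology strings standing in for the content of '<etym>'.
SAMPLES = [
    "from the Greek",
    "Gr. abyssos",
    "ἄβυσσος",          # Greek given in Greek script with no label at all
    "lat. abecedarius",
    "Latin",
    "French",
]

# Full names and abbreviations treated as the same language.
LABELS = {
    "Greek": re.compile(r"\b(?:greek|gr\.)", re.IGNORECASE),
    "Latin": re.compile(r"\b(?:latin|lat\.)", re.IGNORECASE),
}


def has_greek_script(text):
    """True if any character falls in the Greek or Greek Extended blocks."""
    return any("\u0370" <= ch <= "\u03ff" or "\u1f00" <= ch <= "\u1fff" for ch in text)


def languages(etym):
    """Return the set of languages recognised in one etymology string."""
    found = {name for name, pattern in LABELS.items() if pattern.search(etym)}
    if has_greek_script(etym):
        found.add("Greek")
    return found


def naive_count(etyms, word):
    """Count strings containing the literal word, as a plain 'cont' search would."""
    return sum(word.lower() in e.lower() for e in etyms)


def normalized_counts(etyms):
    """Count strings per language after regularizing labels and script."""
    return Counter(lang for e in etyms for lang in languages(e))


if __name__ == "__main__":
    print(naive_count(SAMPLES, "greek"))
    print(normalized_counts(SAMPLES))
```

On this toy data the literal search finds one Greek etymology where the regularized count finds three, which is the same kind of gap, on a much smaller scale, as that between 42 and 4307.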

Finally, one word was recognised by Johnson as predating the Flood and the fall of the Tower of Babel. That word was 'sack', to be 'found in all languages, and it is therefore conceived to be antediluvian'. The Oxford English Dictionary very nearly agrees with Johnson on this point, but confines itself to referring to the word as having a prehistoric type.

This, my Lord, is my idea of an English dictionary; a dictionary by which the pronunciation of our language may be fixed, and its attainment facilitated; by which its purity may be preserved, its use ascertained, and its duration lengthened.
An idea hardly fulfilled by Johnson's Dictionary. What was cant then is elegant today and Johnson's refinements are today's slang. The English language evolves as it ever did. The ease and variety of ways in which an eighteenth-century representation of English can be consulted on a twentieth-century spinning mechanical disk fulfils for the dictionary many of the aims that Johnson had hoped his dictionary would achieve for the language of England. He would surely have approved of this, and of its future provision on something reticulated or decussated, at equal distances with interstices between the intersections.

...



Computers & Texts 12 (1996), 21. Not to be republished in any form without the author's permission.

HTML Author: Michael Fraser (mike.fraser@oucs.ox.ac.uk)
Document Created: 22 August 1996

The URL of this document is http://info.ox.ac.uk/ctitext/publish/comtxt/ct12/fraser.html