The Accidental Taxonomist. By Heather Hedden, Information Today, 2010, 442 pages, ISBN 978-1573873970
TAXON''OMY, n. [Gr. order, and law.] Classification; a term used by a French author to denote the classification of plants.
Webster's Revised Dictionary (1828 Edition) 
Tax*on"o*my (?), n. [Gr. an arrangement, order + a law.] That division of the natural sciences which treats of the classification of animals and plants; the laws or principles of classification.
Webster's Revised Dictionary (1913 Edition) 
What it means to have an involvement with taxonomy, or to study taxonomy, can be seen by how much the above definitions have changed in the last two centuries. The Accidental Taxonomist provides a utilitarian, thoroughly modern viewpoint on the issue, and presents a grab-bag of introductory information, background knowledge and detailed information of relevance to the area – from concept creation to Z39.50, automated indexing, management software, visualisation and maintenance.
Hedden suggests that the term 'taxonomy' has undergone a significant drift in the last 20 years, and 'is now understood to mean information organisation in general'. For Hedden, taxonomy as a term may refer to controlled vocabularies, navigation labels, categories, and standardised terminologies. They may appear in many different contexts of use: content management systems, Web sites, catalogues, indexes, business knowledge bases, and so on. She acknowledges that individual contexts already have their own specialised terminology for describing this sort of work, 'classical terms' such as 'authority file', 'controlled vocabulary', and 'thesaurus'. She suggests that, despite the variations, to work in these areas, whether as an information architect, librarian, or knowledge manager, is nonetheless a function of the taxonomist.
This usage of the term, she acknowledges, overlaps significantly with 'knowledge organisation system', a term that is used by the NKOS (Networked Knowledge Organisation Systems Working Group). This term, however, is used sparingly, as it 'has not caught on in the business world and is not likely to do so'. It is true that the agent noun that she proposes, knowledge organisation system creator, is clunkier than taxonomist, and is unlikely to win any prizes for brevity.
As an introduction, this tells the reader what he or she can expect to find: a practical approach to knowledge organisation work. But it also suggests that this book is an attempt to build a portrait of a new kind of job description: the taxonomist, a one-size-fits-all definition that transcends barriers of organisation and community discourse to draw a broad functional equivalence between an apparently wide range of roles. As the author says (Introduction, p.xxvii), 'This book also serves the purpose of cross-training existing taxonomists for different kinds of taxonomy projects. If we want to carry the label of taxonomist and move from one job to another, then a broader understanding of the types of work and issues involved is needed.'
This notion – the definition of the term 'taxonomist' as an expert role, transcending the precise domain of application, like that of a chartered engineer or physicist – also accounts for the title of the book. According to a survey conducted by Hedden, many people approach taxonomy development by accident, without having received relevant training such as might be offered by a library science or information science qualification. A 2008 survey of those self-identifying as taxonomists showed that under half had MLS or MLIS degrees (p. xxiv). The 'accidental taxonomists' are those who find themselves in the role. This book, then, is intended as a lifebelt for the drowning taxonomist-to-be and a reference for the practising taxonomist.
The book is laid out in twelve chapters and a series of appendices (the results of the aforementioned survey, a welcome glossary, a slightly skeletal list of recommended reading, and an index). The chapter layout is, broadly speaking, self-explanatory. Chapter 1: What are Taxonomies? briefly describes a wide series of structures, from the authority file to the ontology, offers a little history, and explores the term's use as a Noughties* buzzword. Chapter 2: Who are Taxonomists? is simply a directory of the roles and agencies that define the area. In Chapter 3: Creating Terms and Chapter 4: Creating Relationships, relevant background theory and examples are provided, along with advice and best practice.
Chapter 5 moves on to discuss Software for Taxonomy Creation and Management, beginning with the results of a survey into the types of software used to create and manage taxonomies, which shows that nearly 20% of respondents use general-purpose software, such as Excel, for the purpose. Again, this chapter shows the practical approach of the book, discussing the use of Excel alongside other categories of software – mind mapping, ontology development software, and so forth. It frequently offers advice: 'You may want to consider… you will want…' and the like. The author notes that taxonomy software is a vague notion, and indeed, having accepted such a broad definition of the term, it is. Those looking to create an ontology, for example, might want to pick up a more detailed book, such as The Semantic Web for the Working Ontologist. However, the introduction provided here offers a broad overview – although some areas, notably thesaurus development, receive a disproportionately large amount of attention.
Chapter 6 discusses Taxonomies for Human Indexing: tagging, or keywording, which may not 'necessarily imply using a taxonomy (controlled vocabulary)'. The agent noun for this work, she suggests, is indexer – for her, there are no taggers (a term which, nonetheless, she uses on p.257). This chapter reviews tagging, cataloguing, classification, and indexing, discusses the role of human indexers, and reviews relevant concepts, such as nonpreferred terms. It discusses the management of taxonomies and of folksonomies. The chapter concludes with the statement that 'a folksonomy … is not an alternative to a taxonomy but rather is supplemental. Each has its own place and purpose.'
Chapter 7: Taxonomies for Automated Indexing discusses the broad area of automated indexing, presenting the approach as fast, useful over structured document types and uniform subject areas, available primarily for text content, and relevant for 'a corporate culture that is more comfortable with investing in externally purchased technology than in hiring, training and managing human indexers.' This sentiment is interesting, as automated indexing is not in practice solely the domain of commercial enterprise, but it is perhaps unsurprising that this is perceived to be the case. The chapter passes quickly over information extraction and text analytics, entity extraction, automatic categorisation, and so forth. The detail given is on the level of broad general knowledge. Some of the assertions presented as factual in this chapter are difficult to justify entirely, so the technically inclined reader looking specifically for detailed information in this area may prefer to look elsewhere.
Chapter 8 moves on to discuss Taxonomy Structures: the characteristics of hierarchies, such as depth and breadth, the implementation of facets, and the use of multiple vocabularies and categories.
Chapter 9 discusses Taxonomy Displays by reference to the intended use case for the interface: for the trained taxonomist, the human indexer, the subject area researcher and the end-user searcher. Notably, the chapter explores thesaurus displays, hierarchical taxonomy displays and fielded search displays.
In Chapters 10 and 11, we move on to the final phase of the book, discussing firstly taxonomy planning, design and creation, and then taxonomy implementation and evolution. A set of relevant standards is briefly introduced, and issues in taxonomy creation, maintenance and multilingual use are discussed.
The final chapter returns to the theme of the introduction: Taxonomy Work and the Profession. It reviews the findings of various studies, such as a 2009 survey of a Taxonomies and Controlled Vocabularies special interest group, discussing what taxonomists enjoy about their jobs. The recurring theme is this: the taxonomist, as a role, is difficult to explain, and is not always taken seriously. The chapter describes the likely profile of taxonomy contract work, with some skeletal descriptions of the process of a consultancy or a freelance role in the area. The lack of dedicated education for the would-be taxonomist is lamented, and a number of LIS and IS programmes are named, along with continuing education opportunities such as webinars, independent training and conference workshops, and professional associations.
By this point in the book, however, the term taxonomist had begun to lose its charm. It is a broad term, perhaps overly broad, and in this book is applied across domains in a manner that suggests greater familiarity with some than others. It may well be that creation of a new professional class of generalist in the area of knowledge organisation systems proves to be a useful move, and the concerns of those working in the domain, as described through the survey results presented, certainly suggest that there is difficulty communicating the role and its worth. However, the payload of the book does not make this argument as clearly as perhaps it would like to. In descriptive text alone, one can achieve only so much. The boundaries between the various structures discussed are blurred only in part for the reason identified in the introduction: that is, because they are discipline-specific efforts to achieve similar aims. But underlying that discipline specificity is often a great deal of application-specific complexity, which receives little attention in this book.
The Accidental Taxonomist is a reasonably priced book: $39.50 for an overview of a topic area that is readable, direct and to the point. The cover blurb describes it as a 'guide to the art and science of building information taxonomies'. I would suggest that as a guide to the landscape, it is in most areas a success. Viewing each area from the author's perspective of knowledge and experience is generally useful, and the book is written in a readable, if occasionally overly didactic and under-referenced, manner. Despite the promise of the blurb, though, it is not intended as a guide to the science, in the sense of mathematical or computational detail. On reaching the end of the book, the reader will have gained an awareness of the basic components of each domain, and will be armed with a great deal of useful information. A book that covered the details would by necessity have to be a great deal longer and probably less readable – but might provide more insight into the question of what defines a taxonomist in each domain.
*Editor's note: In that rather flexible manner that English displays, following the logic of Eighties and Nineties, the first decade of the 21st century has quite frequently been termed the 'Noughties'.