Building A Taxonomy
Last Updated Jan 2009
By: Mark Bennett & John Lehman, HighClassify - NIE Enterprise Search: Issue 2 - June, 2003
"Yesterday, I didn't even know how to spell taxonomy, and now I need to make one!" What skills does it take? Can I buy one? Where are taxonomies found? Do we already have one? What's a parent? What's inheritance? What am I doing?
A taxonomy is a "subject map" to an organization’s content. A taxonomy reflects the organization’s purpose or industry, the functions and responsibilities of the persons or groups who need to access the content, and the purposes/reasons for accessing the content.
First and foremost, a taxonomy is an information access tool, and will be implemented in a manual or automated system. Secondarily, a taxonomy is a communications and training device, providing history, expertise and inside information that assist every employee, customer and prospect. Like any other information access tool, a taxonomy needs requirements and purposes before it is developed. The figure below illustrates the five major steps and information flow in the taxonomy building process.
"Taxonomies For Practical Information Management" (April 2003 Newsletter) proposed eight perspectives, or families of taxonomic elements, that apply to all organizations:
- Industry Segments
- Organizational Functions
- Business Relationships
- Business Issues & Events
- Products & Services
- Technologies
- Geography
- Types of Records or Documents
These perspectives are part of defining the domain, or scope and subject area, of a taxonomy. Taxonomies are hierarchical; that is, they start with general or broad subjects, similar to the perspectives above, and descend through increasing levels of detail to very precise classifications, such as a product's proper name or a specific customer. But because users and roles are also important to the taxonomy, other representations, such as an alphabetical or business-function-based view, are often as important as the hierarchy. Taxonomies are not trivial: they regularly contain 4-8 levels of detail and hundreds of individual classifications (subjects), and they require entry points and navigation that meet the needs of the full spectrum of users.
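As a minimal sketch of the structure just described, a taxonomy can be modeled as a tree of subjects, with an alphabetical index derived from the same tree as an alternative entry point. The class name, helper functions, and sample subjects below are hypothetical and chosen only for illustration; they are not taken from any particular taxonomy tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaxonomyNode:
    """One classification (subject) in the hierarchy. Hypothetical name."""
    name: str
    children: List["TaxonomyNode"] = field(default_factory=list)

    def add(self, name: str) -> "TaxonomyNode":
        """Attach a more specific subject beneath this one and return it."""
        child = TaxonomyNode(name)
        self.children.append(child)
        return child


def paths(node, prefix=()):
    """Yield the path from the root to every node, broadest subject first."""
    current = prefix + (node.name,)
    yield current
    for child in node.children:
        yield from paths(child, current)


def alphabetical_index(root):
    """Alternative view: each subject name mapped to its place(s) in the tree."""
    index = {}
    for path in paths(root):
        index.setdefault(path[-1], []).append(" > ".join(path))
    return dict(sorted(index.items(), key=lambda item: item[0].lower()))


def depth(node):
    """Number of levels from this node down to its deepest descendant."""
    return 1 + max((depth(child) for child in node.children), default=0)


# Hypothetical sample data: a broad perspective narrowing to a product name.
root = TaxonomyNode("Products & Services")
software = root.add("Software")
search = software.add("Search Engines")
search.add("Acme Search 2.0")
root.add("Consulting")

for subject, locations in alphabetical_index(root).items():
    print(subject, "->", locations)
print("levels of detail:", depth(root))
```

Deriving the alphabetical view from the tree, rather than maintaining it separately, keeps the two representations from drifting apart as the taxonomy grows.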
Locating taxonomies or taxonomy guidance for the subject and scope of your taxonomy is harder than it ought to be. After all, many organizations are in the same or similar businesses, with limited types of organizational structures and limited types of data or content. But business or organizational taxonomies lag far behind their scientific or worldview counterparts, such as the Dewey Decimal System for libraries or Medical Subject Headings for medical literature. So you should expect that an off-the-shelf taxonomy for your business and domain does not exist today.
As a taxonomy developer, you need to look to these sources of assistance:
- Your colleagues, to tell you the ways they need to get access to information
- Subject area expertise, both inside the organization and out
- Taxonomies and classifications that affect your domain, such as the United Nations Standard Products and Services Code (UNSPSC), or those of industry-specific organizations such as ACORD for insurance
- Taxonomy development specialists, such as HighClassify Inc.
- Your content and data
- Existing taxonomies that are potentially related to your need – your company’s website map, for example
Together, these sources will yield a set of concepts, or subjects, which should form a comprehensive, though overlapping, term set describing your domain and scope. Concepts drawn from sources outside your organization need to be adapted to fit your real world of users and content.
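One way to picture this comprehensive but overlapping term set is as a merge of the concept lists gathered from each source, keeping track of which sources proposed each concept. The sketch below is illustrative only: the source names, the sample terms, and the simple case-and-whitespace normalization are all assumptions, and real term matching would usually need more than this.

```python
from collections import defaultdict


def normalize(term):
    """Fold case and collapse whitespace so trivial variants match."""
    return " ".join(term.lower().split())


def merge_term_sets(sources):
    """Map each normalized concept to the set of sources that proposed it."""
    merged = defaultdict(set)
    for source, terms in sources.items():
        for term in terms:
            merged[normalize(term)].add(source)
    return dict(merged)


def overlapping(merged):
    """Concepts proposed by more than one source."""
    return sorted(term for term, found_in in merged.items() if len(found_in) > 1)


# Hypothetical concept lists gathered from three of the sources listed above.
sources = {
    "colleagues": ["Claims", "Policy Renewal"],
    "website map": ["claims ", "Products"],
    "industry classification": ["Policy  renewal", "Underwriting"],
}

merged = merge_term_sets(sources)
print(len(merged), "distinct concepts")
print("overlapping:", overlapping(merged))
```

Concepts that appear in several sources are natural candidates for the core of the taxonomy, while concepts found only in an outside source are the ones most in need of checking against your own users and content.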
Once the concepts are listed, the next step is to organize them into a taxonomy, using techniques such as:
- Taxonomy perspectives (top-down)
- Shared characteristics of concepts, particularly those from users or content (bottom-up)
- High-level unification (top-down)
These processes will expose gaps, which should be filled in. Establishing the hierarchy, along with the alternative entry points and views, concludes the organization step.
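The organization step can be sketched roughly as follows: concepts are placed under the eight perspectives (top-down), grouped by characteristics they share (bottom-up), and any perspective left empty is flagged as a gap. The function names, the sample concepts, and the idea of tagging each concept with a perspective and a few characteristics are assumptions made for illustration, not a prescribed method.

```python
from collections import defaultdict

# The eight perspectives listed earlier in this article.
PERSPECTIVES = [
    "Industry Segments",
    "Organizational Functions",
    "Business Relationships",
    "Business Issues & Events",
    "Products & Services",
    "Technologies",
    "Geography",
    "Types of Records or Documents",
]


def place_top_down(concepts):
    """File each concept under its perspective; unknown perspectives go to 'unplaced'."""
    placed = {perspective: [] for perspective in PERSPECTIVES}
    unplaced = []
    for name, perspective, _traits in concepts:
        placed.get(perspective, unplaced).append(name)
    return placed, unplaced


def group_bottom_up(concepts):
    """Group concepts that share a characteristic; keep groups of two or more."""
    by_trait = defaultdict(list)
    for name, _perspective, traits in concepts:
        for trait in traits:
            by_trait[trait].append(name)
    return {trait: names for trait, names in by_trait.items() if len(names) > 1}


def find_gaps(placed):
    """Perspectives that no concept has been filed under yet."""
    return [perspective for perspective, names in placed.items() if not names]


# Hypothetical concepts: (name, perspective, shared characteristics).
concepts = [
    ("Auto Policy", "Products & Services", ["policy"]),
    ("Home Policy", "Products & Services", ["policy"]),
    ("Claim Form", "Types of Records or Documents", ["claims"]),
    ("Claims Adjusting", "Organizational Functions", ["claims"]),
    ("Northeast Region", "Geography", ["region"]),
]

placed, unplaced = place_top_down(concepts)
print("gaps:", find_gaps(placed))
print("bottom-up groups:", group_bottom_up(concepts))
```

In this sample the bottom-up grouping links a document type and an organizational function through their shared "claims" characteristic, a relationship the top-down placement alone would not reveal.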
Validation is the (final) step before implementation. The taxonomy concepts are checked at all levels against users' needs, internal consistency, terminology fit, and, once again, the outside world.
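One part of validation, the comparison against users' needs, can be sketched as a simple coverage check between the taxonomy's terms and the terms users actually ask for. This is only an illustration: the function names and sample terms are hypothetical, exact matching after normalization is an assumed simplification, and the consistency and outside-world checks mentioned above are not covered here.

```python
def normalize(term):
    """Fold case and collapse whitespace so trivial variants match."""
    return " ".join(term.lower().split())


def validate_against_users(taxonomy_terms, user_terms):
    """Compare taxonomy terms with the terms users ask for."""
    known = {normalize(term) for term in taxonomy_terms}
    needed = {normalize(term) for term in user_terms}
    return {
        # Terms users want that the taxonomy does not yet offer.
        "unmet": sorted(needed - known),
        # Taxonomy terms that no user asked for; candidates for review.
        "unused": sorted(known - needed),
        # Share of user terms the taxonomy covers.
        "coverage": len(needed & known) / len(needed) if needed else 1.0,
    }


# Hypothetical sample data.
taxonomy_terms = ["Claims", "Underwriting", "Policy Renewal"]
user_terms = ["claims", "policy renewal", "Billing", "Claims"]

report = validate_against_users(taxonomy_terms, user_terms)
print(report)
```

A low coverage figure or a long "unmet" list suggests returning to the earlier steps, which is consistent with the iterative nature of the process.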
These steps are not accomplished in isolation, and each step will prompt some revisiting of earlier steps. An underlying check….
What is a Good Taxonomy?
One That Satisfies User Information Needs.
[John Lehman is Co-Founder and President of HighClassify Inc., a provider of taxonomy and content classification services and solutions. He previously founded Verity, Inc. and Sageware, Inc.]