Lots of people talk about taxonomy in content management, but its meaning and importance are often confused, so I’m going to try to provide a more concrete definition.
Taxonomy is about classification. It describes ways of naming, arranging and ordering things within a system. Those things may be books in a library, or plants and animals in biology.
Taxonomies are usually hierarchical and based on restricted terms. In biology, for example, there are a limited number of kingdoms, phyla, and classes. In the Dewey Decimal System, these restricted terms are based on numbers: Phonology (414) is an element of Language (400). These relationships are often described as ontological.
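To make the idea of a hierarchy of restricted terms concrete, here is a minimal Python sketch using only the two Dewey classes mentioned above. The dictionary layout and the function name ancestors are my own assumptions for illustration, not part of any library catalogue software.

```python
# Illustrative slice of a hierarchical taxonomy with restricted terms.
# Only the two Dewey classes named in the text are included.
LABELS = {"400": "Language", "414": "Phonology"}

# Each term points at its parent term; None marks the top of the hierarchy.
PARENT = {"400": None, "414": "400"}


def ancestors(code):
    """Return the labels from the top of the hierarchy down to `code`."""
    if code not in PARENT:
        # A restricted vocabulary rejects terms it does not define.
        raise KeyError(f"{code!r} is not a term in this taxonomy")
    chain = []
    while code is not None:
        chain.append(LABELS[code])
        code = PARENT[code]
    return list(reversed(chain))


print(" > ".join(ancestors("414")))  # Language > Phonology
```

Asking for a code that is not in the vocabulary raises an error rather than quietly creating a new term, which is the practical difference between a restricted taxonomy and free tagging.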

Some things can belong in more than one place in certain classification systems. For example, a dish on a menu may be available as both a starter and a main course. These ways of looking at the item are often termed “facets”; so a Greek salad is both a type of dish (salad) and a course (starter).
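Facets can be sketched in a few lines of Python. The menu below is hypothetical (only the Greek salad comes from the example above), and the facet names dish_type and course are assumptions chosen for illustration.

```python
# Hypothetical menu: every dish holds a set of values for each facet,
# so a single dish can sit in more than one place.
MENU = [
    {"name": "Greek salad", "dish_type": {"salad"}, "course": {"starter", "main"}},
    {"name": "Tomato soup", "dish_type": {"soup"}, "course": {"starter"}},
    {"name": "Caesar salad", "dish_type": {"salad"}, "course": {"main"}},
]


def find(menu, **facets):
    """Return the names of dishes that match every requested facet value."""
    return [
        dish["name"]
        for dish in menu
        if all(value in dish[facet] for facet, value in facets.items())
    ]


print(find(MENU, dish_type="salad"))                    # ['Greek salad', 'Caesar salad']
print(find(MENU, dish_type="salad", course="starter"))  # ['Greek salad']
```

Each facet narrows the result independently, so the same dish can be reached by asking for salads, for starters, or for both at once.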

The “thing” in question may also have one or more synonyms which form part of a taxonomy. There is some overlap here with the functions of a thesaurus.
A recent innovation brought to the fore by websites like del.icio.us and Flickr is the concept of “folksonomy”. This allows participants to “tag” any page, image, or document according to their own vocabulary. As more people tag pages, these become related by shared terms, building up the classification system. This taxonomy is unrestricted: what one person calls design, another may call art, creativity, or even Photoshop.
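As a rough sketch of how shared tags relate items to one another, the Python below keeps two indexes: tags per item and items per tag. The class name Folksonomy, the file names, and the choice to lower-case tags are all assumptions for illustration. This is not how del.icio.us or Flickr implement tagging.

```python
from collections import defaultdict


class Folksonomy:
    """Free tagging: any tag is accepted, and items are linked by shared tags."""

    def __init__(self):
        self.items_by_tag = defaultdict(set)
        self.tags_by_item = defaultdict(set)

    def tag(self, item, *tags):
        for raw in tags:
            tag = raw.strip().lower()  # assumption: tags are case-insensitive
            self.items_by_tag[tag].add(item)
            self.tags_by_item[item].add(tag)

    def related(self, item):
        """Map every other item to the tags it shares with `item`."""
        shared = {}
        for tag in self.tags_by_item.get(item, ()):
            for other in self.items_by_tag[tag]:
                if other != item:
                    shared.setdefault(other, set()).add(tag)
        return shared


tags = Folksonomy()
tags.tag("poster.png", "design", "Photoshop")  # one person's vocabulary
tags.tag("logo.png", "art", "photoshop")       # another person's vocabulary
tags.tag("essay.html", "creativity")

print(tags.related("poster.png"))  # {'logo.png': {'photoshop'}}
```

Nothing stops two people describing the same picture as design and art, so items are only connected where vocabularies happen to overlap.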
How are taxonomy and content management related?
Content management typically addresses a number of key areas:
- Production: providing the tools to enable people to create content.
- Authorisation: ensuring that only relevant people are able to view and amend content.
- Workflow: delivering content to people within the system once it passes through certain stage gates.
- Storage and retrieval: providing a mechanism to store, find and re-use content held in the system.
Content management systems (CMS) can make use of taxonomy for authorisation and workflow, but they are dependent on taxonomy for storage and retrieval. Each page, document, or other asset held in the CMS is stored according to a predefined classification method.
There are different metaphors for this method. The taxonomy may be represented as types of document (e.g. contract), as departments with their own document silos (e.g. legal, marketing, human resources), as folders that represent some other business function (e.g. project start-up, initiation, execution, closure), or flagged with a value from a predefined list. Most CMS will use a combination of these classification systems so that content can be retrieved more easily.
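The combination of classification systems described above might be modelled roughly as follows. This is a Python sketch under my own assumptions: the class names, the asset titles, and the "policy" document type are hypothetical, while the departments and project stages are taken from the examples in the text.

```python
from dataclasses import dataclass
from enum import Enum


# Each Enum is a predefined list: values outside it are rejected.
class DocType(Enum):
    CONTRACT = "contract"
    POLICY = "policy"  # hypothetical second type, for illustration only


class Department(Enum):
    LEGAL = "legal"
    MARKETING = "marketing"
    HUMAN_RESOURCES = "human resources"


class Stage(Enum):
    START_UP = "project start-up"
    INITIATION = "initiation"
    EXECUTION = "execution"
    CLOSURE = "closure"


@dataclass(frozen=True)
class Asset:
    title: str
    doc_type: DocType
    department: Department
    stage: Stage


def retrieve(assets, **criteria):
    """Return the assets whose classification matches every criterion."""
    return [
        asset
        for asset in assets
        if all(getattr(asset, field) == value for field, value in criteria.items())
    ]


ASSETS = [
    Asset("Supplier agreement", DocType.CONTRACT, Department.LEGAL, Stage.INITIATION),
    Asset("Agency agreement", DocType.CONTRACT, Department.MARKETING, Stage.EXECUTION),
]

found = retrieve(ASSETS, doc_type=DocType.CONTRACT, department=Department.LEGAL)
print([asset.title for asset in found])  # ['Supplier agreement']
```

Because every field draws on a fixed list, an attempt such as DocType("memo") raises a ValueError instead of creating a new category.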
Taxonomy vs. metadata
Taxonomy is often applied as metadata: that is, data about data. Office documents have metadata assigned to them such as author and revision date. (To see this in your Office application choose File >> Properties and click on the Summary tab.) Web pages also have metadata. There are a number of ways to view this, but the simplest is to choose View >> Source in your browser menu. Close to the top of the page you’ll see some HTML tags beginning with <meta name=. The name attribute is the type of metadata they describe (description, copyright, keywords, author, etc.) while the content attribute holds the metadata itself. Metadata is often assigned from pre-designated classifications, so it is an important part of any taxonomy project.
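If you would rather read a page’s metadata programmatically than through View >> Source, Python’s standard html.parser module can pull out the name and content attributes. The sample page and the names MetaExtractor and extract_metadata are invented for illustration. Treat this as a minimal sketch rather than a robust scraper.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collect name/content pairs from the meta tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name is not None:
            self.metadata[name.lower()] = attributes.get("content") or ""


def extract_metadata(html):
    parser = MetaExtractor()
    parser.feed(html)
    parser.close()
    return parser.metadata


# Made-up page used only to exercise the parser.
SAMPLE_PAGE = """<html><head>
<title>Example</title>
<meta name="description" content="An introduction to taxonomy">
<meta name="keywords" content="taxonomy, metadata, content management">
<meta name="author" content="A. N. Author">
</head><body></body></html>"""

for name, content in extract_metadata(SAMPLE_PAGE).items():
    print(f"{name}: {content}")
```

The keywords value is where terms from a predefined classification would typically end up.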
Where can I see taxonomy on the web?
In its simplest form, taxonomy on the web is represented as website navigation. Many of the techniques applied to develop taxonomy are used to develop more user-friendly websites. This includes activities like card sorting.
More complex taxonomies can be found at dmoz.org (a directory of websites) or, for medical research, GoPubMed.