There’s been some debate on the role of metadata in content management: is metadata the future of content management, an integral part of the content, or are we making an artificial distinction?
Let’s start by setting aside the technological issues, because these are largely irrelevant. Metadata may be stored in a database separate from a file, or in a distinct table, or marked up differently, but none of this is what distinguishes it from other data or content held in a system. It can even be an inherent part of a document. What makes metadata different is how it’s used.
Metadata is used for classification. It’s used to relate one piece of content to another and to help people and systems find relevant information. If it’s not serving that purpose then it’s not metadata.
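To make that concrete, here is a minimal sketch in Python of metadata doing those two jobs: finding content and relating one piece to another. Everything in it is an assumption for illustration only: the `ContentItem` type, the `find_by_tag` and `related_to` helpers, and the sample tags are hypothetical and not taken from any particular content management system.

```python
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    title: str
    body: str
    # Classification terms applied by whoever catalogues the item (hypothetical).
    tags: set = field(default_factory=set)


def find_by_tag(items, tag):
    """Return the titles of items classified with the given tag."""
    return [item.title for item in items if tag in item.tags]


def related_to(item, items):
    """Return titles of other items that share at least one tag with `item`."""
    return [other.title for other in items
            if other is not item and item.tags & other.tags]


# Illustrative sample data, invented for this sketch.
library = [
    ContentItem("Expenses policy", "How to claim travel costs.", {"finance", "policy"}),
    ContentItem("Travel booking guide", "Booking trains and hotels.", {"travel", "finance"}),
    ContentItem("Office map", "Where the meeting rooms are.", {"facilities"}),
]

print(find_by_tag(library, "finance"))
print(related_to(library[0], library))
```

Notice that the body text plays no part in either lookup. The tags count as metadata here only because they are what the search and the relationships are built on, which is the point about use rather than storage.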
This may lead you to the conclusion that I’m saying everything is metadata. I’m not. Some content that is marked up as metadata isn’t really metadata at all, or is at best poor metadata. Take a look at the UK’s National DNA Database, for example. This database records ethnicity and skin colour as a way to search for people, but one person’s view of their own ethnicity may not be shared by someone else. This disparity has effectively rendered this metadata set useless for searching. The records on ethnicity and skin colour are potentially useful as content, but unreliable as metadata.
So if you’re looking to define metadata types and corresponding taxonomies for your content, you have to consider how those doing the classification will apply the metadata and how other people are going to use that classification. If it’s not useful, it’s not metadata. If it is useful, you’ll be on your way to managing your content.
