In last month's article (The Semantics of Content Management: What We Mean and How We Say It), I discussed how we often trip over our collective tongues with our use of language and terminology when discussing content management and related technologies within our enterprises. This month, I thought I would address one of those terms that can cause confusion: content analytics.

Content analytics is not really a new term. I recall attending a very interesting content analytics session at a business intelligence conference in the UK back in 2007. At this year's Info360 conference in Washington, D.C., there were a number of sessions addressing this area. So while content analytics might not be new, it is perhaps increasingly fashionable. I suppose the first question must be -- what is it?

Content Analytics Defined

I will provide you with that age-old consultant answer to any question: "Well, it depends..." That, of course, is only marginally better than responding with "What do you want it to be?"

Content analytics can be a broad church, with many different types of believers. It can span a whole panoply of technologies related to content management; indeed, last year's AIIM Industry Watch report "Content Analytics -- research tools for unstructured content and rich media" mentions all of the following:

  • Web analytics
  • Digital asset management
  • Faceted search tools
  • e-Discovery tools
  • Content de-duplication tools
  • Content assessment
  • Metadata tagging
  • Text analytics
  • Social media monitoring
  • Digital forensics
  • Sentiment analysis

A widely flung net indeed!

As you can tell from the title of the AIIM report, it treats content analytics as a set of research tools for analyzing your content. It is often suggested that content analytics brings together content management, business intelligence (BI) and search technologies; but to achieve what end?

What Content Do You Want to Analyze -- And How?

The aim of BI is normally seen as the discovery of trends: both historic trends and, with suitable analysis and extrapolation, the future options and possibilities for those trends. It is seen as providing actionable information in order to inform decision making.

BI is firmly embedded in the world of structured data, relational database systems and data warehouses. So we might extrapolate from this that content analytics is about doing the same for our massive and ever-increasing stores of unstructured data. That would mean analyzing the "internals" of our content items, for example by using sophisticated text analytics on our burgeoning document stores, in order to discover new insights.
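To make the idea of analyzing the "internals" of content a little more concrete, here is a deliberately minimal Python sketch. It simply counts the most frequent terms across a handful of documents. The function name `top_terms`, the tiny stop-word list and the sample documents are all assumptions invented for illustration; real text analytics products do far more than this (entity extraction, classification, natural language processing and so on).

```python
import re
from collections import Counter

# Hypothetical stop-word list, kept tiny for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on", "was"}

def top_terms(documents, n=3):
    """Return the n most frequent non-stop-word terms across all documents."""
    counts = Counter()
    for text in documents:
        # Lowercase the text and split it into simple alphabetic tokens.
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)

# Invented sample "document store".
documents = [
    "The supplier contract was renewed in March.",
    "Contract renewal terms for the supplier are attached.",
    "Minutes of the March board meeting.",
]

print(top_terms(documents, 3))
```

Even a crude count like this hints at what a document store is "about"; the hard part, as with BI, is turning such counts into information somebody can act on.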

This, to me, is the research-oriented view, and I think it is going to be difficult to achieve, with outcomes that are difficult to measure. In this context, at an Info360 keynote, IBM noted that their particular content analytics solution includes advanced natural language processing technology developed for Watson, the game-show-winning supercomputer! If you fancy a rest from reading text, check out this IBM video on YouTube.

Analyzing the Use of Content

Another view of content analytics is one that is, in my mind, more akin to web analytics. It's about analyzing how people are using or interacting with content. As Alan Pelz-Sharpe of the Real Story Group mentioned in his content analytics session at Info360, this is already achieved to some extent in the marriage of BI "reporting dashboard" technologies with business process management or workflow technologies.
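As a rough illustration of this usage-oriented view, the following Python sketch summarizes a list of access events by document. The event format (user, document, action), the sample data and the function name `usage_summary` are hypothetical; a real system would draw these events from audit logs or workflow history rather than a hard-coded list.

```python
from collections import Counter

# Hypothetical access events: (user, document, action).
events = [
    ("alice", "policy.pdf", "view"),
    ("bob", "policy.pdf", "view"),
    ("alice", "budget.xlsx", "edit"),
    ("carol", "policy.pdf", "download"),
    ("bob", "budget.xlsx", "view"),
]

def usage_summary(events):
    """Map each document to (number of interactions, number of distinct users)."""
    interactions = Counter(doc for _, doc, _ in events)
    users = {}
    for user, doc, _ in events:
        # Collect the set of distinct users who touched each document.
        users.setdefault(doc, set()).add(user)
    return {doc: (interactions[doc], len(users[doc])) for doc in interactions}

summary = usage_summary(events)
for doc, (hits, distinct_users) in summary.items():
    print(f"{doc}: {hits} interactions by {distinct_users} users")
```

Note that nothing here looks inside the documents at all; the analysis is entirely about who did what with which item, which is the kind of figure a reporting dashboard would typically surface.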