The Big Data movement has legions of companies showing renewed interest in harnessing the value of information. Followers of this trend know that the 3 Vs of Big Data (Volume, Variety and Velocity) have converged to make it feasible to apply predictive analytics to everything from social media content to influenza outbreak data. The excitement is both palpable and justified because companies can now complete in minutes analyses that would have been practically impossible just a few years ago.

But the massive amount of information that fuels big data analytics doesn’t come without inherent risks and costs. For example, the same data that may be mined for big data purposes carries a complex (and overlapping) series of risks, including potential e-Discovery costs, security breaches and storage requirements.

Fortunately, these competing notions are finally being addressed by the emerging concept of information governance. This topic is getting a lot of attention since it has the potential to unify a range of disparate stakeholders throughout the organization, all faced with differing interests in the use of data.

Gartner provides the following definition of information governance:

“Information governance is the specification of decision rights and an accountability framework to encourage desirable behavior in the valuation, creation, storage, use, archival and deletion of information. It includes the processes, roles, standards and metrics that ensure the effective and efficient use of information in enabling an organization to achieve its goals.”

More simply put, the goal of information governance is to optimize the value of information while simultaneously minimizing the associated risks and costs. To see how a company might construct a business case for information governance, it’s useful to break out the different business drivers separately.

Minimizing Risk

In contrast with the value of information, which will be explored later, data inherently carries latent risk because information can be used in a range of unintended ways.

One common scenario is the risk that confidential information might be inadvertently disclosed, resulting in a range of negative consequences for the impacted enterprise. Costs for this type of breach can easily range into the millions due to the confluence of fines from regulatory agencies, lost corporate value due to brand damage and lawsuits from aggrieved victims of the data loss.

A few years ago one of the largest payment processors in the country (Heartland Payment Systems) reported that hackers had accessed its computer system, exposing millions of credit card numbers in what is believed to be one of the largest hacking-related security breaches ever. The Heartland breach apparently involved 130 million credit and debit card numbers.

The company was then sued by shareholders and the stock price dropped from about US$ 17 a share to under US$ 5 after the breach. To add insult to injury, the company also paid an estimated US$ 68 million in costs to settle consumer and credit card claims.

Another common risk area is the electronic discovery that may be required in response to litigation or a regulatory investigation. A recent survey revealed that the document review process alone cost on average US$ 18,000 a gigabyte, meaning that with collection, preservation, hosting, etc. the e-Discovery costs can easily exceed US$ 20,000 a gigabyte. This means that even the most vanilla e-Discovery event can easily run into six figures.
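The six-figure claim follows from simple multiplication. As a rough sketch, the per-gigabyte rate below is the US$ 20,000 all-in figure quoted from the survey, while the 5 GB matter size and the function name are hypothetical values chosen only for illustration:

```python
# All-in rate per gigabyte (review plus collection, preservation, hosting),
# taken from the survey figure quoted in the text. Treat it as illustrative.
ALL_IN_RATE_PER_GB = 20_000  # US$


def ediscovery_cost(gigabytes: float, rate_per_gb: float = ALL_IN_RATE_PER_GB) -> float:
    """Estimate the cost of one e-Discovery event as volume times per-GB rate."""
    if gigabytes < 0:
        raise ValueError("gigabytes must be non-negative")
    return gigabytes * rate_per_gb


# Hypothetical matter: 5 GB of collected documents (an assumed volume).
print(f"US$ {ediscovery_cost(5):,.0f}")  # prints "US$ 100,000"
```

Under these assumptions, a matter involving only a handful of gigabytes already reaches six figures, which is why deleting data with no business value before litigation arises can matter so much.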

In both scenarios, it’s easy to see how the ability to defensibly delete information that has no corporate value can reduce the latent risk that the data might be lost, stolen, or required to be produced in an expensive litigation proceeding.

Reducing Costs

In addition to the risks associated with data retention, there is also a range of hard costs that accrue automatically. While the “storage is cheap” battle cry has been shouted from the mountain tops, the real costs to buy, maintain, back up and protect corporate data are far from inexpensive. Even assuming declining storage costs (due to the cloud paradigm and hardware commoditization), any potential storage savings quickly evaporate given the increases in data volumes, which are doubling every 18 months.
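To see why cheaper storage does not automatically mean a smaller storage bill, consider a minimal sketch that combines the 18-month doubling rate cited above with a declining unit price. The 20% annual price decline is purely an assumption for illustration, as are the function names:

```python
def volume_growth(months: float, doubling_months: float = 18.0) -> float:
    """Factor by which data volume grows if it doubles every `doubling_months`."""
    return 2.0 ** (months / doubling_months)


def relative_storage_spend(years: float, annual_price_decline: float) -> float:
    """Storage spend relative to today: volume growth times unit-price decline."""
    volume = volume_growth(years * 12)
    unit_price = (1.0 - annual_price_decline) ** years
    return volume * unit_price


# Assumed scenario: unit prices fall 20% a year for three years.
print(round(relative_storage_spend(3, 0.20), 2))  # prints 2.05
```

In this scenario the data volume quadruples in three years, so spend roughly doubles even though each gigabyte costs about half as much. At an 18-month doubling rate, unit prices would need to fall by roughly 37% every year just to hold spend flat.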

In the end, reducing corporate data storage requirements means the ability to reap savings along several vectors, including storage management personnel, backup requirements, business continuity planning, etc.

Beyond the hard cost savings from data minimization, additional benefits can accrue from simply reducing the corporate complexity of managing superfluous terabytes or petabytes of data. Soft savings will also accrue to companies that can find useful corporate information more readily once extraneous data is removed from the mix. It’s this type of intelligent information access that (while sometimes hard to quantify) makes a meaningful impact on the overall cost of doing business. In combination, reducing the hard and soft costs of data management is an easy way to establish one prong of the information governance ROI formula.

Optimizing Information Assets

Given the preceding business drivers, it’s easy to see how a “less is more” mentality can move the needle for organizations as they try to reduce information risk and lower associated data costs. And yet, there’s an inherent tension in this governance formula, since an overzealous data minimization mandate could impede the larger corporate objective of maximizing shareholder value by producing products or services. This is why risk/cost reduction can’t exist in a vacuum; it must be constantly weighed against the potential value of the information.

In a recent paper entitled “Finding the Hidden ROI In Information Assets,” The Sedona Conference examined the challenge of optimizing information assets:

“[W]e introduce the concept of adopting an option value approach as one key to doing better in meeting the information governance challenge -- by identifying, calculating and leveraging the option value of corporate information assets. Option value, as defined here, is simply the long-term strategic value of such assets. Organizations typically leverage information fairly effectively over the short-term: e.g., e-mail for current communications, financial data for the latest reporting periods ...

But once the data’s short-term use is expended, the data is often stored away and rarely reassessed for any long-term strategic value. Left ungoverned, this potentially valuable asset is not only wasted, it also may become a significant liability. Through proper information governance, however, organizations can realize additional benefit from their information assets, thus increasing the option value of those assets while reducing potential risk.”

Precisely how a company can best utilize its information assets will vary, but the big data trend certainly sheds some light on one tranche of useful scenarios.


Organizations are beginning to align content creation with categorization and then defensible disposal to efficiently “govern” their data before it becomes unmanageable. To balance this information equation, it’s imperative to first quantify each of the three business drivers by breaking down the risk, hard/soft cost and information value categories. Only then can an ROI formula be derived, which can in turn lead to eventual information governance solutions.
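One possible shape for such a formula is sketched below. This is not a formula taken from Gartner or The Sedona Conference; it simply treats the three drivers as annual benefits and compares them with the cost of the governance program. The program cost term, the class and field names, and every dollar figure are hypothetical assumptions:

```python
from dataclasses import dataclass


@dataclass
class GovernanceCase:
    """Annual estimates for the three business drivers (all values assumed)."""
    risk_reduction: float  # expected losses avoided (breach, e-Discovery)
    cost_savings: float    # hard and soft cost savings from less data
    value_gained: float    # value from better use of retained information
    program_cost: float    # cost of running the governance program

    def roi(self) -> float:
        """Return (total benefit - program cost) / program cost."""
        if self.program_cost <= 0:
            raise ValueError("program_cost must be positive")
        benefit = self.risk_reduction + self.cost_savings + self.value_gained
        return (benefit - self.program_cost) / self.program_cost


# Hypothetical figures, for illustration only.
case = GovernanceCase(
    risk_reduction=400_000,
    cost_savings=250_000,
    value_gained=150_000,
    program_cost=500_000,
)
print(f"{case.roi():.0%}")  # prints "60%"
```

Keeping the three drivers as separate inputs makes the tension discussed earlier visible: an aggressive deletion policy may raise the risk and cost terms while shrinking the value term, so no single driver can be optimized in isolation.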

