The backlash against the buzzword of the year -- big data -- has, to some extent, begun. As Forrester's James Kobielus so eloquently put it, “Some shiny new thing gets built up until it’s too big for its britches and then we delight in shooting it down.” Practitioners who live and breathe content and information management technology on a daily basis can grow tired of the hype behind new labels and acronyms as these get more and more attention from technology marketers and pundits.
But mild annoyance with an overused label does not negate reality. “Big Data” is something very different for most information professionals. Starting the learning curve now -- rather than when a crisis or new initiative pops up out of nowhere -- is what practitioners in the IT, compliance and business process automation arenas need to do.
Why is “Big Data” Different?
The “Internet of Things” opens the door to a whole new world of electronically stored information (ESI)
One aspect of big data is the tremendous volume of new digital activity being collected, logged and potentially analyzed from a new generation of digital devices. From GPS-enabled phones and vehicles, to sensors and monitoring tools with their own IP addresses, data is being generated, collected and stored everywhere we look.
Utility systems, transportation systems and retail systems are all collecting information about how we use resources, shop and travel. Information professionals who focus on compliance, privacy and retention policies need to recognize this new source of ESI and include it in their corporate data mapping as a potential source of e-Discovery requests.
Big Data infrastructure applications were created to solve problems that haven't existed before
New database architectures and software frameworks have emerged, designed specifically for data-heavy applications that are distributed across countless nodes and servers, storing vast amounts of data. The rise of social business, mobile transactions and cloud services has accelerated the need for a new approach to data scaling and storage.
Most of these new frameworks and databases have their roots in the open source world, where developers routinely create new approaches to problems that haven't yet hit the mainstream. Companies such as Yahoo, IBM, Apple, Amazon, Twitter and eBay -- which represent many of the biggest providers of the online communication and transactions we use as both consumers and professionals -- use and contribute to these innovative, open development initiatives. As software philosopher Eric S. Raymond once said, good software starts by scratching a developer's itch. New problems demand new solutions. A new scale of digital content, information and communication demands new architectures.
Other people's data can help your business
One school of thought holds that in 2012 only a handful of the very largest global organizations or popular social network platforms really face the big data problem. But recall the famous prediction, commonly (and perhaps apocryphally) attributed to IBM's Thomas J. Watson in 1943, that “there is a world market for about five computers.”
Innovation is driven by creatively applying new tools to new problems, thereby discovering new opportunities. Many companies will not, in fact, have in-house big data issues in the short term. But access to data from vast social platforms for research or marketing purposes, often via APIs, means big data can be used for better business decisions.
Optimizing lead generation campaigns, analyzing content for sentiment or trends, and getting real-time insight into an industry rather than relying on historical data can be tremendously valuable. C-level executives need to start thinking about how access to such data can be used to improve analytics, reduce costs, find new revenue opportunities and extract value from masses of information.
Open data initiatives from a broad range of national, local and municipal governments also open new opportunities for pattern analysis and for new product or service delivery, allowing companies to dig into the rich statistical content collected and maintained by public institutions.
The lines between structured and unstructured data are blurring
This has the potential to be quite disruptive to the content and information management technology market. The decades-old divide between the database gurus and the content management experts will begin to dissolve. A new generation of database architectures has evolved to support unstructured content and denormalized data at large scale. Social platforms and cloud services will continue to drive this need and adoption of new architectures.
The emergence of “Linked Data” in real business use cases, along with new engines that connect content across multiple systems, also helps blur the structured/unstructured lines. Initiatives such as the Apache Stanbol project harness development efforts from a range of cutting-edge content management developers for the purpose of enhancing unstructured documents with semantic information from any number of sources.
Think about the implications for a case management system that ingests a new complaint and has the analytics engine automatically identify names, places or products and link them to known internal or external data. Automation of digital processes can accelerate when an organization blends the intelligence it holds in both its structured and unstructured resources.
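To make the complaint scenario concrete, here is a minimal sketch in Python. It is not how Apache Stanbol or any particular analytics engine works. It simply assumes the "known data" is a small lookup table and matches it against the free text of a complaint. The table contents, the IDs and the `link_entities` function are all hypothetical, invented for illustration.

```python
import re

# Hypothetical structured reference data, standing in for rows from a
# customer, product or location database.
KNOWN_ENTITIES = {
    "jane doe": {"type": "person", "id": "C-7788"},
    "acme router x200": {"type": "product", "id": "P-1001"},
    "springfield": {"type": "place", "id": "L-0042"},
}


def link_entities(text, known=KNOWN_ENTITIES):
    """Find mentions of known entities in free text and attach their records."""
    links = []
    for name, record in known.items():
        # Case-insensitive match on the whole phrase, bounded by word edges.
        pattern = r"\b" + re.escape(name) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            links.append(
                {"mention": match.group(0), "start": match.start(), **record}
            )
    # Return links in the order the mentions appear in the text.
    return sorted(links, key=lambda link: link["start"])


# Unstructured input: the free text of an incoming complaint.
complaint = (
    "Jane Doe reports that her Acme Router X200 failed "
    "after a storm in Springfield."
)

for link in link_entities(complaint):
    print(link["mention"], "->", link["type"], link["id"])
```

A real system would replace the exact-phrase lookup with proper entity recognition and disambiguation, but the shape is the same: unstructured text goes in, and references to structured records come out.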
What to do next?
Look beyond buzzwords and understand the potential for innovation, but also for new risks. Understand what this can mean for your organization. Think about how new approaches can help create new customer or citizen experiences by analyzing a bigger range of available data, and extracting previously unseen patterns or value.
Don't panic, but don't ignore big data. Five or six years ago many information professionals chose to put the rise of “Web 2.0” or social business on the back burner, thinking it didn't apply to their business or their role. Today we see that online communication and collaboration are becoming the norm. Learn from previous disruptive patterns -- learn and embrace, don't ignore and fall behind. Information professionals need to think about plans for tomorrow's challenges.
Big data may not be your problem in 2012, but what about 2015?