Big data has graduated over a few short years from an esoteric term used by elite technorati to a household phrase found frequently on the front page. Whenever a story involves data -- be it Nate Silver’s election prediction triumphs or the NSA domestic surveillance controversies -- the term “Big Data” is sure to be applied. Likewise, any software product that collects, stores or analyzes data is now invariably described as a big data solution.
Not since "The Cloud" has a technology term been so indiscriminately applied, or so nebulously defined. So it’s not surprising to see some degree of buzz fatigue and predictions of a looming “trough of disillusionment.” Indeed, there’s bound to be disappointment as big data fails to deliver on its more extravagant promises. On the flip side, big data represents the convergence and culmination of some of the most significant technology trends we’ve experienced this century, and involves more potential for business transformation and disruption than we’ve seen since the internet itself.
Big data in many respects results from the convergence of cloud, mobile and social technologies. These technologies, together with new techniques for data storage (such as Hadoop) and new techniques for squeezing value out of data (data science and machine learning), offer amazing opportunities for competitive advantage.
Responding to the Challenge
Big data not only provides opportunities for new solutions and new products, but also stands poised to stimulate the economy and enrich our lives in many ways. But it’s not all upside. Businesses that continue using traditional models may be threatened with extinction as their competitors establish a big data advantage.
Though the collection, processing and storage of unparalleled volumes of data create new challenges for IT, these are not dissimilar to the challenges created by the long-established trend of increasing data volumes and processing power. We’re used to exponential growth! But it’s not enough simply to collect and store more data. We need to build algorithms and business processes around the data that can transform the business. Establishing these smarter algorithms is going to severely challenge many enterprises.
The biggest retailers and online businesses have shown that it is possible to create smarter algorithms for campaign optimization, customer relationship management, dynamic pricing, risk/fraud assessment and so on. But to succeed, these algorithms need to be automated and integrated into business processes. It’s not enough to draw a pie chart from the data and present it to the CMO or CEO. The data needs to be analyzed and used to dynamically generate business decisions on both a micro (individual customer) and macro (product line, geographic region) level.
The overall methodology for creating such algorithms, and the foundational algorithms themselves, have been known for decades. But selecting the correct algorithms, training and validating the machine learning models, and integrating the resulting model into a business process is a brand new discipline. This demand has created a surge of interest in data scientists who have the theoretical, mathematical and software engineering skills to perform such tasks. Unfortunately, the demand is likely to exceed the supply for many years, and the software frameworks that assist the data scientist are generally less than perfect. Current advanced analytic solutions often fall short in scalability, ease of use or production readiness.
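To make the train, validate and integrate cycle a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the transactions are synthetic, the "model" is nothing more than a single amount threshold, and names such as `make_transactions` and `decide` are hypothetical. A real fraud model would be far richer, but the overall shape (fit on one slice of data, check against a held-out slice, then wrap the result in a function the business process can call) is the same.

```python
import random


def make_transactions(n, seed):
    """Synthetic (amount, is_fraud) pairs, invented purely for illustration."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        is_fraud = rng.random() < 0.2
        # Assumption: fraudulent transactions tend to be larger.
        amount = rng.gauss(900, 200) if is_fraud else rng.gauss(300, 150)
        rows.append((amount, is_fraud))
    return rows


def accuracy(threshold, rows):
    """Share of rows where 'amount > threshold' agrees with the fraud label."""
    hits = sum((amount > threshold) == is_fraud for amount, is_fraud in rows)
    return hits / len(rows)


def train(rows, candidates):
    """'Training' here just picks the candidate threshold that scores best."""
    return max(candidates, key=lambda t: accuracy(t, rows))


data = make_transactions(1000, seed=42)
train_rows, holdout_rows = data[:700], data[700:]

# Fit on the training slice only, then validate on data the model never saw.
threshold = train(train_rows, candidates=range(0, 1500, 50))
holdout_accuracy = accuracy(threshold, holdout_rows)


def decide(amount):
    """Hook for the business process: an automated per-transaction decision."""
    return "review" if amount > threshold else "approve"


print("threshold:", threshold)
print("holdout accuracy:", round(holdout_accuracy, 3))
print("decision for a 1200 transaction:", decide(1200))
```

The point of the held-out slice is that a model is judged on data it was not fitted to, and the point of `decide` is that the output is a decision made per customer or transaction, not a chart for a meeting.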
A Time for Action
Many enterprises are only just starting to realize that big data represents either an unparalleled opportunity or maybe even an existential threat. Against this backdrop, the acquisition of data is critical. Enterprises that wait for a solid business plan before capturing all possible data will be at a disadvantage compared to those that start collecting everything immediately.
So the best advice for most enterprises is to make sure you have a big data capture, storage and processing platform in place immediately. Capture everything that you can now, and use technologies such as Hadoop to provide economies of storage that allow for a “keep it all” strategy.
Simultaneously, start developing in-house data science skills or find partners who can provide these for you. Many corporations have professionals with the necessary personal and professional attributes (statistical, programming, business) to develop into the next generation of data scientists.
Exactly how the application of smarter algorithms to more extensive data sets will transform a business differs across industries. But for almost all industries, this transformation is real and imminent. Even if it’s not entirely clear how big data will transform your business, it’s prudent to start data acquisition and data science skill-up immediately. When paradigm shifts occur, those who hesitate are often lost.