The explosion of Big Data in the past few years has added a fourth “V” to the traditional “three Vs” of Big Data: Volume, Velocity and Variety. That fourth V is “Value.”
In a presentation at this week’s Gilbane 2012 Conference in Boston — “Big Data for Enterprise and Marketing Applications — Three Views,” Stefan Andreasen, founder/CTO of Kapow Software, explained why Big Data is becoming such a value-add for enterprises. According to Andreasen, 90% of all data on the planet has been created in the past two years.
“There is more to Big Data than what people write on Facebook and Twitter,” said Andreasen. “There are blogs, forums, news sites, portals, competitor and government sites.”
Relevance Key to Big Data Value
Andreasen said the trick to obtaining value from Big Data is to focus on data that is relevant, important to you and recent. “If data is too old, it’s irrelevant,” he said.
As an example of the importance of relevance to the value of Big Data, Andreasen described a contest in which Netflix offered US$1 million to anyone who could develop a more accurate algorithm for predicting which movies customers would want to watch. A Stanford University professor created two teams of students to take the challenge. One team used the best math and analytics possible, while the second team used the same data sources as Netflix but was instructed to add one extra set of data.
“The second team did better and created an algorithm almost as good as Netflix’s,” said Andreasen. “Think about looking for the most relevant, right data before crossing the data.”
As another example of how finding relevance within Big Data increases its value, Andreasen said that he tracked the price of the same microwave oven on the Sears, Best Buy and Amazon sites and found that over 24 hours Sears never changed its price, Best Buy changed it twice and Amazon changed it nine times.
“Which of these three retailers is the most successful?” asked Andreasen. “Amazon is the most successful because they have their finger on the pulse of consumer interaction data, including real-time pricing data from competitors like Best Buy. Big Data is a black hole if you’re not focused on the right data and automation.”
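The price-tracking comparison Andreasen described can be illustrated with a minimal sketch. The retailer names, sample prices and the `count_price_changes` helper below are hypothetical and are not drawn from his data or from Kapow's tooling; the sketch only shows how one might count price changes in a series of observations.

```python
from typing import Dict, List


def count_price_changes(prices: List[float]) -> int:
    """Count how many times a price differs from the previous observation."""
    return sum(1 for prev, cur in zip(prices, prices[1:]) if cur != prev)


# Hypothetical price samples for one product (illustrative values only).
observations: Dict[str, List[float]] = {
    "retailer_a": [99.0, 99.0, 99.0, 99.0],
    "retailer_b": [99.0, 97.0, 97.0, 99.0],
    "retailer_c": [99.0, 98.5, 98.0, 98.5],
}

# Number of price changes per retailer, and the retailer that repriced most often.
changes = {name: count_price_changes(series) for name, series in observations.items()}
most_active = max(changes, key=changes.get)
print(changes, most_active)
```

In practice the interesting work is in collecting the observations reliably and on a schedule; the counting itself is trivial once the data is relevant and recent.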
New Volumes Require New Methods
Sitecore Analytics Director Ron Person described how new volumes of Big Data will require new analytical methods. “Eighty to 90% of Big Data is unstructured,” said Person. “We are at the petabyte level, which is [1] followed by 15 zeroes. WalMart creates 50 million filing cabinets worth of data every hour.”
Person said Big Data is defined by four attributes — mass volumes, complexity (such as video and references to people), rate of increase (expected to reach a level of 44 times its 2010 amount by 2020) and inability to analyze or store by traditional methods. This will require artificial intelligence, according to Person.
“Current business intelligence provides comparative charts and creates trend lines,” he said. “The future will be self-learning analytics and self-tuning sites. We will look for patterns we don’t know exist — the ‘unknown unknowns.’ Known unknowns are when you know what you’re looking for. Unknown unknowns are finding patterns you don’t know are there.”
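To put Person's growth figure in perspective, reaching 44 times the 2010 amount by 2020 implies a yearly growth rate that can be worked out directly. The sketch below assumes constant compound growth across the ten years, which is a simplification and not something Person stated; `implied_annual_growth` is a hypothetical helper name.

```python
def implied_annual_growth(multiple: float, years: int) -> float:
    """Return the constant yearly growth rate that yields `multiple` after `years`."""
    return multiple ** (1.0 / years) - 1.0


# 44x growth between 2010 and 2020, assuming the same rate every year.
rate = implied_annual_growth(44.0, 2020 - 2010)
print(f"Implied annual growth: {rate:.1%}")
```

Under that assumption the data volume grows by roughly 46% each year, which means it nearly doubles every two years.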
Big Data – The Industrial View
Brian Courtney, GM of Industrial Data Intelligence for GE, discussed a critical but less-publicized aspect of Big Data: its role in automating the monitoring and analysis of industrial data. He said GE uses both batch processing, the offline analysis of “massive repositories of data for patterns and insights,” and stream processing, the real-time analysis of “web-scale data to identify trends and anomalies as or before they occur.” Together, the two approaches are used to determine data patterns that indicate likely failures in GE technology such as electricity-generating turbines and airplane engines, and then to monitor equipment for those patterns in real time.
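The division of labor between batch and stream processing can be sketched in a few lines. This is not GE's method: the threshold rule, the sensor values and the function names `learn_baseline` and `flag_anomalies` are all assumptions made for illustration. The batch step summarizes historical readings offline, and the stream step checks each new reading against that summary as it arrives.

```python
from statistics import mean, stdev
from typing import Iterable, Iterator, List, Tuple


def learn_baseline(history: List[float]) -> Tuple[float, float]:
    """Batch step: summarize historical readings as (mean, sample standard deviation)."""
    return mean(history), stdev(history)


def flag_anomalies(
    stream: Iterable[float], baseline: Tuple[float, float], k: float = 3.0
) -> Iterator[Tuple[int, float]]:
    """Stream step: yield (index, reading) when a reading is more than k deviations from the mean."""
    center, spread = baseline
    for i, reading in enumerate(stream):
        if abs(reading - center) > k * spread:
            yield i, reading


# Hypothetical sensor readings (illustrative values only).
history = [70.0, 71.0, 69.0, 70.0, 72.0, 68.0, 70.0]
baseline = learn_baseline(history)

# Readings arriving one at a time; only the outlier should be flagged.
alerts = list(flag_anomalies([70.5, 69.0, 85.0, 71.0], baseline))
print(alerts)
```

A real failure-prediction system would learn far richer patterns than a mean and a deviation, but the shape is the same: expensive pattern discovery runs offline, and a cheap check runs on every incoming reading.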
All three panelists agreed that not every enterprise decision can or should be driven by Big Data. “Somebody’s got to come up with something nobody’s ever seen before,” said Person. “Purely data-driven decision-making is evolution, not revolution. Use human gut-level insight and test and refine it with Big Data.”