Let’s face it. Most people don’t know what Hadoop is or how big a petabyte is, and can’t explain what wins big data can deliver for their companies.
And that’s OK.
When it comes to big data and analytics, there’s the hype — and then there’s the reality, said Tom Davenport, one of the world’s foremost analytics experts and author of Big Data @ Work.
Digesting Big Data
“We know that the NSA is taking advantage of all of that (big data and analytics), but we don’t know if anyone else is,” he said during yesterday’s Harvard Business Review webcast, Big Data at Work: Dispelling the Myths, Uncovering the Opportunities.
Then, like most world-class professors, he broke big data and analytics down into digestible terms that earth people can understand.
We picked five of the insights we gleaned to share with you.
- Unless you’re a big data engineer or administrator, you probably don’t need to know all that much about Hadoop beyond Davenport’s simple, graspable definition: Hadoop is a technology for spreading big data across commodity servers for processing.
- It’s rare to have a conversation about big data without someone making reference to petabytes and zettabytes. Guess what? You may not have to worry about knowing what a petabyte actually is. (A petabyte (PB) is 10^15 bytes of data, which is 1,000 terabytes (TB) or 1,000,000 gigabytes (GB); a zettabyte is a thousand exabytes, or a billion terabytes.) Instead, said Davenport, you can define big data as more data than fits into a structured database, data that moves too fast for a data warehouse, or data that requires a new technology, like Hadoop, to manage it. “Big data is so big that most people aren’t familiar with the terms (that describe it),” Davenport said.
- We may know how to analyze big data, but we don’t necessarily know what to do with the insights. “People look at social media analysis and say it’s up, or it’s down,” said Davenport, “but we don’t yet know what to do with that.”
- Big data’s biggest wins may come from making many small decisions rather than one huge one. The majority of big-data-driven decisions will be recurring, made at speed (in milliseconds) and at scale; actions will be taken automatically rather than reviewed and approved by an individual. Davenport gave the example of a UPS driver’s delivery route, which can be optimized in real time using things like weather or traffic data.
- When you embark on your big data journey, should you look at your data assets and try to glean insights, or should you start with a question and then gather your data? It’s not an either-or; you should do both, said Davenport.
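For anyone who does want to check the petabyte and zettabyte arithmetic quoted above, here is a minimal Python sketch. It assumes the decimal (SI) convention, where each unit is 1,000 times the previous one; the names `UNITS`, `BYTES_PER` and `convert` are ours for illustration and do not come from Davenport’s webcast.

```python
# Decimal (SI) byte units: each one is 1,000 times the previous one.
# (Assumption: decimal units, not the binary 1,024-based variants.)
UNITS = ["KB", "MB", "GB", "TB", "PB", "EB", "ZB"]
BYTES_PER = {unit: 1000 ** (i + 1) for i, unit in enumerate(UNITS)}


def convert(amount, from_unit, to_unit):
    """Convert an amount from one decimal byte unit to another."""
    return amount * BYTES_PER[from_unit] / BYTES_PER[to_unit]


if __name__ == "__main__":
    print("1 PB in bytes:", BYTES_PER["PB"])
    print("1 PB in TB:", convert(1, "PB", "TB"))
    print("1 PB in GB:", convert(1, "PB", "GB"))
    print("1 ZB in EB:", convert(1, "ZB", "EB"))
    print("1 ZB in TB:", convert(1, "ZB", "TB"))
```

Running it confirms the figures in the list: a petabyte works out to a thousand terabytes or a million gigabytes, and a zettabyte to a thousand exabytes or a billion terabytes.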
This, of course, is just a peek at what Davenport has to share — that’s why he wrote a book. We’re not going to tell you to buy it, but it’s one we’ll keep handy.