Ask 100 people to define big data and you'll get 100 different answers, or no answer at all. In theory, big data can help organizations make decisions faster, more easily and more accurately. But in practice, faster and easier remains an unrealized goal, and creating business value is often even more elusive.
Riddled with hype and inflated expectations, big data has been nothing more than a nebulous concept for many organizations.
Rather than helping teams analyze complex datasets to discover information that supports better decisions or reveals new patterns, floods of data often overwhelm the people struggling to make sense of them.
Data as a Tool
Even Gartner agrees big data still has a long way to go. It sat at the “Peak of Inflated Expectations” on Gartner's Hype Cycle for Emerging Technologies last year and advanced only slightly in 2014, moving to the tip of the “Trough of Disillusionment.” What does that mean? In brief, that big data is moving ever so slightly closer to wide-scale usage.
At this point it's fair to say big data is still a big mess — hard to define, hard to structure and harder still to value. And no one seems to agree on what to do with it, how to use it or even how to securely store it.
Is big data a source of valuable insight or an overhyped trend? Essential or optional? The answers are challenging because big data reflects the complexities of information generated by ever more unpredictable human behavior.
So where do we stand now? As we wade through the hype and disillusionment toward the dawn of real business value and insight, what remains the biggest challenge … or, as the case may be, the biggest opportunity?
What's the biggest problem with big data?
Bruno Aziza, Chief Marketing Officer at Alpine Data Labs
Aziza is responsible for Alpine’s worldwide marketing strategy, product marketing, positioning and growth. Before joining Alpine, he ran marketing at Microsoft, SAP/Business Objects and Apple. He comes to Alpine from SiSense, where, during his tenure, the company grew by 520 percent in 2013, raised $10 million in a round led by Battery Ventures and was named Top 10 big data Analytics startup by Inc., CRN and CIO Magazine. Aziza was educated in France, the UK, Germany and the US and loves to play soccer. Tweet to Bruno Aziza.
Value. Getting value back to the business is No. 1 in my mind. So few companies are getting value out of big data (just 4 percent, according to research) for two main reasons, in my opinion: the difficulty of working with data at the speed of business and the lack of maturity of big data use cases.
On the data issue, I think you'll find that data scientists still struggle immensely to optimize the data pipeline end-to-end. They use disparate tools to prep data and often find they have to move it into dedicated environments before analysis.
These are the very first steps of big data mining, and many companies are losing their competitive advantage from the get-go because this process is slow, expensive and often yields suboptimal returns.
Much time is spent heaving data to the crunching factory but often only small samples of data make it and can be analyzed in a timely and relevant manner. In the big data era, sampling becomes an abnormality and many will find that they can't solve today's problems with yesterday's approach. Sampling and data movement make the road to Insight City longer and fraught with peril.
It takes might, patience and persistence to organize the data, but it also takes feats of organizational psychology. You probably didn’t guess that the human psyche has much to do with data analysis. Yet it does. From the very start, business and data teams need to work together to figure out which data is relevant to gaining crucial insights. A big issue here is getting geeks and suits to work effectively with each other.
Communication between parties is often inefficient, with too much back-and-forth. The data pipeline slows to the speed of molasses, lacks agility and moves far slower than the speed of real business. Things move along even more slowly if Hadoop data is involved, which typically requires analysts to spend time learning to tame its complexity.
The role of collaboration is often underestimated, yet it is the one tool that our civilization has relied on for years to resolve conflict, find common goals and build sustainable performance.
This last point is critical. Getting business value out of big data is not a project. It's a choice that executives and employees make by deciding to build a new engagement model for insights and analysis. I think that's why it's so hard. Here we are talking about bits and bytes, when the biggest problem is a more organic and natural alignment between people, process and data.
Peter Allen, Solutions Architect, Annese & Associates Inc.
With a bachelor's degree in visual communication technology, Allen initially planned to design electronic presentations. But in his first job, at a firm that creates computer-based training for manufacturing companies, he realized his real passion was in IT. He spent eight years designing and implementing an ASP data center to host intranet websites for General Motors and worked as a storage and virtualization engineer at a small telecom billing company before joining Annese, a Clifton Park, N.Y., provider of integrated communications systems. When he's not designing data center solutions, he enjoys traveling. Connect with Peter Allen on LinkedIn.
Variety. Today, the term “big data” is being tossed around office cubicles more than paper airplanes.
Gartner classifies big data initiatives by volume, velocity and variety, otherwise known as the “three V’s.” Cisco offers a fourth: value. Data translated into business value can take the form of higher profitability, faster time to market or greater business and IT agility. Every day our world becomes a little more connected. With IP-enabled smart devices stringing an invisible thread between people and processes around the globe, our daily lives are quickly turning into a giant game of telephone where everything we do and use can intelligently talk to one another.
Every industry is seeing the impacts of big data and what Cisco is calling the Internet of Everything (IoE), defined as the "networked connection of people, data, process and things." IoE imagines a world in which, by 2020, the number of "things" connected to the Internet will exceed 50 billion. Merely having access to all these new connections is not enough. Mining that data into actionable intelligence is where the real power of IoE lies.
According to Cisco, “Once the value to the business is understood, juggling higher data velocity, volume and/or variety becomes an engineering problem.” It seems that with big data most of the challenges with velocity and volume have been addressed with software.
Variety is the most challenging because it still requires people to put the data into context or domains. Variety simply refers to the many diverse sources from which we gather data: spreadsheets, emails, databases or rich media such as audio and video, to name a few. Software alone cannot yet tame the complexity of that varied, unstructured data; that is where the human element must come into play. Mining and analyzing the volume and velocity of big data is the only real way to extract real actionable value.
Julie Freeman, Artist, Translating Nature
Freeman's work spans visual, audio and digital art forms and explores how science and technology change our relationship to nature. Often working collaboratively, she experiments in transforming complex processes and datasets into sound compositions, objects and animations. For the past 15 years she's focused on questioning the use of electronic technologies to translate nature – whether through the sound of torrential rain dripping on a giant rhubarb leaf or using scientific techniques to lead an audience to manipulate their senses. Her latest project, co-commissioned by the Open Data Institute (ODI) and The Space, a website for artists and audiences around the world to create and explore digital art, launched today.
For Freeman and other digital artists, big data is less of a problem than an opportunity. As ODI President and co-founder Sir Tim Berners-Lee said, “Artists wake us up to all that happens in the world. The Space and the ODI make that happen on the web.” Tweet to Julie Freeman.
Volume and Velocity. My general view is that the term big data is marketing puffery. When broken down it's fairly meaningless and certainly unscientific. What is big? What is data? So the problem with big data is that the term itself is woolly.
As the world becomes more influenced by data-driven decisions, we need to be able to describe data in more meaningful ways. To describe it we need to understand it. Understanding data is the crux of what provoked the need for the Data as Culture art program at the Open Data Institute. I am an artist who uses data as an inspiration and as an art material. My work explores how life data (data from living things) enables us to translate nature to provide us with new perspectives.
Big data is a broad and overused term. It is easy to forget that data are collections of values that allow us to understand the subject they describe more easily or more fully. Data is the form that communication takes between the natural world and the technological world. How do we translate weather into a digestible human form? Through a piece of hardware that measures a physical instance and represents it to us as a number, or a data point.
We need to describe data in a much more accurate and relatable manner. We would never describe a piece of furniture as "made from a material." We describe it as "an elegant 20th Century Art Nouveau chair by Henry van de Velde." We give it provenance, origin, classification. And the same goes for data. We need to be describing its form, its origin, its delivery method, as well as its more technical format properties.
Alongside the Open Data Institute and Queen Mary University, I've been working on a taxonomy of data. This arose from the desire to be able to better describe the material I work with as an artist. It is important to me to fully understand the materials and mediums I work with before I complete a work. And how I describe the work to others needs to help them understand what the concepts are behind the work. What was my intention? Why did I do this?
Data is different from other materials in that it is intangible, invisible. It is hard to imagine that an artist would want to work with something she can't smell, hear or touch. But for me that is the exciting challenge. I’d like to encourage others to also be more precise when referring to data or big data.
(Ed. Note: Yes, pictures do say more than words. So let's look at Freeman's work.)