Businesses that are scrambling to keep up with the quickly changing e-commerce world are turning to big data and analytics as important, if not primary, tools. Collect enough data and apply complex analytical methods to it, the story goes, and you will find the answers you need to understand today and plan for tomorrow.

We’ve given these tools catchy names. Big Data Analytics (BDA) has an authoritative ring -- but the underlying disciplines haven’t changed in decades. Whatever we call it, analysis involves sampling what’s happening now and using statistical methods to derive trends that allow us to make changes to improve our results. If it doesn’t do that, it isn’t worth much.

In a BDA world, you grab every piece of data you can from your commerce, Web-based and otherwise, and then apply statistical techniques to it to tell you why your customers behave as they do and what they are likely to do if you change your approach.

What could be the problem? 

Statistics Always 'Works'

Our fascination with all things digital, however, may be blinding us to the incredible subtlety of statistical analysis and prediction … a dangerous blindness if our commercial future is in play. Most of the literature on this subject comes from the vendors selling the tools and techniques to implement it. Do a search for big data analytics, for example, and vendor-related hits nearly always top the list.

The term itself -- analytics -- masks the complexity and difficulty of generating usable information from the collection and analysis of data, especially with an uncontrolled population, as is usually the case in e-commerce.

Statistical analysis is a multi-layered discipline that goes far beyond the calculation process if it is to yield anything useful. What’s worse, statistical processes always appear to “work,” yielding results that aren’t easily recognized as invalid when in fact they may be. The commercial world needs a deeper understanding of what it calls analytics -- which is no mean feat with a subject as arcane as statistics.
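To make the point about results that appear to "work" concrete, here is a minimal sketch in Python. It is an illustration of the general problem, not anything drawn from a particular analytics product: the names `pearson` and `strongest_noise_correlation`, the 30 days of "sales" and the 200 candidate "metrics" are all hypothetical. Every series is pure random noise, so no real relationship exists, yet searching enough metrics still turns up one that correlates with sales.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_noise_correlation(seed=7, n_days=30, n_metrics=200):
    """Return the largest |correlation| between noise 'sales' and noise 'metrics'.

    Assumption for illustration: every series is independent random noise,
    so any correlation found here is spurious by construction.
    """
    rng = random.Random(seed)
    sales = [rng.gauss(0, 1) for _ in range(n_days)]
    best = 0.0
    for _ in range(n_metrics):
        metric = [rng.gauss(0, 1) for _ in range(n_days)]
        best = max(best, abs(pearson(metric, sales)))
    return best


if __name__ == "__main__":
    print(f"strongest 'relationship' found in noise: {strongest_noise_correlation():.2f}")
```

The calculation runs cleanly and reports a respectable-looking number. Nothing in the output signals that the finding is meaningless; only knowledge of how the data was produced does.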

Learning from our Elders

W. Edwards Deming's work offers one promising path to understanding. Deming was a pioneer of modern business analysis -- and a renowned statistician before that. He divided statistical analysis into two types: “enumerative” and “analytic.” Each is good for some things but not for others, and each has a different threshold of required knowledge and control going in, a different set of unknowns and a very different ability to accurately predict future behavior.

Enumerative analysis allows us to test characteristics and changes in sampling populations where usable answers can be generated by calculations against collected data alone. Studies of this type generally do not include sufficient information to enable prediction about the larger population. Enumerative analysis works particularly well when the population can be controlled and variations carefully introduced, as in scientific and biological experiments.

Analytic study attempts to identify and account for differences between the sample and the larger population so that test results can be used as predictive tools. This is orders of magnitude more complex and difficult than its enumerative cousin.

With big data collection and analytics, we need the latter. But we often don’t have sufficient control of the samples on which we base our calculations. This opens us to a range of errors that can make the results not merely useless but damaging if we base decisions on them. Unfortunately, making the sample larger or increasing the amount of applied computer power -- even massively so, as in big data applications -- won’t do much to improve our chances.
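A small simulation can illustrate why a bigger sample does not rescue an uncontrolled one. This is a sketch under stated assumptions, not a model of any real store: the population of 200,000 customers, the spend figures and the rule that bigger spenders are more likely to be logged are all invented for the example, as is the function name `compare_samples`.

```python
import random
import statistics


def compare_samples(seed=42, population_size=200_000):
    """Compare estimates of mean spend from biased and random samples."""
    rng = random.Random(seed)

    # Hypothetical population: yearly spend per customer.
    population = [rng.gauss(100.0, 30.0) for _ in range(population_size)]
    true_mean = statistics.fmean(population)

    # Uncontrolled collection (assumed for illustration): the chance that a
    # customer ends up in our logs rises with how much that customer spends.
    logged = [s for s in population
              if rng.random() < min(1.0, max(0.0, s / 200.0))]

    results = {"true_mean": true_mean}
    # Growing the biased sample makes the estimate steadier, not more correct.
    for n in (100, 5_000, 50_000):
        results[f"logged_{n}"] = statistics.fmean(rng.sample(logged, n))
    # A far smaller sample drawn at random from the whole population.
    results["random_1000"] = statistics.fmean(rng.sample(population, 1_000))
    return results


if __name__ == "__main__":
    for name, value in compare_samples().items():
        print(f"{name:>14}: {value:7.2f}")
```

In this setup the logged samples settle on a figure well above the true mean however large they get, while the modest random sample lands close to it. More data narrows the noise around the wrong answer; it does not move the answer.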

Deming and other researchers point out that the complexities associated with analytic statistical study are often so extensive that even many advanced statistics courses tend to gloss over them, focusing on the process rather than the results. Perhaps this explains why the big data world tends to compress the entire process into the name “analytics,” engaging in at least as much glossing over (p. 15).

Calculation Does Not Equal Knowledge

Deming and those who have followed him point out that analyzing data is only one part of prediction. As the population and behavior being studied become more complex and less capable of experimental control -- a good description of the e-commerce market -- the importance of knowledge about the population and its reasons for behavior grows.

In the big data and analytics process, no matter how much data is collected, information about the individuals in the sample will be left out of the calculation process. This makes the entire effort suspect and can introduce significant error into the results.

In a statistics class or thesis, missing this point could cost you a grade. In e-commerce, it can cost you millions of dollars or your brand's continued existence.

You’ll find this mentioned under “asking the right questions,” but the more important queries, like “what questions can analytics not answer?”, are usually ignored.

"Domain knowledge" also comes up frequently in discussions of analytics. This is a techie way of saying that you must know your market and its participants if you want to be successful in selling to them. The BDA-fueled JC Penney disaster of 2012 is only one example of decisions driven by big data analytics improperly used, but it should be enough to make us much more careful.

Calculation and Knowledge: Getting the Right Balance

What we need as we follow the Pied Piper of big data analytics is a balance between what can be derived from statistical processes alone and what must be known in order to make those processes usable.

All the data analysis in the world won’t tell us how to engage and interest customers in the future unless we know more about them than big data analytics alone can tell us. That will probably push us back toward at least some of the more traditional ways of looking at customer behavior, such as focus groups and test and control group analysis, leavened with a dose of plain good marketing experience.

Left to its own devices, the big data and analytics crowd probably won’t bring this up -- partly because they are IT people most comfortable with their own expertise … and at least partly because they probably know that dealing with the complexities of analytic study may take some of the bloom off their rose, leading to fewer sales.

Customer Experience (CX) is a good lead-in to this deeper look at the targets of our marketing efforts, but naming it doesn’t guarantee we will use it effectively. As passé as it might seem in today’s world, if we are to succeed, we need to think hard about the objects of our commercial affection and understand why they do what they do in order to decide what we should and should not do.

It would be a shame if the techniques that have grown out of e-commerce ended up leading us down the path to failure, as JC Penney learned so painfully.

Title image by bark, Creative Commons Attribution 2.0 Generic License