Not everyone can stand before a conference with “Hadoop World” in its title and tell the crowd of enthusiasts that Hadoop isn't the right solution for every Big Data problem.
But Ken Rudin did exactly that at the Strata + Hadoop World conference last month. To those who don’t know who Rudin is, it could have seemed like heresy -- a bit like bringing a turkey, ready for slaughter, to a vegetarian Thanksgiving.
But Rudin isn't just anyone. He’s Facebook’s analytics chief, and Facebook, for anyone who may not be aware, is one of the world’s largest users of Hadoop. In his presentation, Rudin explained that Facebook is a young enough company that it started with Hadoop, not relational technologies, and that it's now looking at relational and any other technology that meets the needs of the project at hand.
Don’t abandon your relational technologies, was his message to the more established (and older) companies in the room.
Rudin also debunked the idea that insight is big data’s most important yield; instead it’s impact, he told the audience. In the press room, we overheard him tell someone that if a Facebook manager requests analytics on something and can’t identify what impact the results might have, the request is assigned a lower priority.
What is impact? It’s about moving a metric, changing a product or changing behavior, says Rudin.
It makes perfect sense, especially at a time when Big Data is coming of age. Running cumbersome queries purely out of curiosity can’t be as high a priority as running a business.
If you want to see Rudin’s presentation, you can access it here.
It Takes a Data Scientist
Read enough vendor announcements, and you’ll be certain that you can come up with amazing, business-changing insights without a data scientist. Listen to Claudia Perlich, Dstillery’s chief scientist, speak, and you won’t be sure that data science gods and goddesses can hit the target every time -- even if they've created and trained the right algorithms.
Why? It’s simple to say, but not so simple to negotiate. Thirty-six percent of web traffic is unintentional, says Perlich. Not only that, but to the untrained eye, botnet traffic and real traffic might not look all that different.
As a result, a person could, hypothetically, build great predictive models and end up with misleading results. Some people blame the data when this happens; Perlich does not. “Data does not lie, it just does not mean what you think it does,” she recently said in a tweet. Her Strata talk can be found here.
Forget Shopping at Nordstrom, You Want to Work There
If you ever feel uninspired, check out Nordstrom Innovation Lab. Though not all of their experiments are around big data, many are. (And even those that aren’t are a whole lot of fun.)
At Strata, Nordstrom data scientist Erin Shellman and full stack developer David Von Lehman gave a compelling presentation that not only showed why multidisciplinary teams are key to their analytics projects but also offered a peek into their EASE (Evaluate, Automate, Scale and Evaluate) methodology.
They spoke about three projects, one of which shows how they select colors and another that compares how a personal stylist and a computer come up with personalized recommendations when people are shopping online to “complete the look.” It certainly beats the methodology that my favorite department store uses -- every time I go to their site, they suggest that I buy things I looked at during my last visit. So much for discovery.
While we don’t have access to a video from the Nordstrom presentation, they have provided a great slide deck that some Githubbers watch over and over again (especially slide #12) even when they, admittedly, have no interest in anything other than the special effects.
Overwhelmed by Dashboards and Data? Startup Showcase Winner Metric Insights Might Help
If you’ve ever seen Shark Tank on ABC, that’s what the Startup Showcase at Strata + Hadoop World is like, except the presentations are better and the judges know what they are talking about. This year the winner of the contest was Metric Insights. Its solutions set analysts free from monitoring all of the different analytics dashboards and applications that are in front of them; Metric Insights simply alerts users when something big has changed. We’re likely to see more applications like this, because they address the perpetual signal versus noise problem.
Why Women Are Better at Data Science than Men
It’s no secret that there are fewer women in STEM (Science, Technology, Engineering and Math) fields than there are men and that females typically get paid less than males across the board, almost regardless of profession.
But there is an exception. Female data scientists get paid more than their male counterparts, or so says Steve Hillion, vice president of product at Alpine Data Labs. And it could be that they get paid more because they’re better at what they do; that was Hillion’s argument at Ignite, the startup competition at Strata + Hadoop World.
Rather than paraphrase Hillion’s short presentation, we’re simply providing you with a link so that you can listen to his argument for yourself.
It’s A Big Data World, Full of Innovation, Insight and Opportunity
Needless to say, Big Data is where it’s at right now. While Hadoop may not be the only framework, platform, hub, foundation … whatever you want to call it, it’s one of the catalysts of computing’s new era. And that era, by the way, isn't coming; it’s here. Make sure you are too.