Ready for some spine-chilling news? In the very near future someone you’ve never met, and who does not yet know your name, will be able to identify exactly who you are by the way that you write on the web. They already have a 20% shot at getting it right today.

“There are only 6.6 billion people in the world, so you only need 33 bits (more precisely, 32.6 bits) of information about a person to determine who they are,” says Arvind Narayanan, a post-doctoral computer science researcher at Stanford.
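The arithmetic behind that figure is easy to check. The sketch below is a rough illustration rather than anything from Narayanan's own work: the function names are made up for this example, and the population figure is simply the one quoted above. It computes how many bits are needed to single out one person, and how quickly the pool of candidates shrinks as bits of information are learned.

```python
import math

# Population figure quoted in the article (an estimate, not an exact count).
WORLD_POPULATION = 6.6e9


def bits_to_identify(population: float) -> float:
    """Bits of information needed to single out one member of a population."""
    return math.log2(population)


def remaining_candidates(population: float, bits_known: float) -> float:
    """People still matching, on average, once `bits_known` bits are learned.

    Assumes each bit splits the remaining pool exactly in half, which is an
    idealization; real attributes rarely divide people so evenly.
    """
    return population / (2 ** bits_known)


bits = bits_to_identify(WORLD_POPULATION)
print(f"{bits:.1f} bits, rounded up to {math.ceil(bits)}")  # 32.6 bits, rounded up to 33
print(f"{remaining_candidates(WORLD_POPULATION, 20):,.0f} candidates left after 20 bits")
```

Every bit halves the pool, which is why a handful of seemingly harmless facts about a person can narrow billions of candidates down to a few thousand.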

Welcome, boys and girls, to the world of Big Data, powerful analytics, and all of the perils and promises that come with it.

As I explained in my previous article, Big Data is just data. And the existence of data is not new.

It's Not New, But There's More of It

What is new is that with the advent of the internet and of social, mobile, telemetric and geospatial technologies, there is more data available than ever before. On top of that, processing large data volumes used to be slow and expensive; it’s now fast and affordable. And finally, there are now data scientists and data science teams who know how to work with large data volumes. They’re skilled at writing algorithms, finding patterns and extracting knowledge from humongous data streams, which they then sell, use for their own purposes or hand off to decision makers to use as they wish.

Big Data has been used by data scientists at Wall Street and Internet firms for a fairly long time. It’s what Facebook and LinkedIn use to identify which of their millions of members you might know and want to connect with. It’s what online retailers use to process transactions quickly and to recommend products you may want to buy. (When you look at a book, for example, you’ll see a number of other books displayed just below it with a note that says, “Customers Who Bought This Product Also Bought.”) It’s how Google delivers search results in a split second. And it’s how ads geared especially toward you are selected to show up on your screen as you surf the web.
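To make the “Customers Who Bought This Product Also Bought” idea concrete, here is a toy sketch of the simplest possible version: count how often two products land in the same basket, then rank by that count. Everything in it is an assumption made for illustration. The order data and the function names are hypothetical, and real recommendation systems are far more sophisticated than raw co-purchase counts.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical order history: each set is one customer's basket.
orders = [
    {"book_a", "book_b", "book_c"},
    {"book_a", "book_b"},
    {"book_a", "book_d"},
    {"book_b", "book_c"},
]


def build_co_purchase_counts(baskets):
    """Count, for every item, how often each other item shares a basket with it."""
    counts = defaultdict(Counter)
    for basket in baskets:
        # permutations yields both (x, y) and (y, x), so counts are symmetric.
        for item, other in permutations(basket, 2):
            counts[item][other] += 1
    return counts


def also_bought(counts, item, top_n=3):
    """Return up to `top_n` items most often bought together with `item`."""
    neighbors = counts.get(item, Counter())
    # Sort by count (highest first), then by name so ties are deterministic.
    ranked = sorted(neighbors.items(), key=lambda pair: (-pair[1], pair[0]))
    return [name for name, _ in ranked[:top_n]]


co_counts = build_co_purchase_counts(orders)
print(also_bought(co_counts, "book_a"))  # ['book_b', 'book_c', 'book_d']
```

The “big” part of Big Data shows up when the list of orders runs to billions of rows: the idea stays this simple, but the counting has to be spread across many machines.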

While some of the data scientists who write the algorithms that make all this happen feel that their talent is being wasted, others feel that their work is touching the world in a very important way. Jeff Hammerbacher, one of Facebook’s early employees (he no longer works there), told Bloomberg Businessweek, “The best minds of my generation are thinking about how to make people click ads. That sucks.” Compare that statement with the sentiments of Serkan Piantino, the engineering manager who oversaw the development of Facebook’s Timeline: “When you work at Facebook you get to have an impact on the world. And you get to see that impact not in some distant part of the world but among your friends and family.” When I interviewed Piantino, there was no question that he was proud of both his work and the company he worked at.

Big Data Use Cases Are Many

There are many use cases for Big Data that have nothing to do with getting people to click on ads. ClickFox, for example, has used Big Data and analytics to give a major telecommunications company the insight it needed to assist its customers effectively over the web, rather than leaving them to dial a toll-free number because the internet-based “help yourself” service was so ineffective. The analysis revealed not only that 1,200 of the company’s online articles needed to be rewritten, but also exactly where those articles were failing. It also pointed out which of the company’s device tutorials frustrated its customers. As a result, in a single year, nearly two million calls were deflected from live agents and $10 million was saved.

EMC Greenplum worked with one of its automobile insurance clients to help it set rates for drivers more accurately. The advent of Big Data and analytics tools now allows the company to use a larger number of data sets to assess risk. Not only are traditional variables like age of driver, make of car, career history and driving records taken into consideration, but data gathered from telemetric devices (which drivers seeking insurance voluntarily put in their cars so that data can be gathered as they drive) and from social media sites like Facebook, LinkedIn and Twitter is also factored into the equation. As a result, a twenty-two-year-old female might now be charged less than her fifty-year-old tenured executive father who tweets about his antics as he tries to resurrect his youth during his decade-long midlife crisis. (It's interesting to note that once an insurance rate has been set, data flows from Greenplum to Documentum xCP, where the policy is then created.)

In the next article we’ll continue to look at examples of how Big Data and Data Scientists are reshaping the way we see the world, the way we interact with the world, and the way the world sees us.