computer with streams of code
PHOTO: Markus Spiske

The fallout from the Facebook data harvesting scandal continued last week when Cambridge Analytica, the company accused of harvesting data from personal Facebook accounts to inform Donald Trump’s presidential campaign, announced it was shutting down.

In a statement, London-based Cambridge Analytica said it was “immediately ceasing all operations” in both the UK and US. The company said it had lost clients due to the controversy that erupted in March, making it “no longer viable to continue operating the business.”

However, the company insists what it was doing was entirely legal and claims the downturn in its fortunes was largely a result of negative media reporting. The statement continued:

“Over the past several months, Cambridge Analytica has been the subject of numerous unfounded accusations and, despite the Company’s efforts to correct the record, has been vilified for activities that are not only legal, but also widely accepted as a standard component of online advertising in both the political and commercial arenas.”

Data Harvesting Hits Facebook

Here is the crux of the problem. Harvesting customer data, for the moment at least, is not illegal. After May 25 and the introduction of the General Data Protection Regulation (GDPR) in Europe, the rules around the harvesting and use of information gathered from citizens of the European Union will get a lot tighter, and companies on both sides of the Atlantic will, despite efforts over recent months, struggle to observe the law.

In the meantime, the other victim in the Cambridge Analytica controversy is Facebook — if Facebook can ever be described as a victim. And while it is impossible to gauge the long-term impact of the scandal on Facebook, the short-term impact is a little bit clearer.

Facebook search queries
Search volumes for Facebook. PHOTO: SEMrush

SEMrush, the Trevose, Pa.-based builder of a digital marketing toolkit for search engine optimization (SEO), pay-per-click and content marketing professionals, shared some internal SEO research with us. It analyzed how the Cambridge Analytica news affected user searches in Google, looking at the number of searches for Facebook, searches related to deleting Facebook, and how the #deletefacebook hashtag emerged in March.

#deletefacebook
The graph with #deletefacebook, which coincides with the time when media were flooded with news on Facebook's data leak. PHOTO: SEMrush


The results show significant spikes in all graphs. SEMrush also looked at the most frequent queries containing "Facebook" in March. What it found was a surge in US users searching for ‘how to delete Facebook,’ at five times the rate of a year ago. "There is also a spike on the graph with #deletefacebook, which coincides with the time when media were flooded with news on Facebook's data leak. It's intriguing, how the situation in Google searches changes in the nearest months," Olga Andrienko, head of global marketing at SEMrush, said.

search query on how to delete facebook
US users looked for ‘how to delete Facebook’ in March 2018 five times more often than a year ago. PHOTO: SEMrush

What Other Companies Are Data Harvesting?

Though Facebook is at the center of the tempest, it should not be forgotten that other companies, particularly large tech companies, are also harvesting information, but appear to have (for now) escaped public outrage as people focus their anger on Facebook. A recent article by Christopher Mims in the Wall Street Journal (paywall), for example, cited security sources who claimed Google likely has so-called shadow profiles of users to at least the same level of specificity as Facebook.

In fact, given Google’s stranglehold on search and email, as well as its other apps like Maps and YouTube, it is possible for Google to build profiles that cover a person's entire waking and sleeping life. It is impossible to quantify how much information it harvests, but an investigation by the British newspaper The Mail on Sunday on April 21 uncovered some uncomfortable facts. A researcher with the newspaper found Google had logged every journey he had made in the past four years, noting how he travelled: by public transport, on foot, by car or by bicycle. It even had records of a funeral he attended and a hospital appointment.

The author calculated that if he printed out the information Google had gathered on him over the last 12 months on standard A4 pages, he would have a stack that was 189 feet high. That amounts to a stack of 7 feet, 9 inches every two weeks from his data alone.

Given its size and reach, Google is clearly an extreme example, but it seems that just about every company that is interacting with the public online is harvesting personal information. Tim Lynch, president of VR gaming computer provider Psychsoftpc, who has a Ph.D. in the psychology of computers from Boston University, put it bluntly: once you go online, your data is going to be harvested.

“Everybody is at it,” he said. “Twitter, LinkedIn, Microsoft, insurance companies, supermarkets, drug stores, just about anybody with a web site or loyalty card or safe driver program or social media site. Microsoft collects data from Bing searches, Edge browser use and Cortana, Google collects data from searches, Chrome Browser use, Gmail, Google Apps, Android phones and Chromebooks.”

Unless companies can aggregate the data, though, it's not much use. But that issue has gone away, he said: “With big data, supercomputers and artificial intelligence, companies can merge data sources and get a complete profile of people including likes, health, habits, schedules, routines and problems which can be used to deny insurance, target market, manipulate, influence, profile and surveil.”

Degrees of Data Harvesting 

Some would argue that brick-and-mortar stores have been gathering data on people since before the web, but there is a difference between those practices and what data-harvesting tech companies are doing, according to Chirag Shah, associate professor of information sciences at Rutgers University.

“Even our supermarkets do that [data harvesting] to learn about our shopping habits and preferences. So, what’s so different about the online and social media world? A supermarket won’t know my age, where I work, and who I talk to outside of the store,” he said. “These things are possible to collect and/or learn in the online environment, often with very little effort. Recently, MoviePass CEO accidentally slipped out how it tracks users to collect their data beyond the app and hopes to make money off of it."

Researchers also harvest personal data, he noted, but with a significant caveat. “As a researcher, I also harvest data from users, but I do that by obtaining permissions from my Institutional Review Board (IRB), our ethics board, as well as the consents from those users. Where things go wrong is when people do data collection/harvesting without properly informing the users."

Scott Relf and Renee Relf are co-founders of PikMobile, a dual-function mobile platform that allows users to share content through a viewing platform. Scott Relf said it is possible, and even likely, that Facebook doesn’t know everyone who has been harvesting data from its platform. "The harvesting of Facebook data was relatively common when the Cambridge Analytica data issue took place. It is very likely that many other third parties have done the same thing, and Facebook either doesn’t know who did it, or they do not want to disclose who did,” he said. “To be fair, Facebook has been clear that they have taken steps to prevent additional such events in the future. However, it isn’t possible to put the genie back in the bottle; that data that was harvested is already in the possession of those third parties."

Facebook's First Step

The recent decision by Facebook to close Partner Categories is one of those steps. Partner Categories, which it launched in 2013, is a targeted online advertising tool that uses data from third-party sources, such as Datalogix and Epsilon. By using Partner Categories, marketers could be more specific when targeting ads by using both online and offline information, such as a user’s shopping behavior. On March 28, Facebook issued a brief statement informing advertisers that third-party data would no longer be available. The complete statement read:

“We want to let advertisers know that we will be shutting down Partner Categories. This product enables third-party data providers to offer their targeting directly on Facebook. While this is common industry practice, we believe this step, winding down over the next six months, will help improve people’s privacy on Facebook.” 

Those data partners included Acxiom, CCC Marketing, Epsilon, Experian, Oracle Data Cloud (formerly Datalogix) and Quantium, among others.

'An Important Discussion to Have'

Though data harvesting has been widespread in the past, things are changing. Apart from the technology changes, there is also a change in mindset, and tech companies have been quick to pick up on it. In the recent Alphabet annual shareholder letter, Google co-founder and Alphabet president Sergey Brin said that tech companies had to take more responsibility for their actions.

“We’re in an era of great inspiration and possibility, but with this opportunity comes the need for tremendous thoughtfulness and responsibility as technology is deeply and irrevocably interwoven into our societies,” Brin wrote after quoting Charles Dickens’ novel “A Tale of Two Cities.” He concluded by pointing out that advances in technology are now a global affair with global consequences that tech companies need to address.

“… these advances, including the internet and mobile devices, have created opportunities and dramatically improved the quality of life for billions of people. However, there are very legitimate and pertinent issues being raised, across the globe, about the implications and impacts of these advances. This is an important discussion to have.”