Algorithms That Respect Your Customer's Privacy

Mobile devices have become integral to our lives: always with us, keeping our physical selves plugged in to the digital world.

As we increase our consumption of data on the move, so too do we increase the amount of personal information we transmit. Devices increasingly include functionality such as tracking fitness, health and well-being, which means more personal data elements, such as weight and physical activity, are being shared across apps.

Device manufacturers and operating systems work to keep personal information private, but in a quickly changing ecosystem, tying up every loose end gets complicated. And that opens the door for malicious agents to build compromising user profiles.

Given this context, protecting user privacy is just good corporate responsibility. This applies to any company with access to large volumes of user information, such as ad networks or popular apps.

Implementing algorithms that inherently protect user privacy will result in better systems and user experiences overall, while increasing relevance and value to the user.

The Problem with Granularity

The growing volume of data caused by the increasing granularity of information available on mobile brings with it two issues: handling scale and understanding the noise that can come from over-specification.

Infrastructure solutions to handle scale are straightforward. However, any knowledge-based activity involving human touchpoints can dramatically break at even modest scales.

Consider the case of classifying a user as an “upscale” or “budget” buyer based on the places she shops. If one were to do this by hand, it would become intractable for as few as a thousand users. The next approach might be to classify shops into these buckets and automatically assign users based on their shopping trends. However, even this method fails once the number of shops increases beyond a certain scale.

Granular data makes interpretation difficult by mixing in noise with the signal.

By designing systems that can handle both scale and noise, privacy protection can be built in from the start. Removing all human intervention in data processing allows easy scaling while making for greater privacy. Clustering information removes noise and, at the same time, provides anonymization.

Two Machine Learning Models

To understand this further, consider the two classes of machine learning models: supervised and unsupervised.

Supervised

In supervised models, the system is trained on historical outcomes and a set of features associated with each outcome. In user-based modeling, these features could include user characteristics and behavioral attributes. Fortunately, we can replace specific information with abstract labels and do just as well.

For example, if a user has purchased a laptop and a camera but not a TV, we could replace that with an abstract vector that looks like (A:1,B:1,C:0), with no need to specify what A, B and C stand for. Techniques such as principal component analysis and information entropy reduction automatically evaluate how relevant each of these features is to predicting the desired outcome, so we can weight them appropriately.

Each feature vector becomes a fingerprint that describes the user without revealing any personal information.
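The idea can be sketched in a few lines of Python. This is a minimal illustration, not a production method: the function names, the sample catalog and the toy outcomes are all assumptions made for the example. It maps purchases to opaque labels and then scores each label by how much it reduces the entropy of the outcome.

```python
import math
from collections import Counter

def anonymize(purchases, catalog):
    """Map product names to opaque labels A, B, C, ... and emit a 0/1 vector.

    Hypothetical helper; single-letter labels limit this sketch to 26 items.
    """
    labels = [chr(ord("A") + i) for i in range(len(catalog))]
    return {label: int(item in purchases) for label, item in zip(labels, catalog)}

def entropy(outcomes):
    """Shannon entropy, in bits, of a list of outcomes."""
    total = len(outcomes)
    return -sum((n / total) * math.log2(n / total) for n in Counter(outcomes).values())

def information_gain(rows, outcomes, feature):
    """Entropy reduction in the outcome after splitting on one binary feature."""
    gain = entropy(outcomes)
    for value in (0, 1):
        subset = [o for r, o in zip(rows, outcomes) if r[feature] == value]
        if subset:
            gain -= len(subset) / len(outcomes) * entropy(subset)
    return gain

# The example from the text: laptop and camera, but no TV.
vector = anonymize({"laptop", "camera"}, ["laptop", "camera", "tv"])

# Toy training data (invented): label A tracks the outcome, label B does not.
rows = [{"A": 1, "B": 1}, {"A": 1, "B": 0}, {"A": 0, "B": 1}, {"A": 0, "B": 0}]
outcomes = [1, 1, 0, 0]
gain_a = information_gain(rows, outcomes, "A")
gain_b = information_gain(rows, outcomes, "B")
```

In this toy data, label A carries all of the predictive signal and label B none, and the model never needs to know which products the labels stand for.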

Unsupervised

In unsupervised models, algorithms create abstract clusters based on shared characteristics. If the user is described using anonymous feature vectors, then these clusters are in an abstract many-dimensional space, which prevents typecasting users into common stereotypes.
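As a rough sketch of what clustering anonymous vectors can look like, here is a tiny k-means written in plain Python. K-means is only one possible choice of clustering algorithm, picked here for brevity. The function names, the sample vectors and the decision to seed centroids with the first k vectors are assumptions for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iterations=10):
    """Tiny k-means sketch.

    Assumption: the first k vectors seed the centroids, which keeps the
    example deterministic. Real systems usually use a smarter initialization.
    """
    centroids = [list(v) for v in vectors[:k]]
    assignment = []
    for _ in range(iterations):
        # Assign each vector to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids

# Anonymous 0/1 feature vectors; what each position means is never stated.
users = [(1, 1, 0), (0, 0, 1), (1, 1, 1), (0, 0, 0)]
assignment, centroids = kmeans(users, k=2)
```

The cluster IDs that come out are just integers. Nothing in the output says what kind of user belongs to cluster 0 or cluster 1.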

Techniques like deep learning take this a step further by automating feature extraction from raw data and creating fingerprints that are not describable.

Noise reduction techniques include creating small groups that are likely to behave consistently. For example, users belonging to the same nine-digit ZIP code (also referred to as ZIP+4) are commonly grouped together. Creating these groups makes prediction analysis more robust by being specific enough without dilution, and at the same time provides anonymity for users.
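A small Python sketch of this kind of grouping follows. The function name, the sample records and the minimum group size are assumptions for illustration; the threshold in particular is an addition for the example, meant to show how very small groups might be withheld so that an average cannot point back to one person.

```python
from collections import defaultdict

def aggregate_by_group(records, min_group_size=3):
    """Average a numeric value per group key, dropping groups that are too small.

    `records` is an iterable of (group_key, value) pairs. The min_group_size
    threshold is an assumption of this sketch, not something from the article.
    """
    groups = defaultdict(list)
    for group_key, value in records:
        groups[group_key].append(value)
    return {
        key: sum(values) / len(values)
        for key, values in groups.items()
        if len(values) >= min_group_size
    }

# Invented sample data keyed by ZIP+4.
records = [
    ("94105-1234", 10.0),
    ("94105-1234", 20.0),
    ("94105-1234", 30.0),
    ("10001-0001", 99.0),  # a group of one, so it is withheld
]
group_averages = aggregate_by_group(records)
```

Downstream models then see one value per group rather than one value per user.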

Most mobile devices have the processing power of small computers and can be used to implement binning. Using a distributed, map-reduce paradigm, the device can map personal information into abstract buckets and share only the aggregates with the ecosystem for the reduce operation.
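One way to picture the split between device and ecosystem is the sketch below. The bin boundaries, bucket labels and function names are hypothetical. The map step is meant to run on the device, so the raw reading never leaves it, and the reduce step only ever sees bucket labels.

```python
from collections import Counter

# Hypothetical weight ranges in kilograms, each mapped to an abstract label.
WEIGHT_BINS = [(0, 60, "W1"), (60, 80, "W2"), (80, 1000, "W3")]

def device_map(weight_kg):
    """Runs on the device: turn a raw reading into an abstract bucket label."""
    for low, high, label in WEIGHT_BINS:
        if low <= weight_kg < high:
            return label
    return "UNKNOWN"

def reduce_counts(bucket_labels):
    """Runs off the device: combine bucket labels into aggregate counts."""
    return Counter(bucket_labels)

# Each device reports only its label; the raw weights stay local.
labels = [device_map(w) for w in (55, 72.5, 78, 91)]
totals = reduce_counts(labels)
```

Wider bins give more anonymity at the cost of precision, so the bin widths are a design choice that depends on the use case.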

Exploration or Exploitation

These techniques help protect user privacy by keeping data out of human-readable form, but they require careful design to prevent the creation of filter bubbles (only seeing what you have liked before) and personal recommendations in bad taste (for example: “Are you overweight?” ads targeted at recently engaged women).

Good design entails a balance between showing users new content (exploration) and bombarding them with known material (exploitation). As with all businesses, careful curation of content is critical to maintaining user experience.
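A common, simple way to strike this balance is an epsilon-greedy rule, shown below as an illustrative sketch. The article does not name a specific method, so the technique, the function name and the sample reward figures are all assumptions. With probability epsilon the system shows a random item, and otherwise it shows the item with the best average response so far.

```python
import random

def choose_item(avg_reward, epsilon=0.1, rng=random):
    """Epsilon-greedy choice over a dict of item -> average observed reward.

    `avg_reward` might hold click-through rates per content category
    (hypothetical). `rng` can be the random module or a random.Random instance.
    """
    if rng.random() < epsilon:
        # Exploration: show any item, including ones the user has not liked yet.
        return rng.choice(sorted(avg_reward))
    # Exploitation: show the item with the best average reward so far.
    return max(avg_reward, key=avg_reward.get)

# Invented average rewards for three abstract content categories.
rewards = {"A": 0.10, "B": 0.45, "C": 0.20}
always_exploit = choose_item(rewards, epsilon=0.0)
always_explore = choose_item(rewards, epsilon=1.0, rng=random.Random(7))
```

Raising epsilon pushes the system toward new content and away from filter bubbles, while lowering it favors what is already known to work.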

Systems that handle scale and noise robustly also protect user privacy as a matter of course. Taking it one step further, handling exploitation versus exploration efficiently creates rich user experiences while preventing algorithmic stereotyping.

Title image "Shy" (CC BY 2.0) by Tom Edgington
