It’s time we use context to determine intranet result relevancy PHOTO: Austin Neill

Have you ever had a “Google moment”?

I was driving to my neighborhood supermarket on a recent Saturday when I received a Google alert on my phone: “Safeway is five minutes away in light traffic.”

Let's be clear: I had not fired up Google Maps for directions. Google volunteered the information, without any explicit action on my part. 

Apparently Google had learned from months of observation that on Saturday afternoons, I — and presumably people like me — shop for groceries. And, again from observation, Google knows that I shop specifically at Safeway.

Ignore the flashbacks to George Orwell’s “1984.” What Google has done is simply amazing: it is the ultimate consumer of context.

Google can deliver accurate results not because it reads our minds, but because it does an incredible job collecting context from any source possible. Like Santa, Google “knows when you’ve been sleeping, it knows when you’re awake ….”

What Clues Do Your Intranets Reveal?

Has anyone ever reported a similar story about your enterprise (or web-based) search? I’m betting the answer is a resounding “no.”

It’s time we use context to determine intranet result relevancy. Your intranets have an incredible amount of context available for you to use. Your task is to identify the context that makes sense, and then use it for every query. 

In fact, I’d wager that you are already using at least one type of context: security level. Managers can probably see content that first-line employees are not able to view. 
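
As a rough illustration of security level as context, the sketch below filters results by the searcher's clearance. The `min_level` field and the numeric levels are assumptions made for this example, not features of any particular search product.

```python
def security_trim(results, user_level):
    """Keep only documents the user is cleared to see.

    `results` is a list of dicts with a hypothetical `min_level` field:
    the lowest security level allowed to view the document.
    """
    return [doc for doc in results if doc["min_level"] <= user_level]


docs = [
    {"title": "Holiday schedule", "min_level": 0},
    {"title": "Salary bands", "min_level": 2},
]

# A first-line employee (level 0) sees one document, a manager (level 2) sees both.
employee_view = security_trim(docs, user_level=0)
manager_view = security_trim(docs, user_level=2)
```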

What kind of context might be helpful? The most common will include people, places, queries and viewed documents. 

If a user is logged in, you can identify their job title, home office and current location, co-workers, managers and people who report to them, their seniority, a history of previous queries and documents viewed, current and past accounts, suppliers and employers and more. 

You also can tell which public and internal sites your user visits. 

Think of how you could use this information as a clue for query understanding.
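
One way to picture this is a small context record that travels with every query. The sketch below is only an illustration: the `UserContext` fields and the shape of the request dictionary are hypothetical, and a real search platform would have its own way of accepting boosts.

```python
from dataclasses import dataclass, field


@dataclass
class UserContext:
    """Hypothetical per-user context; field names are illustrative only."""
    job_title: str
    home_office: str
    recent_queries: list = field(default_factory=list)


def build_search_request(query, ctx, history_size=3):
    """Attach context clues to a raw query so the engine can use them."""
    return {
        "query": query,
        # Terms the engine could favor when ranking results.
        "boost_terms": [ctx.job_title, ctx.home_office],
        # The most recent queries hint at what the user is working on.
        "recent_queries": ctx.recent_queries[-history_size:],
    }


ctx = UserContext(
    job_title="account manager",
    home_office="Denver",
    recent_queries=["q3 forecast", "acme contract", "travel policy", "expense form"],
)
request = build_search_request("renewal terms", ctx)
```

The same two-word query now carries the searcher's role, office and recent activity, which is the raw material for query understanding.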

How to Read the Signals

Companies are increasingly recognizing the importance of reading into real-time actions for the clues they provide. And while intranet logs may not fit the classic big data model, the analytics for real-time use do. 

Spark, with or without Hadoop, is more appropriate for near real-time analytics than Hadoop MapReduce. MapReduce runs batch jobs that write intermediate results to disk between steps, while Spark keeps working data in memory and can process incoming events in small batches, which is much better suited for these "streaming" data points. 

When combined with machine-learning libraries like Apache Mahout or Spark's MLlib, you can automate all of these complex steps in a relatively simple, high-performance tool.

Companies like Elastic, the company behind Elasticsearch, which just acquired behavioral analytics company Prelert, and Lucidworks, which ships Spark with its Fusion platform, are beginning to understand the importance of these signals, or as Lucidworks calls them, "event signals." 

What's a signal? It depends: what do you want it to be? 

Separating the Important Signals From the Unimportant

Signals are typically events triggered by an action, but they may be triggered by other data about the person searching. 

Viewing a product is a signal for ecommerce. The site uses it as a clue that you may be interested in similar products. Purchasing the item is a stronger signal: not only are you interested in the product, you spent money to acquire it. And you'll likely see more products similar or related to your purchase in the future. 
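
The idea that a purchase counts for more than a view can be sketched as a weighted sum over events. The weights below are arbitrary assumptions chosen for the example; in practice they would be tuned against real behavior.

```python
from collections import defaultdict

# Assumed weights: a purchase is treated as five times stronger than a view.
SIGNAL_WEIGHTS = {"view": 1.0, "purchase": 5.0}


def interest_scores(events):
    """Sum weighted signals per item for one user.

    `events` is a list of (item_id, signal_type) pairs.
    Unknown signal types contribute nothing.
    """
    scores = defaultdict(float)
    for item_id, signal_type in events:
        scores[item_id] += SIGNAL_WEIGHTS.get(signal_type, 0.0)
    return dict(scores)


events = [("tent", "view"), ("tent", "purchase"), ("stove", "view")]
scores = interest_scores(events)
```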

On an intranet, signals may be based on behavior and queries, as they are on the ecommerce site, but there’s a good chance the metadata about you — your office location or role — may be at least as important.  

Not all signals, even strong ones, are valid. But until and unless an intranet search user leaves or changes roles, those signals will be helpful.

Perhaps a current employee once worked at a company that is now a sale prospect: would this be helpful? Employees in a remote field office tend to look at a subset of HR forms: could you use this information to boost relevance for these forms for other remote office queries? 
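
The remote office example could be sketched as a re-ranking step: documents that people in the same kind of office open often get a lift. Everything here is an assumption for illustration, including the `click_counts` layout, the office labels and the logarithmic boost formula.

```python
import math


def rerank(results, office_type, click_counts, boost=0.5):
    """Re-order results using clicks from users in the same office type.

    `results` is a list of (doc_id, base_score) pairs.
    `click_counts` maps (office_type, doc_id) to a click total.
    The log keeps heavily clicked documents from swamping base relevance.
    """
    def boosted(item):
        doc_id, base_score = item
        clicks = click_counts.get((office_type, doc_id), 0)
        return base_score * (1 + boost * math.log1p(clicks))

    return sorted(results, key=boosted, reverse=True)


results = [("expense-form", 1.0), ("relocation-form", 0.9)]
click_counts = {("remote", "relocation-form"): 20}

remote_order = [doc for doc, _ in rerank(results, "remote", click_counts)]
hq_order = [doc for doc, _ in rerank(results, "hq", click_counts)]
```

In this toy data, the relocation form rises to the top for remote office users while the ranking for headquarters users is unchanged.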

The real question is how do you know which signals are important? That’s another “it depends” question: as with so many things, hindsight will be much better than foresight. Still, it’s the first step to learning. And soon, you may be able to suggest “People like you also searched for ...” for your intranet users.
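
A first cut at "People like you also searched for ..." can be as simple as counting queries from users who share a role. This sketch assumes a query log of (role, query) pairs, which is a hypothetical format; a fuller version would also define "like you" by office, team or past behavior.

```python
from collections import Counter


def also_searched(query_log, role, exclude=(), top_n=3):
    """Return the most frequent queries issued by users with the given role.

    `query_log` is an iterable of (role, query) pairs.
    `exclude` holds queries to leave out, such as the user's current one.
    """
    counts = Counter(
        query for log_role, query in query_log
        if log_role == role and query not in exclude
    )
    return [query for query, _ in counts.most_common(top_n)]


log = [
    ("sales", "price list"),
    ("sales", "price list"),
    ("sales", "crm login"),
    ("hr", "benefits"),
]
suggestions = also_searched(log, "sales", exclude={"crm login"})
```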

With the tools available today, it’s far easier to consider all of the context you can access. Over time, as you gather this context, you can decide which signals are the important ones. Roll those back into your search index, and you’re on the way to better results and more satisfied users.