Back in the day, enterprise search had it easy. Before Google, enterprise search gave people a way to find content on the corporate intranet, and it was amazing.
“The Acme contract? Just search, it will show up.”
Users were happy just to find a copy of the contract, even if it wasn’t the final version or on the first page of results. But along came Google and other online search engines, and that pretty much ruined it for enterprise search.
What Does Google Have That Enterprise Search Doesn't?
So why is web search so great while enterprise search still struggles?
In my experience, poor data quality almost always contributes to the problem, if it isn’t fully responsible for it.
Part of the answer is pure volume of activity. If thousands of people perform a specific query, it’s possible to use information on which results were viewed as a "signal." When the signal is fed back into the search engine, results improve over time.
Combining that pure volume of query activity with ‘big data’ tools such as Apache Spark and its machine learning libraries produces reasonably good results that improve over time. But the tools don't create the magic here: it’s the volume of data the tools can process. The more data, the better the results.
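To make the feedback loop concrete, here is a minimal sketch in Python. It assumes a hypothetical click log of (query, document) pairs and a result list that already carries a relevance score from the search engine. The names CLICK_LOG, build_click_counts and rerank, and the weight of 0.5, are made up for illustration; real platforms expose signals in their own ways.

```python
from collections import Counter, defaultdict
import math

# Hypothetical click log: (query, document the user opened).
CLICK_LOG = [
    ("acme contract", "acme-contract-final.docx"),
    ("acme contract", "acme-contract-final.docx"),
    ("acme contract", "acme-contract-draft.docx"),
    ("acme contract", "acme-contract-final.docx"),
]


def build_click_counts(click_log):
    """Count how often each document was opened for each query."""
    counts = defaultdict(Counter)
    for query, doc_id in click_log:
        counts[query.lower()][doc_id] += 1
    return counts


def rerank(query, results, click_counts, weight=0.5):
    """Add a click-based boost to each (doc_id, base_score) pair and re-sort.

    The log dampens the boost so a popular document cannot drown out
    the engine's own relevance score. The weight is an assumed value.
    """
    clicks = click_counts.get(query.lower(), Counter())
    boosted = [
        (doc_id, base + weight * math.log1p(clicks[doc_id]))
        for doc_id, base in results
    ]
    return sorted(boosted, key=lambda pair: pair[1], reverse=True)
```

With only four clicks in the log, the final version of the contract already moves ahead of a draft that the engine scored slightly higher. With thousands of clicks, the signal becomes far more trustworthy, which is the point about volume.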
Follow the Search Trail
What data are we talking about here?
When it comes to search, it starts with the query. But it also includes the action a user takes after receiving the list of search results. For example: what documents did the user view? What position in the result list was the first result the user viewed? How long did it take the user to come back and look at other results or even start a new query? And what was the next query and the user behavior after that?
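One way to picture that trail is as a small record per query. The sketch below is an assumption about how such a record might look, not the schema of any particular product: the SearchEvent and Click classes and their field names are hypothetical. It computes two of the measures mentioned above, the position of the first result viewed and the time it took to get there, plus a simple average over many queries.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Click:
    doc_id: str
    position: int                # 1-based rank in the result list
    seconds_after_query: float   # time between the query and this click


@dataclass
class SearchEvent:
    user_id: str
    query: str
    clicks: List[Click] = field(default_factory=list)
    next_query: Optional[str] = None  # what the user searched for next, if anything


def first_click(event: SearchEvent) -> Optional[Click]:
    """Return the earliest click for a query, or None if nothing was opened."""
    if not event.clicks:
        return None
    return min(event.clicks, key=lambda c: c.seconds_after_query)


def mean_reciprocal_rank(events: List[SearchEvent]) -> float:
    """Average of 1/position of the first click; queries with no click score 0."""
    if not events:
        return 0.0
    total = 0.0
    for event in events:
        click = first_click(event)
        if click is not None:
            total += 1.0 / click.position
    return total / len(events)
```

A value near 1.0 suggests users usually open the top result; a value that drifts toward 0 suggests they are scrolling, giving up, or rewording the query.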
This type of processing is what enables sites like Google and Amazon to claim that “people like you” found a page useful or a product interesting. All of the tools needed to gather this data are available in enterprise search products today — although some platforms require more assembly than others.
Here again we return to the pure volume of activity. Google and Amazon track everything users do on their sites. They can reliably claim “people like you” like a product or result because they have seen thousands, or millions, of people who searched for what you searched for and viewed the same results you did, and they know which page those people generally viewed next or which product they bought.
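A bare-bones version of “people like you” is just counting which documents were viewed by the same users. The sketch below assumes a hypothetical list of (user, document) views and a made-up min_support threshold; it is an illustration of the idea, not how Google or Amazon actually do it.

```python
from collections import Counter, defaultdict


def also_viewed(views, doc_id, min_support=2):
    """Return (other_doc, user_count) pairs for users who also viewed doc_id.

    views is an iterable of (user_id, doc_id) pairs. Suggestions backed by
    fewer than min_support users are dropped as too unreliable.
    """
    docs_by_user = defaultdict(set)
    for user, doc in views:
        docs_by_user[user].add(doc)

    co_counts = Counter()
    for docs in docs_by_user.values():
        if doc_id in docs:
            co_counts.update(docs - {doc_id})

    return [(doc, n) for doc, n in co_counts.most_common() if n >= min_support]
```

The threshold is where volume bites: a public site can demand thousands of supporting users, while an intranet with a few hundred employees may struggle to find two.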
Enterprise search lacks the volume to deliver the same results. Hence the perennial question: “Why can’t our search be like Google?”
But Google has another advantage over enterprise search: data quality. Companies often employ teams of people to manage their presence on Google and other platforms. That work has even spawned its own industry: search engine optimization.
Make Metadata Work For You
The good news is enterprise search should be easy. Think about it: when you create content, do you think about ‘tricking’ your enterprise search platform with inaccurate metadata? Probably not.
Nonetheless, most intranet content suffers from poor metadata.
So, what can you do to help ensure your search platform is more useful and has better content?
First, when you create content, make sure every piece of metadata adds value. For example, look at the file and directory names where you save files. Saving the “Acme proposal” in the “Bravo Steamworks” drive folder won’t help find it again.
Next, verify your name appears in the Author property field. Not that of an associate or the person whose laptop you inherited when he got fired.
If you want to add your own fields to your documents, and your organization uses Microsoft Office products, you’ll find a little-known capability under the ‘Properties’ menu. Properties displays a list with a number of additional fields that search platforms generally index.
But Office documents also allow you to define custom fields. If you want to use fields like ‘Region’ or ‘Vendors’ in your documents to make them easier to find, simply add each field as a Custom Property and assign it a value. And if your default template document already uses custom fields, you’re already good to go!
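If you are curious what a search platform actually sees, you can peek at these fields yourself. Modern Office files (.docx, .xlsx, .pptx) are zip packages, and the author and custom properties normally live in docProps/core.xml and docProps/custom.xml. The Python sketch below reads them using only the standard library. The function names read_office_metadata and missing_fields are hypothetical, and the sketch assumes the newer XML-based formats, not legacy .doc files.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the Office Open XML property parts.
DC_NS = "http://purl.org/dc/elements/1.1/"
CUSTOM_NS = "http://schemas.openxmlformats.org/officeDocument/2006/custom-properties"


def read_office_metadata(path_or_file):
    """Return (author, custom_properties) for a .docx, .xlsx or .pptx file."""
    author, custom = None, {}
    with zipfile.ZipFile(path_or_file) as package:
        names = set(package.namelist())

        # The Author field is stored as dc:creator in the core properties.
        if "docProps/core.xml" in names:
            core = ET.fromstring(package.read("docProps/core.xml"))
            creator = core.find(f"{{{DC_NS}}}creator")
            if creator is not None:
                author = creator.text

        # Each custom property has a name and one typed child holding the value.
        if "docProps/custom.xml" in names:
            root = ET.fromstring(package.read("docProps/custom.xml"))
            for prop in root.findall(f"{{{CUSTOM_NS}}}property"):
                value = next(iter(prop), None)
                custom[prop.get("name")] = value.text if value is not None else None
    return author, custom


def missing_fields(custom, required):
    """List the required custom fields that are absent or empty."""
    return [name for name in required if not custom.get(name)]
```

Calling read_office_metadata on a proposal and then missing_fields(custom, ["Region", "Vendors"]) gives a quick check of whether a document carries the fields you expect before it ever reaches the index.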
Garbage In, Garbage Out
The old adage ‘garbage in, garbage out’ applies to this day, particularly in enterprise search. If you create content that you hope to reliably find again, play your role to ensure your search works right without any guessing games: use the metadata!