
By emulating common web SEO tactics, you can improve the quality of your enterprise search results.

To start, you'll need to know the search terms your users enter on your site. Then you reverse engineer the intent behind that search to understand what the searcher was looking for.

Once you have the top queries and their synonyms, you can start improving your results.

Identifying 'Best Answers'

Start small: focus on the top 10 queries, which typically will account for 25 percent of your searches. In fact, start with the top queries you have identified using the methodology described in my previous article.
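As a rough illustration of finding those top queries, the sketch below counts queries from a search log and reports what share of all searches the top few cover. The log format (a plain list of query strings) and the function name `top_queries` are assumptions made for this example, not something your search platform necessarily provides.

```python
from collections import Counter

def top_queries(queries, n=10):
    """Return the n most frequent queries and the share of all searches they cover."""
    # Normalize case and whitespace so "Vacation Policy " and "vacation policy" count together.
    normalized = [q.strip().lower() for q in queries if q.strip()]
    counts = Counter(normalized)
    top = counts.most_common(n)
    covered = sum(count for _, count in top)
    share = covered / len(normalized) if normalized else 0.0
    return top, share

# Hypothetical log entries, for illustration only.
log = ["vacation policy", "Vacation Policy", "expense report",
       "vpn", "vacation policy ", "benefits"]
top, share = top_queries(log, n=2)
print(top[0], f"{share:.0%} of searches covered by the top 2")
```

Running this against your own logs shows whether your top 10 really do cover about a quarter of searches.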

Given the query, its synonyms, and the user intent you’ve identified, work with content owners and subject matter experts in your organization to identify the most likely "best answer" for the query.

If your search engine has a way to define best bets, you’ve got it easy: simply specify the top result for the term as the best bet.

If your search platform, like most, doesn’t provide a way to specify best bets for specific queries, you need to game your search engine.

Let’s call the top result for a search the "current top," and the document you've identified as the best bet the "likely candidate." Your task now is to identify what the current top contains that causes it to rank above your likely candidate.

Most search platforms calculate relevance using a technique called “TF/IDF,” which stands for “term frequency/inverse document frequency.” Roughly, that means a small document in which a word occurs frequently will rank higher for that word than a large document with the same number of instances, and a word that appears in only a few documents counts for more than a word that appears in most of them. You can find an excellent detailed description of TF/IDF here.
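To make the idea concrete, here is a minimal sketch of one textbook form of TF/IDF. It assumes documents are simple lists of words and uses term count divided by document length for TF and a plain logarithm for IDF. Real search engines use tuned variants, so treat this as an illustration rather than your platform's actual formula.

```python
import math

def tf_idf(term, doc_tokens, corpus):
    """Score one term for one document using a simple textbook TF/IDF."""
    if not doc_tokens:
        return 0.0
    # Term frequency: occurrences relative to document length.
    tf = doc_tokens.count(term) / len(doc_tokens)
    # Document frequency: how many documents contain the term at all.
    df = sum(1 for d in corpus if term in d)
    # Inverse document frequency: rarer terms get a larger weight.
    idf = math.log(len(corpus) / df) if df else 0.0
    return tf * idf

# Hypothetical documents: both mention "vacation" twice, but one is much longer.
short_doc = "vacation policy vacation request form".split()
long_doc = ("vacation " * 2 + "filler " * 48).split()
other_doc = "expense report template".split()
corpus = [short_doc, long_doc, other_doc]

print(tf_idf("vacation", short_doc, corpus) > tf_idf("vacation", long_doc, corpus))
```

The short document scores higher because the same two occurrences make up a larger fraction of its text.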

Compare your current top document to your candidate document. Overload your candidate document with the query term and synonyms. You can do this without changing the actual text of the document by inserting the term and its synonyms multiple times in the document's "properties" fields. I like to use the "keywords" field, but use what seems right to you.
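The tagging step might look something like the sketch below. It assumes a document is represented as a dictionary with a "keywords" list, which is a stand-in for whatever properties or metadata mechanism your content system actually exposes. The function name `overload_keywords` is made up for this example.

```python
def overload_keywords(doc, terms, repeats=3):
    """Return a copy of doc with each term added `repeats` times to its keywords field."""
    updated = dict(doc)
    existing = list(doc.get("keywords", []))
    # Repeat each term so it occurs more often in the field; the body text is untouched.
    updated["keywords"] = existing + [t for t in terms for _ in range(repeats)]
    return updated

# Hypothetical candidate document and synonyms, for illustration only.
candidate = {
    "title": "Time Off Guidelines",
    "body": "How to request time off.",
    "keywords": [],
}
tagged = overload_keywords(candidate, ["vacation policy", "pto", "leave"], repeats=2)
print(tagged["keywords"])
```

After tagging, the content still has to be re-indexed before the change affects ranking.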

Promoting to the Top

Once you have updated the metadata field with the term and its synonyms and re-indexed your content, there’s a reasonable chance that the candidate document will now appear at the top, or at least in the first few results. If you can live with the new placement, then you’re ready to move on to the remaining top queries.

If that didn't work, you may need to go back and overload the document with even more instances and synonyms. But, as you can imagine, that soon becomes frustrating. There is a better solution.

Since you added the term to a field — for example, the Keyword field — you may be able to use a trick that search experts call "query cooking" — that is, expanding the user query prior to submitting it to the search engine. This requires some reworking and lightweight programming of what happens when a user clicks the search button. It’s not rocket science, but you may need to enlist the help of your IT department.

What you want to do is expand the basic query so that it also matches documents where the Keyword field has the term. Notice you don’t want to require the term in the Keyword field, or the documents you tagged would be the only results. Research your platform to see how to boost documents with the search term in the Keyword field so that they rank higher than documents that simply contain the term.
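Here is a minimal sketch of query cooking, assuming your platform accepts Lucene-style query syntax, where `field:"phrase"^5` means "match this phrase in this field and weight it five times as heavily." The field name `keywords`, the boost value, and the function name `cook_query` are all assumptions. Check your own platform's query language before borrowing any of it.

```python
def cook_query(user_query, field="keywords", boost=5):
    """Expand a user query so matches in a metadata field are boosted, not required."""
    # Drop double quotes and collapse whitespace so the phrase clause stays well-formed.
    # A production version would apply the platform's full escaping rules.
    cleaned = " ".join(user_query.replace('"', " ").split())
    # OR keeps ordinary matches in the results; the boost lifts tagged documents.
    return f'({cleaned}) OR {field}:"{cleaned}"^{boost}'

print(cook_query("vacation policy"))
```

The expanded string is what gets sent to the search engine in place of the user's raw query.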

Possible Workaround

A quick tip that may work for you, depending on your search platform: You may be able to use a unique metadata term to signal a top result and bypass the use of page-specific and meaningful synonyms.

Pick a unique term that you will use across all of your content to indicate a best bet. For now, let’s assume you’ve elected to use the term "tophit." (In reality, you can pick any term not in use on your site — it helps if it’s not an actual word.)

As you go through the process of identifying your best answers, tag those documents with the meaningless term.

At search time, expand the user query so that "tophit" is included with every query. Note you do not want to "AND" the two terms, or you will get no other results. 
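A sketch of that expansion, again assuming Lucene-style syntax and a `keywords` metadata field (both assumptions, as is the function name):

```python
BEST_BET_TERM = "tophit"  # the made-up marker term used to tag best answers

def add_best_bet_clause(user_query, field="keywords", term=BEST_BET_TERM):
    """Append the marker term to a query with OR so other results still appear."""
    cleaned = " ".join(user_query.split())
    # OR, not AND: untagged documents that match the user's words still come back.
    return f"({cleaned}) OR {field}:{term}"

print(add_best_bet_clause("vacation policy"))
```

One thing to watch with this approach: every tagged document matches every query, so it is worth testing that best bets for unrelated queries don't float to the top.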

Ideally, this will work because the current top and your likely candidate are no doubt similar, but the special term appears only in the candidate page, effectively forcing a best bet without the effort.

Lather, Rinse, Repeat

Once your best answer shows at the top for the most frequent query, move on to the next most frequent and repeat the process. Continue until your top 10 queries are all producing good results.

Remember, improving search results is a Sisyphean task: your content changes, and your users come up with ever more creative ways of asking questions. You have to be in it for the long run, but with these simple methods, you and your users can win.

Title image CC BY-SA 2.0 by feverblue