[Image: close-up of raised hands, people waiting to ask a question in a lecture. Photo: Artem Maltsev]

Enterprise search users assess the initial performance of a search application by working their way through Search Engine Results Pages (SERPs). Once they have a sense of what direction the application takes, they have two options. The first is to apply filters and facets, two different approaches to reducing result sets that are often confused. One common filter, for example, is "file type", but this is like browsing a bookstore and looking only at folio editions and not paperbacks.

Repeat after me: it's all about content, not format.

Refining Search Results with Query Expansion

The second option is to expand or revise the query, which is a great deal more difficult. It's a common perception that enterprise search queries average around 1.6 words. In my view, the reason is that query expansion is so difficult that users take the filter/facet route and hope against hope that they end up with relevant documents. Search guru Daniel Tunkelang suggests that query expansion terms are typically abbreviations or synonyms. On this I must disagree, as it is far from the case in my experience.
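As a rough illustration of what abbreviation and synonym expansion looks like in practice, here is a minimal Python sketch. The lookup table and the `expand_query` function are hypothetical, invented for this example, and do not describe how any particular search application works.

```python
# Hypothetical lookup table of abbreviations and synonyms (an assumption,
# not taken from any real search application).
EXPANSIONS = {
    "hdpe": ["high-density polyethylene"],
    "aluminium": ["aluminum"],
}


def expand_query(query, expansions=EXPANSIONS):
    """Rewrite each query word as an OR-group of the word and its known alternatives."""
    groups = []
    for word in query.lower().split():
        alternatives = [word] + expansions.get(word, [])
        # Quote multi-word alternatives so they are read as phrases.
        quoted = [f'"{a}"' if " " in a else a for a in alternatives]
        if len(quoted) > 1:
            groups.append("(" + " OR ".join(quoted) + ")")
        else:
            groups.append(quoted[0])
    return " AND ".join(groups)


print(expand_query("HDPE pipe"))
# (hdpe OR "high-density polyethylene") AND pipe
```

A table like this only helps when someone has already listed the alternatives. It does nothing for a user who does not yet know the vocabulary of the topic.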

Helene Hembrooke and her colleagues published a very important piece of research in 2005 that categorized nine different ways in which queries could be expanded. One of these colleagues was Elizabeth Liddy, one of the most respected research leaders in search interaction, so you can be certain the analysis is well founded.

The nine approaches are: 

  1. Elaboration
  2. Redundancy
  3. Broadening
  4. Refining
  5. Backtracking
  6. Plural making/taking
  7. Kitchen sink
  8. Poke-n-Hope
  9. Topic terms

Each of these works well in different situations, but deciding which to use requires training in all nine. Search is not intuitive, and this is most apparent in query expansion. A search user needs a reasonable command of the specialized vocabulary of the topic they are searching, and yet they are searching for this topic precisely because they have limited knowledge and need to know more. This is a difficult position to navigate.

The Language Factor

The challenge becomes that much more difficult in multinational companies. In these instances, users might realize they need to expand a query but do not have the language skills to do so with confidence. This is especially a problem with what are called cognates, words which look and/or are pronounced alike in two languages. For example, French and English have hundreds of cognates, including true (similar meanings), false (different meanings), and semi-false (some similar and some different meanings). Think of the challenges this poses for a search user deciding which related words they can use to expand a query. Another language factor is whether the author of a document actually used the "correct" term.

Understanding the Magic of Search

Next we need to consider what the search application is doing with phrase searching. When you search Google for [high-density polyethylene] you will find that [high density polyethylene], without the hyphen, gives you 30% more results. You may wish to search for information about aluminium bronze alloys, but when I searched Google while writing this column I had different result sets for [aluminium bronze], [aluminium bronze alloys] and [aluminium AND bronze AND alloys].

Every search application has its own particular way of dealing with search queries with more than two words. Without some sense of what magic dust has been applied by the application, query expansion grows difficult. You cannot count on the engine having the same semantic view of the expanded query as you do.
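To make the point concrete, the following Python sketch runs the same words through two hypothetical engine behaviors: one that keeps hyphenated words whole and one that splits them, each with Boolean AND and phrase matching. The three-document collection and all function names are invented for illustration. Real engines apply far more processing than this.

```python
import re

# Hypothetical three-document collection, invented for this illustration.
DOCS = {
    1: "High-density polyethylene pipe specification",
    2: "Properties of high density polyethylene film",
    3: "Polyethylene grades: low density and high density compared",
}


def tokenize(text, split_hyphens):
    """Lowercase and split into tokens; optionally treat hyphens as spaces."""
    text = text.lower()
    if split_hyphens:
        text = text.replace("-", " ")
    return re.findall(r"[a-z0-9-]+", text)


def match_all_terms(query, split_hyphens):
    """Boolean AND: every query token appears somewhere in the document."""
    terms = set(tokenize(query, split_hyphens))
    return sorted(
        doc_id for doc_id, text in DOCS.items()
        if terms <= set(tokenize(text, split_hyphens))
    )


def match_phrase(query, split_hyphens):
    """Phrase search: the query tokens appear adjacent and in order."""
    q = tokenize(query, split_hyphens)
    hits = []
    for doc_id, text in DOCS.items():
        d = tokenize(text, split_hyphens)
        if any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1)):
            hits.append(doc_id)
    return hits


print(match_all_terms("high-density polyethylene", split_hyphens=False))  # [1]
print(match_all_terms("high density polyethylene", split_hyphens=False))  # [2, 3]
print(match_all_terms("high density polyethylene", split_hyphens=True))   # [1, 2, 3]
print(match_phrase("high density polyethylene", split_hyphens=True))      # [1, 2]
```

The same three words return four different result sets depending on choices the user never sees. That is why expanding a query without knowing the engine's rules is guesswork.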

2dSearch: A New Approach to Structured Searches

The 2dSearch application from Tony Russell-Rose and his colleagues is a major advance in managing structured searches. Instead of entering Boolean strings into one-dimensional search boxes, users formulate queries by combining objects on a two-dimensional canvas. This eliminates syntax errors, makes the query semantics more transparent, and offers a more effective way to optimize, save and share search strategies. The screenshot below illustrates the approach more effectively than I can with words.

[Screenshot: 2Dsearch search options]

An important feature of the application is that you can manage search strategies as independent, executable objects. Although in theory you can manage search variations using the back button, this assumes the session is the same. In the enterprise, a complex, structured search could be undertaken over many sessions, which is yet another reason why click logs may not be telling the entire story. At present 2dSearch can be used to search a number of publicly available collections (e.g. PubMed and Google Scholar), but I would not be surprised to find it deployed in enterprise applications before too long.
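The idea of a search strategy as an object rather than a typed string can be sketched in a few lines of Python. This is not the application's implementation or data model. The `Term` and `Group` classes and the `with_alternative` helper are hypothetical names used only to show why building a query from objects avoids unbalanced brackets and makes a strategy easy to store and reuse.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Term:
    """A single search term or phrase (hypothetical class for illustration)."""
    text: str

    def render(self) -> str:
        # Quote multi-word terms so they are read as phrases.
        return f'"{self.text}"' if " " in self.text else self.text


@dataclass(frozen=True)
class Group:
    """Terms or nested groups joined by one operator, "AND" or "OR"."""
    operator: str
    children: Tuple[Union["Term", "Group"], ...]

    def render(self) -> str:
        inner = f" {self.operator} ".join(c.render() for c in self.children)
        return f"({inner})"  # brackets are always balanced by construction


def with_alternative(group: Group, new_term: Term) -> Group:
    """Return a new group with one more child, leaving the original unchanged."""
    return Group(group.operator, group.children + (new_term,))


materials = Group("OR", (Term("aluminium bronze"), Term("aluminum bronze")))
strategy = Group("AND", (materials, Term("alloys")))
broader = with_alternative(materials, Term("nickel aluminium bronze"))

print(strategy.render())
# (("aluminium bronze" OR "aluminum bronze") AND alloys)
```

Because each strategy is an immutable value, a variation such as `broader` sits alongside the original instead of overwriting it. That is the kind of record a back button cannot provide across sessions.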

Improve Search Literacy

When recall is important in a search session, query expansion is a crucial element of search success and satisfaction. Deciding which of the nine approaches is the best starting point will not be intuitive. At a minimum, set out the options with relevant illustrations in the help pages of your search application.