
Get a few search managers into a room and the topic of relevance management will dominate the discussion. Now imagine what it must have been like at the Haystack conference in April, when over a hundred people convened for two days of presentations and conversation focused entirely on relevance. To the best of my knowledge, this is the first conference devoted exclusively to finding ways of improving search on site, rather than to the academic exchange of research.

OpenSource Connections (members of The Search Network) organized the conference and deserve enormous credit for their vision and commitment to the search community. You can get a flavor of the event through Charlie Hull's, Sujit Pal's and John Berryman's blog posts, as well as by viewing the presentations.

A New Profession: Relevance Engineering

In his overview of the conference, Hull suggested that relevance engineer could be a new profession: someone who understands both the business and technical aspects of search relevance, can work with a variety of underlying search engines and can expertly use the correct tools for the job to drive a continuing process of search quality improvement.

Although many of the conference presentations were focused on open source search applications, the techniques shared are invariably applicable to any search application. After all, Lucene/Solr uses the same BM25 model as any other current search application. The benefit of open source is that it is much easier to get deep into the subterranean levels of the code, something commercial vendors would not want to facilitate.
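For readers who have not looked inside BM25, the following is a minimal sketch of the scoring idea in Python. It assumes a textbook form of the formula with a smoothed, non-negative IDF and the commonly quoted parameter values k1 = 1.2 and b = 0.75. The function name and the toy documents are illustrative only, and this is not the code any particular search engine runs.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenised document in `docs` against the query terms.

    Illustrative sketch only: `bm25_scores` is a hypothetical helper,
    not an API of Lucene, Solr or any other search engine.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF: rarer terms contribute more, never negative.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Length normalisation: long documents need more matches.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Toy corpus: each document is already split into lowercase tokens.
docs = [
    ["search", "relevance", "tuning"],
    ["holiday", "photos"],
    ["search", "search", "engine", "logs"],
]
scores = bm25_scores(["search", "relevance"], docs)
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranking)  # → [0, 2, 1]
```

The two parameters are where tuning starts: k1 controls how quickly repeated occurrences of a term stop adding to the score, and b controls how strongly long documents are penalised.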

The fundamental challenge is summed up by the title of a book by information scientist Tefko Saracevic, "The Notion of Relevance in Information Science: Everybody knows what relevance is. But, what is it really?" If you are an information scientist, this book is a very good introduction to the topic, but a far better place to start your career as a relevance manager is "Relevant Search," by Doug Turnbull and John Berryman of OpenSource Connections.

Relevance Is More Than Mathematics

Search Explained CEO Agnes Molnar regularly highlights the range of factors that can positively or negatively impact relevance engineering in SharePoint. In my view, two of the most important are content quality and metadata quality. To use an engineering metaphor, if you don’t have good-quality materials and engineering drawings, the chances that the bridge you are building will meet the customer's requirements are remote.

A common response I hear when I raise the importance of information quality is that there is no way all corporate information can be brought up to a consistently high level of quality. I would agree this isn't feasible. Instead I recommend focusing on the content that holds the greatest importance for the business, where some remedial work and the establishment of quality standards would have significant benefits. One place to start is PowerPoint presentations, which often contain important insights concealed behind a "clever" title and a mass of holiday photos to brighten up the presentation. In my experience, just asking the author to make sure the date of the presentation is correct and that it includes some form of executive summary can make a substantial difference to findability and relevance scoring.

If You Invest Nothing, You Get Nothing Back

Realistically, you will need more than one relevance engineer. Although all engineers regard themselves as multi-disciplinary, think of the discipline-specific expertise that a structural, an electrical and an aeronautical engineer each bring.

Improving relevance starts with an intelligent review of query logs. This review attempts to answer why a group of people used a particular query term, which requires not only subject (or perhaps departmental) expertise but also access to networks within the company to gain an understanding of the choice of query term. Some years ago I was curious why "Tracy" was the 30th most frequently queried term in a global business's search engine. Tracy turned out not to be someone everyone wanted to know: my client explained that it was an acronym for an application widely used by its Japan operations that no one outside of the country could pronounce or spell.
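The counting half of a query log review is easy to automate, which leaves the team free for the interpretation. A minimal sketch in Python, assuming a hypothetical log export with one raw query per line (real search engines each have their own log formats, so the parsing step will differ):

```python
from collections import Counter

def top_queries(log_lines, n=30):
    """Return the n most frequent queries after light normalisation.

    Assumes one raw query per line. `top_queries` is an illustrative
    helper, not part of any search product.
    """
    counts = Counter(
        line.strip().lower()   # treat "Tracy" and "tracy " as the same query
        for line in log_lines
        if line.strip()        # skip empty lines
    )
    return counts.most_common(n)

# Made-up sample log for illustration.
sample_log = ["Tracy", "tracy ", "expenses", "", "TRACY"]
for query, count in top_queries(sample_log, n=2):
    print(query, count)
```

A list like this only says what people typed. Working out why a term such as "Tracy" ranks so highly still takes someone with the right contacts inside the business.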

Language skills come into this as well. Engineers work to different codes of practice in different countries (for example, imperial in the US versus the metric system almost everywhere else). Even though the volume of content in a secondary language may not be high, being presented with highly relevant content matters just as much to anyone who speaks this language as it does to English speakers. All too often I find search teams struggling to support multi-lingual search. The additional vendor fees may be low, but the investment in language skills within the search team needs to be high.

Before You Get Excited, Just One Small Problem

Where are you going to find the relevance engineers you undoubtedly need?