Assess Search Performance With Search Tasks

By Martin White
Evaluating search performance through the lens of productivity is ridiculous. Rather, define search tasks, then see if the task can be completed satisfactorily.

Most enterprise applications (CRM, ERP, HR, Finance) are fundamentally there to provide a means to amend and query a database. They support well-defined business tasks (e.g., "add a new employee") and because of this, it's possible to measure the time it takes to perform a task and establish a productivity metric. Moreover, when they fail (e.g., the new employee record is missing some fields) the failure is obvious to the person using the application.

Enterprise search is different. The focus is usually on tracking queries. Users have no idea if the application is working correctly or not. Trying to assess productivity in search is a ridiculous approach to search evaluation (just as it is in the performance of Office 365 and, for that matter, an intranet). The solution lies in defining search tasks and considering the extent to which the task has been completed satisfactorily.

Research Into Search Tasks Continues

Over the last few years, a substantial amount of academic research has been carried out on search task definitions, notably by Professors Kalervo Järvelin at the University of Tampere, E.G. Toms at the University of Sheffield and Pia Borlund at Oslo Metropolitan University.

Last month Borlund gave the Strix lecture in London, which turned out to be a master class in good practice for search task development. Much of the research involved using students as test participants, but Borlund questioned the validity of doing so, because students are not a homogeneous group whose results can be scaled and extended to other situations. Indeed, she questioned how much of the published research is valid, given the challenges of defining user groups and associated search tasks. She was especially critical of the type of test where students were asked (as an example) to imagine they were a professor starting out on a new area of research, when they would have had no experience of doing so.

Developing Representative Search Tasks

Borlund recommended making simulated search tasks as realistic as possible, taking into account the roles and expertise of the test participants. The tasks need to be sufficiently realistic that the users know what to search for and how to assess the relevance and value of the information they find without guidance. Crucially, the content must exist! Don't create a search task that asks participants to find weekly reports on product sales by country when only monthly reports are compiled.

Pilot testing each task is vitally important. Feedback from participants may well show that the task is not grounded in reality, or that a more common task ought to be added to the research program. Pilot test participants should be excluded from subsequent tests, as they may have developed biases about what they expect to find.

Service Blueprinting

Borlund did not cover the concept and practice of service blueprinting in her lecture, but in my opinion the development of search tasks should probably be an outcome of a service blueprinting exercise. The technique dates back to work G. Lynn Shostack conducted in 1984, but only quite recently has it become widely used. Nielsen Norman Group provides a good summary of the technique and the Interaction Design Foundation has assembled a collection of articles on it.

Assessing Search Performance

Evaluating the success of a set of search tasks provides a benchmark for search performance and a way to assess search enhancements. Introducing on-site or remote usability testing for search tasks could provide valuable information about the different ways people undertake the task, as there is rarely just one way to carry out a search task. Did the participants put in a short query and then filter the results, or did they develop a more complex Boolean-type query because they were searching for familiar content? I'll add that using search tasks will also be of value in the development of digital assistants.
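As a rough illustration of the benchmarking idea, here is a minimal Python sketch that compares task completion rates between two rounds of testing. Everything in it is an assumption made for illustration: the `TaskAttempt` record, the task labels, the participant ids and the results are invented, and a real study would also record how participants went about each task, not just whether they completed it.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskAttempt:
    task: str          # hypothetical search task label
    participant: str   # anonymised participant id
    completed: bool    # did the participant complete the task satisfactorily?


def completion_rates(attempts):
    """Return {task: fraction of attempts completed satisfactorily}."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for attempt in attempts:
        totals[attempt.task] += 1
        if attempt.completed:
            successes[attempt.task] += 1
    return {task: successes[task] / totals[task] for task in totals}


def compare(before, after):
    """Return the change in completion rate for tasks tested in both rounds."""
    rates_before = completion_rates(before)
    rates_after = completion_rates(after)
    return {
        task: rates_after[task] - rates_before[task]
        for task in rates_before
        if task in rates_after
    }


# Invented example data: a baseline round and a round after search tuning.
baseline = [
    TaskAttempt("find monthly sales report", "p1", True),
    TaskAttempt("find monthly sales report", "p2", False),
    TaskAttempt("find travel policy", "p1", False),
    TaskAttempt("find travel policy", "p2", False),
]
after_tuning = [
    TaskAttempt("find monthly sales report", "p3", True),
    TaskAttempt("find monthly sales report", "p4", True),
    TaskAttempt("find travel policy", "p3", True),
    TaskAttempt("find travel policy", "p4", False),
]

for task, delta in compare(baseline, after_tuning).items():
    print(f"{task}: {delta:+.2f}")
```

With only a handful of participants per task, numbers like these are a prompt for discussion with the search team, not a statistically robust metric.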

The Value of Academic Research

The sub-theme of this column is the considerable value that academic research in this field can provide. Research into information retrieval is so often about the underlying algorithms. And while it is rare for an academic to refer to "search" as such, the research domain of "interactive information retrieval" is very much about search (as readers of this column will know). Yes, it can be hard to track down this research, but that's no excuse for ignoring it. The work of Professor Borlund described above is a very good example of applied research with direct relevance to managers with intranet, ecommerce and enterprise search responsibilities.

About the author

Martin White

Martin White is Managing Director of Intranet Focus, Ltd. and is based in Horsham, UK. An information scientist by profession, he has been involved in information retrieval and search for nearly four decades as a consultant, author and columnist.

About CMSWire

For nearly two decades CMSWire, produced by Simpler Media Group, has been the world's leading community of customer experience professionals.


Today the CMSWire community consists of over 5 million influential customer experience, digital experience and customer service leaders, the majority of whom are based in North America and employed by medium to large organizations. Our sister community, Reworked, gathers the world's leading employee experience and digital workplace professionals.
