Martin White raised some important points in his recent CMSWire article on high quality enterprise search.
Users rightfully expect highly relevant results from enterprise search — an expectation shaped by their experiences with internet and e-commerce search such as Google, Bing, and Amazon.
Although internet and e-commerce search differ vastly from enterprise search, there are nonetheless lessons and processes from each that can be applied to achieve search accuracy.
Measuring Search Accuracy
Internet and e-commerce search have well-established definitions for accuracy. However, these are narrower use cases than enterprise search.
I agree with White that it is time to have a definition and processes for enterprise search accuracy that include “a definitive metric against which performance is achieved.” After all, you have to be able to measure it to improve it.
Without objective methods for measuring search accuracy, you cannot track the accuracy of a search engine over time, nor can you compare the accuracy of two search engine instances.
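To make the idea of an objective, repeatable measurement concrete, here is a minimal sketch of one common metric, mean precision at k, computed over a fixed set of judged queries. It is not the Search Engine Scoring process discussed in this article; the function names, query strings, and document IDs are all hypothetical.

```python
from statistics import mean


def precision_at_k(results, relevant, k=10):
    """Fraction of the top-k result slots filled by a judged-relevant document."""
    return sum(1 for doc_id in results[:k] if doc_id in relevant) / k


def score_engine(run, judgments, k=10):
    """Mean precision@k over every judged query.

    run:       {query: ranked list of doc ids returned by the engine}
    judgments: {query: set of doc ids judged relevant}
    A query the engine returned nothing for scores 0.
    """
    return mean(
        precision_at_k(run.get(query, []), relevant, k)
        for query, relevant in judgments.items()
    )


# Hypothetical relevance judgments and two engine instances to compare.
judgments = {
    "vacation policy": {"hr-12", "hr-40"},
    "expense form": {"fin-3"},
}
engine_a = {
    "vacation policy": ["hr-12", "it-7"],
    "expense form": ["fin-3", "fin-9"],
}
engine_b = {
    "vacation policy": ["hr-12", "hr-40"],
    "expense form": ["mkt-1", "fin-3"],
}

print("engine A:", score_engine(engine_a, judgments, k=2))
print("engine B:", score_engine(engine_b, judgments, k=2))
```

Because the judged query set stays fixed, the same score can be recomputed after every configuration change, or run against two engine instances side by side.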
Search Engine Scoring Offers Objective Measurement
It may be true that no enterprise search engine vendor provides a process for measuring accuracy. However, search engineering consulting firms have been providing one for some time now.
Paul Nelson, a search industry pioneer and chief architect at the company where I work, Search Technologies, recently presented a webinar that describes Search Engine Scoring, a process for “an objective, statistically valid measurement of search accuracy.”
Success, he concludes, requires this process to be automated so it is easy, fast and economical to run whenever needed. To be an accurate measure, however, it must also apply manual processes correctly.
Search Engine Scoring facilitates the continuous cycle of improvement required for dynamic enterprise search solutions.
Search Engine Scoring Focuses on the User
White stated in his article that “users' attitudes towards information quality are usually never taken into consideration in search evaluation.”
It’s true that many prior attempts to measure search accuracy have taken a content and query perspective rather than the users’ point of view.
When users complain about search quality, search engineering teams often investigate the problem by looking at accuracy metrics from a query perspective. They ask questions like: “What queries worked? What queries are most frequently executed? What queries returned zero results?” and so on.
Search Engine Scoring, by contrast, takes a user-centered approach, asking questions such as “Is the user satisfied?” and “Are the results worthy of further user action?”
This is the innovative direction search consulting firms are currently taking.
The question is how to assess user attitudes towards search results. We can certainly ask users, but this is not efficient, and what they say may not be as instructive as what they do. Analyzing user behavior from click logs to acquire this information is becoming more prevalent.
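As a rough illustration of what click-log analysis can look like, the sketch below derives two behavioral signals per query: the abandonment rate (searches with no click) and the mean reciprocal rank of the clicked result. The log format, a list of (query, clicked rank or None) pairs, is an assumption made for this example, and the sample queries are invented.

```python
from collections import defaultdict


def summarize_clicks(log):
    """Summarize user behavior per query.

    log: iterable of (query, clicked_rank) pairs, one per search, where
         clicked_rank is the 1-based position of the clicked result or
         None if the user clicked nothing. (Hypothetical log format.)
    """
    stats = defaultdict(lambda: {"searches": 0, "clicks": 0, "rr_sum": 0.0})
    for query, rank in log:
        entry = stats[query]
        entry["searches"] += 1
        if rank is not None:
            entry["clicks"] += 1
            entry["rr_sum"] += 1.0 / rank  # higher clicks count for more

    return {
        query: {
            "abandonment": 1 - entry["clicks"] / entry["searches"],
            "mean_reciprocal_rank": entry["rr_sum"] / entry["searches"],
        }
        for query, entry in stats.items()
    }


# Invented sample log: three searches for one query, one for another.
sample_log = [
    ("vpn setup", 1),
    ("vpn setup", None),
    ("vpn setup", 2),
    ("travel policy", None),
]

for query, summary in summarize_clicks(sample_log).items():
    print(query, summary)
```

A query with high abandonment and a low reciprocal rank is a candidate for closer review, without anyone having to survey the users.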
Establish Rules at Content Ingestion
In his article, White points out how dependent the quality of search is on the quality of the content being indexed.
The lament "Why can’t enterprise search be as good as internet search?" comes from a lack of understanding of how content is created for each of these environments.
Enterprise content is written without regard for search relevance, while internet content is heavily hand-tuned for search through search engine optimization (SEO).
White’s statement that “No matter what rules you put in place for the production of quality content, they will always be very difficult (if not impossible) to apply to millions of legacy documents” may be true. But I contend that for enterprise search the rules need to be applied not at content creation but at content ingestion.
Addressing Practicalities of Enterprise Search
Some search engine vendors allow content processing logic to be included in the ingestion process, so that rules are applied before indexing. But often this functionality is too weak or inflexible.
Standalone content processing frameworks and custom code can be applied before indexing to dramatically improve content quality by normalizing and cleansing content and by enriching its metadata.
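A minimal sketch of such an ingestion-time step is shown below, using only the Python standard library. It strips markup, normalizes characters and whitespace, and fills in simple metadata before a document is handed to the indexer. The document fields (`title`, `body`, `word_count`) and the title fallback rule are assumptions for illustration, not the behavior of any particular framework.

```python
import html
import re
import unicodedata

TAG_RE = re.compile(r"<[^>]+>")   # crude tag stripper, adequate for a sketch
WHITESPACE_RE = re.compile(r"\s+")


def clean_text(raw):
    """Strip tags, decode entities, normalize Unicode, collapse whitespace."""
    text = html.unescape(TAG_RE.sub(" ", raw))
    text = unicodedata.normalize("NFKC", text)
    return WHITESPACE_RE.sub(" ", text).strip()


def process_document(doc, title_length=60):
    """Apply cleansing and metadata rules to one document before indexing.

    doc is a dict with optional 'title' and 'body' keys (hypothetical schema).
    """
    body = clean_text(doc.get("body", ""))
    title = clean_text(doc.get("title", ""))
    if not title:
        # Assumed rule: fall back to the start of the body when no title exists.
        title = body[:title_length]
    return {**doc, "title": title, "body": body, "word_count": len(body.split())}


legacy_doc = {"id": "d1", "title": "", "body": "<p>Travel&nbsp;policy   <b>update</b></p>"}
print(process_document(legacy_doc))
```

Because the rules run at ingestion, they apply equally to new content and to legacy documents each time those documents are reindexed.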
I understand the attraction of promoted content. However, I believe organizations are unlikely to invest in the human capital to support it.
Fortunately, the lessons and processes developed for internet and e-commerce search, which analyze user behavior to promote content automatically, can be applied here. This can even be done at the individual level, taking personalization signals into account for relevancy.
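One simple way to picture behavior-driven promotion is a re-ranking step that boosts each result's score by a damped function of how often users have clicked it. The sketch below is an illustrative assumption, not a description of any vendor's method; the boost formula, the `weight` parameter, and the sample data are all hypothetical.

```python
import math


def rerank(results, click_counts, weight=0.5):
    """Re-order results by engine score boosted with past click behavior.

    results:      list of (doc_id, base_score) pairs from the search engine
    click_counts: {doc_id: number of past clicks}; passing counts for a
                  single user or group gives a simple form of personalization
    weight:       how strongly clicks influence the final order (assumed knob)
    """
    def boosted(item):
        doc_id, base_score = item
        # log1p damps the effect so very popular documents cannot dominate.
        return base_score * (1 + weight * math.log1p(click_counts.get(doc_id, 0)))

    return [doc_id for doc_id, _ in sorted(results, key=boosted, reverse=True)]


engine_results = [("doc-a", 1.0), ("doc-b", 0.9), ("doc-c", 0.5)]
clicks = {"doc-b": 20}
print(rerank(engine_results, clicks))
```

With no click data the original order is preserved, so the boost only takes effect where users have actually shown a preference.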
In conclusion, while I certainly agree that achieving search quality is very challenging, I contend that practical solutions to address the challenges are available and improving daily.