I spent much of the recent holiday weekend doing the traditional sorts of things: parties with friends, cookouts, housecleaning… Things tend to pile up, especially since I work out of my house and have stacks of books and manuals going back to the days of Bush number 1.
One of the gems in the trove was a two-foot-high stack of Verity K2 manuals, dating back nearly 25 years, including the once-useful Verity Query Language manual. VQL was then, and still remains, among the richest stores of query commands. Who can forget the PARAGRAPH, SENTENCE and NEAR/ commands, not to mention the still unique and powerful ACCRUE? In the hands of expert search specialists, Verity was a strong tool — likely still in use today in the dark recesses of intelligence communities everywhere.
That, and a recent opportunity to experiment with the Elasticsearch query syntax, got me thinking: With all of these powerful query tools, and with search being mission-critical to so many organizations, why is “search sucks” a topic with more than 100 million hits on Google? We have the tools and the technology — but the phrase remains a common complaint.
One clue to the answer comes from the annual Findwise Search and Findability survey. When asked, respondents report that search is, in fact, “mission critical” – but when asked how many full-time staff are dedicated to search, the answer hovers around “one” year after year. Chances are you have more people working on your SEO for Google than on your own search implementation!
“But wait,” you say. “Google does search so well: Why can’t my intranet search be as good as Google?” Companies often purchase the Google Search Appliance (GSA) to power their web and intranet search. It’s a big yellow box, has an easy administrator console, and the results have the cute, colorful “Powered by Google” logo.
The Google Search Appliance is a reasonably useful option for adding a search capability to your internal or external site – but the GSA is no Google.
How does Google on the web do so well? It may be hard to imagine, but Google is not a search platform: It’s a big-data application. Handling upwards of 50,000 queries per second, it gathers and executes queries, shows results and studies user behavior after the search. What document did most users click after a query? Were there subsequent queries that could be correlated with the first query? All of that data feeds back into the next query, so over time, results get better. Google is a learning big-data application, evolving with each query each second.
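The two questions above (which document did users click, and which query tended to follow another) can be sketched in a few lines of Python. This is only an illustration of the feedback idea, not a description of how Google actually does it. The log format, the sample data and the function name summarize are all assumptions made up for this example.

```python
from collections import Counter, defaultdict

# Hypothetical log rows: (session_id, query, clicked_doc or None), in time order.
LOG = [
    ("s1", "vacation policy", None),
    ("s1", "pto policy", "hr/pto.html"),
    ("s2", "pto policy", "hr/pto.html"),
    ("s3", "pto policy", "hr/holidays.html"),
    ("s4", "vacation policy", None),
    ("s4", "pto policy", "hr/pto.html"),
]


def summarize(log):
    """Count clicks per query and the queries that follow each query in a session."""
    clicks = defaultdict(Counter)     # query -> Counter of clicked documents
    followups = defaultdict(Counter)  # query -> Counter of the next query typed
    last_query = {}                   # session_id -> previous query in that session
    for session, query, doc in log:
        if doc is not None:
            clicks[query][doc] += 1
        prev = last_query.get(session)
        if prev is not None and prev != query:
            followups[prev][query] += 1
        last_query[session] = query
    return clicks, followups


clicks, followups = summarize(LOG)
top_doc = clicks["pto policy"].most_common(1)[0]
top_followup = followups["vacation policy"].most_common(1)[0]
print(top_doc)       # ('hr/pto.html', 3)
print(top_followup)  # ('pto policy', 2)
```

In this made-up log, people who search for "vacation policy" click nothing and then retry with "pto policy", which hints that the first query deserves a synonym or a curated result.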
Search Doesn’t Happen Out of the Box
So, what’s the answer for your search?
The lesson here is that search doesn’t just happen out of the box. It takes work and constant attention. You don’t have anywhere near the query volume Google has, but you do have the capability to improve your search and results day-to-day. How?
First, push for the staffing you need, staffing appropriate to how critical search really is for your company. Pull together use cases. Corporate risk management can be an ally: Your search logs could be used as evidence in legal proceedings involving sexual harassment, product liability or other claims. Content owners can also be your partners. They certainly want their content to show up, so help them understand what it takes.
You’ll also find tools that extract and add metadata automatically. SmartLogic, BA Insight, Expert System and others offer automated products, and open-source entity extraction tools can pull meaningful metadata out of your content as well. Use them.
Then staff your search team to monitor search. This takes more than weekly search stats, and it will take tools to tune your relevance. Some platforms provide automated “popularity” boosts, but you need to do more than just boost documents that get clicks: A document with a promising summary can attract clicks without being the right answer.
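As one rough illustration of monitoring that goes beyond click counts, the Python sketch below flags queries that never return results and queries whose clicks are mostly followed by a quick exit. Everything in it is an assumption made for the example: the SearchEvent fields, the 30-second dwell threshold, the 50 percent cutoff and the sample data. A real search platform will log different fields and need different thresholds.

```python
from dataclasses import dataclass


@dataclass
class SearchEvent:
    query: str
    result_count: int
    clicked: bool
    dwell_seconds: float  # time spent on the clicked document; 0 if no click


MIN_DWELL = 30.0  # assumed threshold for a "satisfied" click


def report(events):
    """Aggregate per-query counts of searches, zero-result searches and clicks."""
    stats = {}
    for e in events:
        s = stats.setdefault(
            e.query, {"searches": 0, "zero_results": 0, "clicks": 0, "satisfied": 0}
        )
        s["searches"] += 1
        if e.result_count == 0:
            s["zero_results"] += 1
        if e.clicked:
            s["clicks"] += 1
            if e.dwell_seconds >= MIN_DWELL:
                s["satisfied"] += 1
    return stats


def needs_attention(stats, min_searches=3):
    """Return (query, reason) pairs for queries worth a human look, sorted by query."""
    flagged = []
    for query, s in stats.items():
        if s["searches"] < min_searches:
            continue  # too little data to judge
        if s["zero_results"] == s["searches"]:
            flagged.append((query, "no results"))
        elif s["clicks"] > 0 and s["satisfied"] / s["clicks"] < 0.5:
            flagged.append((query, "clicks but quick bounces"))
    return sorted(flagged)


EVENTS = [
    SearchEvent("expense report", 12, True, 5),
    SearchEvent("expense report", 12, True, 8),
    SearchEvent("expense report", 12, True, 40),
    SearchEvent("401k match", 0, False, 0),
    SearchEvent("401k match", 0, False, 0),
    SearchEvent("401k match", 0, False, 0),
    SearchEvent("org chart", 4, True, 60),
    SearchEvent("org chart", 4, True, 90),
    SearchEvent("org chart", 4, True, 45),
]

for query, reason in needs_attention(report(EVENTS)):
    print(f"{query}: {reason}")
```

A pure popularity boost would reward the "expense report" result for its three clicks. The dwell check suggests that two of those three visitors left almost immediately, which is the promising-summary problem in miniature.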
Finally, push your learning back into your platform. This might be as simple as adding metadata to your content. It might take some tools for best bets or better query tuning. But none of this can be done without monitoring your search. Allocate a reasonable budget and staff. And seek professional help.
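Best bets are the simplest way to push a lesson back into the platform: a person decides which document should lead for a given query. Most platforms have their own mechanism for this, so treat the Python below as a sketch of the idea only. The BEST_BETS table, the document paths and the function names are hypothetical.

```python
# Hypothetical curated table: normalized query -> document to pin at the top.
BEST_BETS = {"pto policy": "hr/pto.html"}


def normalize(query):
    """Lowercase and collapse whitespace so trivial variants match the table."""
    return " ".join(query.lower().split())


def apply_best_bets(query, results, best_bets=BEST_BETS):
    """Return results with the curated document first, without duplicating it."""
    pinned = best_bets.get(normalize(query))
    if pinned is None:
        return list(results)
    return [pinned] + [r for r in results if r != pinned]


ranked = apply_best_bets(
    "  PTO Policy ", ["hr/holidays.html", "hr/pto.html", "hr/faq.html"]
)
print(ranked)  # ['hr/pto.html', 'hr/holidays.html', 'hr/faq.html']
```

The table only stays useful if someone keeps reviewing it against the search logs, which is the staffing point again.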