Google's announcement earlier this year that it would be retiring its popular Google Search Appliance has powered a renaissance in the search industry.
The atmosphere at the recent Enterprise Search & Discovery Conference in the District of Columbia felt like the old days of search, with several new and established search product vendors showing their wares.
The common thread? “Use our product to replace your GSA servers.” There were even consulting firms of good repute combining tools developed over the years for clients, packaged together as an alternative to the GSA.
It was 2001 all over again.
So for those facing the looming GSA deadline (between 400 and 700 days away, depending on your license), let’s look at what it will take to replace your current search, and at the next steps to start the process.
How GSA Turned Search On Its Head
It used to be that enterprise search was licensed as software. You’d pick a couple of vendor solutions, evaluate them on your hardware, choose one to license and roll it out.
Partially due to poor evaluations and implementations, and partially due to mergers and acquisitions in the industry, most companies would repeat the process every few years.
Google had a different idea: It created a feature-rich search platform and licensed it with the hardware. Unlike many software solutions of the day, the GSA had a full GUI-based administrative console to set up, control, customize and manage search, while its competitors still relied on error-prone configuration files. The GSA even included best bets and reasonably good reporting, both rarities among its competitors.
What Separated the Search Appliance From the Competition?
In the GSA, Google bundled several components that made search easy to install, convenient to manage and, because of its flexibility, capable of ingesting all sorts of content.
A few of the elements that contributed to the GSA’s success included a system management console, a single crawler interface for web and other content types, document-level security, easy-to-use relevance tools and activity reporting.
With all of these capabilities, the GSA set the pace for what it meant to have "enterprise search."
Not every search environment was a great fit — some had much more complex requirements — but the GSA solved most of the problems for most companies. Virtually every corporation we worked with after 2010 had a GSA at least for the public-facing website.
Several GSA competitors had incorporated some — even many — of the same capabilities. But the GSA had one important feature which the others lacked: the “Powered by Google” logo.
Betting on a Brand
A few years ago, a leading GSA competitor hired a market research company to compare its product with the GSA. Test subjects were asked to accomplish tasks on two different systems to identify which one had better results. The content indexed was the same on both systems, but one system showed a “Powered by Google” logo, while the other had the competitor’s.
At the end, a large majority of the test subjects reported that the “Powered by Google” system had far better results than the other system.
The same software powered both systems. But the Google logo was sufficient to convince 80 percent of the test subjects that the results were better.
What Do You Need From a GSA Replacement?
Before you can replace your GSA servers, you need to decide what your organization needs from a replacement.
Here are some things you need to consider:
The GSA is a hardware and software solution. If it’s important to you that your replacement search product comes installed on its own hardware, that narrows your selection.
On the other hand, when your GSA license expires, you’ll still have a powerful yellow Dell server, which may well be able to run the search software that meets your other requirements.
A Powerful Crawler
Before search algorithms become useful, you need to be able to ingest all of your content. While the term crawler originally applied to web content, most modern crawlers can ingest content from just about any repository, depending on the connectors (see below).
Filters and Connectors
For indexing and for retrieval, search needs document filters to convert the document format into a stream of text. This is true whether your content is in HTML, Microsoft Word or PDF. Connectors provide the ability to read documents from file systems, relational databases or content management systems like SharePoint.
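To make the idea of a document filter concrete, here is a minimal sketch for the HTML case, using only Python's standard library. The names TextFilter and html_to_text are hypothetical, and a production filter would also have to handle Word, PDF and many other formats, which this does not attempt.

```python
from html.parser import HTMLParser


class TextFilter(HTMLParser):
    """Hypothetical document filter: collects the visible text of an HTML page."""

    SKIPPED_TAGS = {"script", "style"}  # content that should never be indexed

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that sits outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def html_to_text(html):
    """Convert an HTML document into a plain stream of text for indexing."""
    parser = TextFilter()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Whatever the source format, the output is the same kind of thing: plain text the indexer can tokenize.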
If you only need to provide search for your public website, it’s relatively easy. But if, like most enterprises, you have several levels of document-based security, the requirements are tougher.
Whether you rely on LDAP, Active Directory or some other system to enforce document-level security, your search platform has to be able to access the documents in order to index them. And of course the search platform has to ‘trim’ results at search time, so searchers see only the content to which they have rights.
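As a rough illustration of trimming results at search time, the sketch below stores a set of allowed groups with each indexed document and drops any hit the searcher's groups do not cover. Everything here (Doc, INDEX, search, the group names) is hypothetical; in a real deployment the group memberships would come from your directory service, and the matching would be far more sophisticated than whole-word lookup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    text: str
    allowed_groups: frozenset  # groups permitted to see this document


# Hypothetical index; the crawler would record each document's access
# rights alongside its text at indexing time.
INDEX = [
    Doc("d1", "standard contract template", frozenset({"legal", "sales"})),
    Doc("d2", "executive contract terms", frozenset({"executives"})),
    Doc("d3", "cafeteria menu", frozenset({"everyone"})),
]


def search(index, query, user_groups):
    """Return ids of matching documents the user is allowed to see."""
    terms = query.lower().split()
    user_groups = set(user_groups)
    hits = []
    for doc in index:
        words = doc.text.lower().split()
        matches = all(term in words for term in terms)
        # Security trimming: keep the hit only if the user shares a group.
        if matches and doc.allowed_groups & user_groups:
            hits.append(doc.doc_id)
    return hits
```

With this sample index, a member of the legal group searching for "contract" gets d1 but never learns that d2 exists.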
Duplicate and Near-Duplicate Detection
Sometimes on the internet you'll find millions of pages that can answer the same question. Some are even virtually identical.
Intranet content queries tend to have a "right answer," but that may be hard to discern as any given document may have multiple, different versions. When you look for a contract template, you need to be sure your search returns the right one. Some platforms do this well, some not so well.
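One common way to spot near duplicates is to break each document into overlapping word sequences ("shingles") and measure how much two documents' shingle sets overlap. The sketch below assumes that approach; it is not a description of how the GSA or any particular product does it, and the function names, shingle size and threshold are illustrative choices only.

```python
def shingles(text, k=3):
    """Return the set of overlapping k-word sequences in the text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity: shared shingles divided by all distinct shingles."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_near_duplicate(text_a, text_b, threshold=0.6, k=3):
    """Treat two texts as near duplicates when their overlap meets the threshold."""
    return jaccard(shingles(text_a, k), shingles(text_b, k)) >= threshold
```

Two versions of a contract that differ by a single word share most of their shingles and score high, while unrelated documents score close to zero. Deciding which version is the "right one" still requires metadata such as dates or version numbers.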
System Management and Reporting Console
Search that relies on XML (or even plain-text) control files is awkward and invites mistakes. Your new search should let you perform any task you need to do more than once from a graphical interface, whether browser-based or application-driven.
You have between 400 and 700 days — how should you proceed?
First, get started now: the time will pass very quickly, and the GSA end of life is certain. Identify what you need, and find companies that can help you through the process.
- Decide if you can use a software solution, or if you need a hardware/software appliance
- Identify other criteria including data sources, document formats and security requirements
- Identify and interview likely vendors whose products meet your needs
- Determine what will define ‘success’ in the evaluation
- Narrow down the list to two or three vendors for a proof of concept (POC) on your live content
- Install and test the selected candidates on your live content. Get staff, power users and administrators involved in the evaluation
- Select a preferred candidate and negotiate pricing and other terms
- Run the replacement system concurrently with the existing system so users can access both. Gather feedback
- Acquire rights to the new system, install it and roll it into production
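One way to pin down what ‘success’ means in the evaluation is to have staff mark which results are relevant for a set of test queries, then score each candidate on the same queries. The sketch below assumes a simple precision-at-k measure; the function names and data layout are hypothetical, and your own criteria may well weigh security, administration and cost alongside relevance.

```python
def precision_at_k(results, relevant, k=5):
    """Fraction of the top k result slots filled by relevant documents."""
    top = results[:k]
    return sum(1 for doc_id in top if doc_id in relevant) / k


def mean_precision(runs, judgments, k=5):
    """Average precision-at-k over every judged test query.

    runs maps each query to a candidate's ranked result ids;
    judgments maps each query to the ids staff marked as relevant.
    """
    scores = [precision_at_k(runs.get(q, []), judgments[q], k) for q in judgments]
    return sum(scores) / len(scores) if scores else 0.0
```

Running the same judged queries against each candidate gives you one comparable number per system to set beside the feedback you gather from users.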
Remember, your new solution will be different from your existing GSA, both in how it accepts queries and in how it determines relevance. You and your users should factor in some learning and acclimation time for a new system of this magnitude.
And if it helps, maybe you can just keep that “Powered by Google” logo.