Enterprise search helps employees find the information they need, confident in the knowledge that all potential repositories have been indexed. In an ideal world, queries could be entered in a single search box to generate all relevant results from across all information resources of the organization. These results would be presented in a visually consistent ranked order through a federated search application.
But we don't live in an ideal world. If federated search were easy, why would Google offer Google Scholar? Federated search presents a number of challenges, and there are several ways to approach them. Let's look at a few.
Option A: One Big Index
In principle, it's possible to crawl and index a number of individual search applications, or business applications with a search component, and create a single index. That is not difficult. What is difficult is creating a ranked list of results that makes any sense to the user, which includes presenting the results in a consistent way.
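To make the "consistent presentation" problem concrete, here is a minimal sketch of mapping records from two repositories onto one shared schema before they go into the single index. The source names and every field name in it are assumptions made for illustration, not the actual fields of any product.

```python
# Hypothetical per-source field mappings: common field -> native field name.
FIELD_MAPS = {
    "sharepoint": {"title": "Title", "body": "Content", "modified": "LastModified"},
    "wiki": {"title": "page_name", "body": "text", "modified": "updated_at"},
}


def normalize(source, record):
    """Map a source-specific record onto the shared schema of the single index."""
    field_map = FIELD_MAPS[source]
    # Missing native fields become empty strings so every document has the same shape.
    doc = {common: record.get(native, "") for common, native in field_map.items()}
    doc["source"] = source  # keep provenance so results can be labelled or filtered
    return doc


wiki_doc = normalize("wiki", {"page_name": "Expenses FAQ", "text": "How to claim..."})
print(wiki_doc)
```

A uniform shape is only the first step. It lets results be displayed consistently, but it does nothing to make relevance scores comparable across content that was written and tagged in very different ways.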
Result lists will likely be quite long, given the size of the index, and delivering precise results is difficult. It's important to include disaster recovery planning in this option as there are many points of failure. Manage crawl schedules with care -- one schedule will not meet all requirements.
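The point about crawl schedules can be sketched as follows: each repository carries its own recrawl interval, and the scheduler picks out whichever sources are due. The source names and intervals are invented for the example, and a real crawler would also need to handle failures and load on the source systems.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Source:
    name: str
    crawl_interval: timedelta  # how often this repository should be recrawled
    last_crawled: datetime


def sources_due(sources, now):
    """Return the names of sources whose own crawl interval has elapsed."""
    return [s.name for s in sources if now - s.last_crawled >= s.crawl_interval]


now = datetime(2024, 1, 1, 12, 0)
sources = [
    # Fast-changing content is crawled often, archives rarely (illustrative values).
    Source("intranet_news", timedelta(minutes=15), now - timedelta(minutes=20)),
    Source("hr_policies", timedelta(days=1), now - timedelta(hours=3)),
    Source("project_archive", timedelta(days=7), now - timedelta(days=8)),
]
print(sources_due(sources, now))  # ['intranet_news', 'project_archive']
```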
Option B: Query Federation
One search application manages the query and sends it out to the other search applications. Results are then either integrated or, more usually, presented in a number of different sections of the search results page. A single ranked list of results doesn't make sense in this case, because each application calculates relevance with its own algorithm and the scores cannot be compared across applications.
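As a rough sketch of this pattern, the code below sends one query to several backends in parallel and keeps each backend's results in its own section instead of trying to merge them into one ranking. The three backend functions are stand-ins invented for the example; in practice each would be a call to a real search application.

```python
from concurrent.futures import ThreadPoolExecutor


# Stand-in backends (hypothetical); each returns results in its own order.
def search_hr(query):
    return [f"HR policy mentioning {query}"]


def search_wiki(query):
    return [f"Wiki page about {query}", f"Wiki FAQ on {query}"]


def search_archive(query):
    raise ConnectionError("backend unavailable")


BACKENDS = {"HR": search_hr, "Wiki": search_wiki, "Archive": search_archive}


def federated_search(query, backends):
    """Send the query to every backend and group the results by backend name."""
    sections, errors = {}, {}
    with ThreadPoolExecutor(max_workers=len(backends)) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in backends.items()}
        for name, future in futures.items():
            try:
                sections[name] = future.result()
            except Exception as exc:
                # One failing backend should not take down the whole results page.
                sections[name] = []
                errors[name] = str(exc)
    return sections, errors


sections, errors = federated_search("travel", BACKENDS)
for name, results in sections.items():
    print(name, results)
```

Keeping the sections separate sidesteps the ranking problem, and recording the errors per backend makes it easier to tell users which source was unavailable.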
This option requires the use of connectors -- software which both converts queries into a readable format for each application and returns results to the master search application.
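The two jobs of a connector described above can be sketched as a small interface: one method to translate the query, one to convert results into a common shape. The class names, the AND-joined query syntax and the "Title" and "Path" fields are all assumptions for illustration and do not describe any vendor's connector.

```python
from abc import ABC, abstractmethod


class Connector(ABC):
    """Illustrative contract between the master search application and one backend."""

    @abstractmethod
    def translate_query(self, query: str) -> str:
        """Convert the user's query into the backend's own syntax."""

    @abstractmethod
    def parse_results(self, raw: list) -> list:
        """Convert backend results into the shape the master application expects."""


class KeywordConnector(Connector):
    """Hypothetical backend that expects terms joined with AND."""

    def translate_query(self, query):
        return " AND ".join(query.split())

    def parse_results(self, raw):
        # A renamed field in the backend would raise KeyError here, which is
        # the kind of small configuration change that breaks a connector.
        return [{"title": r["Title"], "url": r["Path"]} for r in raw]


connector = KeywordConnector()
print(connector.translate_query("travel expenses policy"))
print(connector.parse_results([{"Title": "Travel policy", "Path": "/docs/travel"}]))
```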
Connectors are challenging to write and maintain. A small change in configuration in one of the queried applications may break the connector. Commercial search vendors, such as BA-Insight and Coveo, have libraries of connectors that they maintain, but connectors can also be obtained from systems integrators such as Search Technologies.
When a connector between two search applications fails (though often it just fails to perform as expected), the result is an interesting discussion between the vendors concerned about which end failed. Connectors can also manage security protocols, through either early binding (access controls captured when content is indexed) or late binding (access checked against the source application at query time). Matching the security models in each application will introduce some latency into the delivery of results, and this needs to be carefully managed.
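The difference between early and late binding can be illustrated with a toy example. With early binding, the permissions stored in the index at crawl time are used to filter results; with late binding, every candidate result is checked against the source application when the query runs. The documents, groups and the check_access function are invented for the sketch.

```python
# Permissions as captured at crawl time (early binding reads these).
INDEX = [
    {"id": 1, "title": "Travel policy", "allowed_groups": {"staff"}},
    {"id": 2, "title": "Board minutes", "allowed_groups": {"board"}},
    {"id": 3, "title": "Salary review", "allowed_groups": {"hr"}},
]

# Current permissions in the source system; document 3 has since been opened
# to staff, so the copy in the index is stale until the next crawl.
LIVE_ACL = {1: {"staff"}, 2: {"board"}, 3: {"hr", "staff"}}


def check_access(doc_id, user_groups):
    """Stand-in for a call to the source application (the slow part in practice)."""
    return bool(LIVE_ACL[doc_id] & user_groups)


def early_binding(user_groups):
    """Filter using permissions stored in the index: fast, but possibly out of date."""
    return [d["title"] for d in INDEX if d["allowed_groups"] & user_groups]


def late_binding(user_groups):
    """Check each result at query time: current, but one lookup per result."""
    return [d["title"] for d in INDEX if check_access(d["id"], user_groups)]


print(early_binding({"staff"}))  # ['Travel policy']
print(late_binding({"staff"}))   # ['Travel policy', 'Salary review']
```

The per-result lookup in late binding is where the latency comes from, while early binding trades that cost for the risk of showing or hiding documents on the basis of out-of-date permissions.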
This approach works well when it is possible to query a search application (for example from a commercial publisher), but not crawl and index it.
Managing Federated Search
In both options, users should still be able to query individual search applications. Make them aware that using federated search may bring about changes in ranking, and that it may not be possible to implement query suggestions or to offer the same set of filters and facets for managing long result lists.
Another challenge with federated search is making sense of the search logs, especially in the case of Option B. Adding a new (or even upgraded) search application to the list of searches or repositories in either option can make such substantial changes to the ranking of results that users should be forgiven for thinking that search is broken. A final challenge is how to cope with cloud-based applications, such as searching both on-premises SharePoint and Google Apps in the cloud.
Option A, Option B or Option C?
Option C is to not federate, but to improve the search experience in other ways. Some search applications, including SharePoint 2013, offer both Option A and Option B. Arguably the FS4SP option in SharePoint 2010 was an even better federated search application because of the power of the content processing pipeline, which largely disappeared in SP2013.
I write this column sitting on a fence: I have seen good examples of federated search as well as appalling examples. The devil is in the detail. There is no substitute for a prolonged period of both requirements gathering and proof-of-concept testing. An increasing number of commercial vendors offer some form of federated search, but take the time to read the small print, and at the end of each sentence write a short essay on "what the implications are for us." You could build a federated application in open source software, but at present this option is only for seriously brave and experienced search teams.