Faceted search, or guided navigation, has become the de facto standard for e-commerce and product-related websites. And e-commerce sites aren’t the only ones joining the club: other content-heavy sites such as media publishers, libraries and non-profits are tapping into facets to create an optimal search experience. Faceted search is also hitting its stride behind the firewall, showing up in intranets and enterprise search applications.
But what do you do when you don’t have the budget to get one of those fancy commercial search applications from the large specialized vendors, like Endeca or FAST?
This is the situation I recently found myself in on a project with an international organization. We were trying to improve their online document search and quickly realized that facets were the way to go. We just didn’t have the resources to consider any of the usual vendors that I might typically recommend to an e-retailer.
Some excellent open-source options have matured in recent years, now powering some of the best-in-show faceted search applications out there.
The most popular open-source faceted search platform is Apache Solr, built on Lucene, a full-text search library also from Apache. Solr includes a simple faceting toolkit to create a basic faceted search experience, but it can also be easily extended to implement more advanced interfaces. All you need is a savvy IT department that is comfortable getting its hands a bit dirty. You can also hire consultants to do the dirty work for you, often at a lower cost than you would pay for a commercial enterprise search license and maintenance fee.
Examples of organizations using Solr for faceting include the new FCC.gov site, the Smithsonian, Netflix and CNET. (Full list here.)
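To give a rough sense of what Solr's built-in faceting involves, here is a minimal Python sketch that builds a facet request and tidies up the counts that come back. The parameters `facet`, `facet.field`, `facet.mincount`, `fq` and `wt` are standard Solr query parameters; the host, the `documents` core and the field names `doc_type` and `language` are hypothetical and would come from your own schema. The sketch does not escape quotes inside filter values, so treat it as an illustration rather than production code.

```python
from urllib.parse import urlencode


def build_facet_query(base_url, text, facet_fields, filters=None):
    """Build a Solr /select URL that asks for facet counts.

    base_url     -- URL of a Solr core (hypothetical in this sketch)
    text         -- the user's search terms
    facet_fields -- fields to count values for, one facet per field
    filters      -- facet values the user has already clicked, as {field: value}
    """
    params = [
        ("q", text),
        ("facet", "true"),        # turn faceting on
        ("facet.mincount", "1"),  # hide values with zero matches
        ("wt", "json"),
    ]
    # One facet.field parameter per facet shown in the interface.
    params += [("facet.field", field) for field in facet_fields]
    # Each selected facet value becomes a filter query that narrows the results.
    for field, value in (filters or {}).items():
        params.append(("fq", f'{field}:"{value}"'))
    return f"{base_url}/select?{urlencode(params)}"


def pair_facet_counts(flat):
    """Turn Solr's default flat list [value, count, value, count, ...] into a dict."""
    return dict(zip(flat[0::2], flat[1::2]))


url = build_facet_query(
    "http://localhost:8983/solr/documents",  # hypothetical core
    "water policy",
    ["doc_type", "language"],                # hypothetical fields
    filters={"language": "English"},
)
```

The resulting URL can be fetched with any HTTP client. In the JSON response, the counts for each field appear under `facet_counts` and then `facet_fields`, and `pair_facet_counts` turns each of those flat lists into something easier to render as a list of clickable values.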
Sphinx is another open-source faceting option, designed with indexing database content in mind. It’s known for its scalability and real-time indexing (it powers Craigslist, at over 50 million queries/day). Bloggers with firsthand Solr and Sphinx experience do mention, however, that Sphinx’s facet support is a bit less “out of the box,” taking more effort to achieve. You can find its list of “powered by” sites here.
Commercial support is available for both Solr and Sphinx, but if you’re looking to invest time in either application, make sure to compare their features and quirks before choosing.
Drupal Open-Source Content Management System
Another option, if you’re starting a new website project, is to use the Drupal open-source content management system. There are a few different approaches, including some add-on modules to the CMS itself (a few are now deprecated, however), as well as a search API module that allows you to apply faceting. The search module can also be used with a variety of backend search engines (including Solr).
Still Tread Carefully with Ease and Cost of Using Open-Source
So if you’re dreaming of a faceted search interface (aren’t we all?) but don’t have the funding for an out-of-the-box solution from the large vendors, consider an open-source solution. Keep in mind that open-source does not automatically equal easy or cheap: if you have complex requirements and end up hiring integrators to implement your solution, you still might end up with a somewhat pricey project. But open-source solutions do have large communities behind them, so development can be faster and less expensive than vendor professional services.
Finally, as a taxonomist it’s my civic duty to remind you that faceted search is fueled by taxonomy and metadata, so if you don’t have either of those things well in hand, reserve some budget to create the necessary information structures the search will require.
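To see why, consider what a facet actually is: a count of documents per metadata value. The Python sketch below is not how Solr or Sphinx computes facets internally; it is a toy illustration, using made-up documents and hypothetical field names (`doc_type`, `topics`), of how untagged content simply drops out of the picture.

```python
from collections import Counter

# Made-up records; "doc_type" and "topics" are hypothetical metadata fields.
documents = [
    {"title": "Annual Report", "doc_type": "report", "topics": ["water", "health"]},
    {"title": "Field Memo", "doc_type": "memo", "topics": ["water"]},
    {"title": "Untitled Scan"},  # no metadata at all
    {"title": "Budget Review", "doc_type": "report"},
]


def facet_counts(docs, field):
    """Count how many documents carry each value of a metadata field."""
    counts = Counter()
    for doc in docs:
        value = doc.get(field)
        if value is None:
            continue  # untagged documents never show up under this facet
        # A field may hold a single value or a list of values.
        values = value if isinstance(value, list) else [value]
        counts.update(values)
    return counts


type_facet = facet_counts(documents, "doc_type")
topic_facet = facet_counts(documents, "topics")
```

Here "Untitled Scan" appears under neither facet and "Budget Review" is missing from the topic facet, so a user who filters by topic would never find them. The search engine can only count what the metadata provides.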