Lucene with Solr - Now the De Facto CMS Search Engine

Having content that no one can find, not even your own users, is useless. Every Web CMS project and company has to face this issue at some point. What to do about search?

These days, many come to exactly the same answer.

Strategic Choices

When a project team sits down to design what they're going to build or implement, they have to look at each area and decide where to reinvent the wheel and where to stand on the shoulders of giants. Which decision is right for a given project depends on many variables, such as how much time the team has, what kind of budget (if any), how many people are available to do the work, and the target market.

Search is one of those big areas where time, budget, and work hours have to be balanced.

Lucene Not Just for Web CMS

Projects like Alfresco knew from the start that full-text search would be a cornerstone of their offering. Paul Holmes-Higgins, Alfresco's VP of Engineering, says that five years ago they also knew they had to get something out as quickly as possible. So they started looking into which existing technologies would fit well into their planned top-to-bottom Java stack.

Important use cases were fleshed out. Both proprietary and open source options were tested, with the thought that they could always acquire a closed source engine and then open source it. Ultimately they felt they really had only one option, and that was Lucene, especially since its license was so flexible.

Of course, Lucene wasn't perfect. It didn't yet offer some of the features they needed. So, rather than spending work hours building an entire search solution from scratch, they extended Lucene, adding features such as letting the search server be relaxed about transactions when indexing a site rather than having to remain in lockstep. Such a feature was important for scalability.

When asked what he would choose today, he says that the answer would be the same. Lucene is so dominant that no one has created a viable alternative. Only specialized academic search projects are happening in that space. Though he also confesses that in their five years of existence, they've never considered switching. They just haven't needed to.

Many Vendors Reaping Apache Benefits

Grant Ingersoll, a member of Lucid Imagination's founding technical team and a Lucene committer, says that Drupal and TYPO3 also use Lucene, among many more, some in the form of Apache Jackrabbit.

In his own (unbiased, of course) opinion of why Lucene has become so dominant, he feels the important factors are that:

  • It's very stable
  • It has a proven search model
  • You can roll it out quickly
  • The APIs are easy to use, and made even easier with Solr
  • It has a strong and active community

Free is of course good as well, but Ingersoll says he's found that flexibility and being "white box" have turned out to be more important to those choosing Lucene instead of rolling their own or finding something else.

 
