Lucene Finds its Way to the Top
The Apache Software Foundation (ASF) has recently reclassified the Lucene search engine project from a Jakarta sub-project to a top-level ASF effort.
Lucene is a full-text search engine that provides an API and a set of libraries for adding powerful search functionality to all types of Java applications. Doug Cutting is the project's primary developer.
Lucene is offered as a developer toolkit, and requires a certain amount of Java development to implement or integrate a functional search solution.
As an example, for web search, a developer would need to write their own web site spider that populates the Lucene index with Lucene documents.
On the retrieval side, the developer would then need to provide a form handler and query parser that call into the Lucene API for search hits and format the results for web presentation.
Given this, it's best to think of Lucene as a developer resource and not as a ready-to-run search engine.
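To make the two halves concrete, here is a minimal sketch in plain Java. It is a toy stand-in and does not use Lucene's API: the class name ToySearchIndex and its methods are hypothetical, invented for illustration. It only shows the division of labor described above, with an indexing call that a spider would make for each page and a search call that a form handler would make with the user's input.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy stand-in for a search toolkit. This is NOT Lucene's API; the names are
// hypothetical and only mirror the indexing/retrieval split.
final class ToySearchIndex {
    // term -> ids of the documents containing that term (an inverted index)
    private final Map<String, Set<String>> postings = new LinkedHashMap<>();

    // Indexing side: what a spider would call for each page it fetches.
    void addDocument(String id, String text) {
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, t -> new LinkedHashSet<>()).add(id);
        }
    }

    // Retrieval side: what a form handler would call with the user's input.
    // Returns the ids of documents that contain every query term.
    List<String> search(String query) {
        Set<String> hits = null;
        for (String term : tokenize(query)) {
            Set<String> docs = postings.getOrDefault(term, Collections.emptySet());
            if (hits == null) {
                hits = new LinkedHashSet<>(docs);
            } else {
                hits.retainAll(docs);
            }
        }
        return hits == null ? Collections.emptyList() : new ArrayList<>(hits);
    }

    // Lowercase the text and split on anything that is not a letter or digit.
    private static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        ToySearchIndex index = new ToySearchIndex();
        index.addDocument("/about", "About the Apache Software Foundation");
        index.addDocument("/lucene", "Lucene is an Apache search library");
        System.out.println(index.search("apache search")); // prints [/lucene]
    }
}
```

A real integration would replace this class with Lucene's own indexing and searching classes; the spider, the form handler, and the result formatting would remain the developer's responsibility.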
There are several ports of Lucene to other languages. Of note are DotLucene (C# .NET) and Plucene (Perl).
Plucene is currently used by Technorati, is embedded in the Eclipse IDE, and is part of www.furl.com's tools.
2 Reader Comments


It's Lucene, not Plucene, that's used by Technorati. The developer does not have to provide a query parser; Lucene has a good default query parser.
Thanks for the correction.
You are right that there is a query parser as part of Lucene and Plucene. However, there are some very common query syntax expressions that will cause problems with that parser. I strongly doubt one would put the "stock" parser into production.
In my experience, it's a much more common practice to implement an intermediary parser that handles more syntax cases and is often tuned to what the given audience needs or expects.
-Brice
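
As a rough illustration of the intermediary parser idea raised in the comments, the sketch below cleans up raw user input before it would be handed to a query parser. It is plain Java and makes several assumptions: the class name QueryPreParser is hypothetical, the list of special characters is an approximation of the operator characters in Lucene's query syntax, and the policy of treating every operator as literal text while keeping quoted phrases is just one possible tuning for a non-technical audience.

```java
// Hypothetical pre-processing step that runs before a real query parser.
// Assumed policy: quoted phrases are kept, every other operator character is
// escaped so that it is treated as literal text.
final class QueryPreParser {
    // Approximate set of operator characters; an assumption, not an official list.
    private static final String SPECIAL = "+-&|!(){}[]^~*?:/";

    static String sanitize(String raw) {
        String input = raw.trim();

        // An odd number of quotes means the last one has no partner.
        int quoteCount = 0;
        int lastQuote = -1;
        for (int i = 0; i < input.length(); i++) {
            if (input.charAt(i) == '"') {
                quoteCount++;
                lastQuote = i;
            }
        }
        int unmatchedQuote = (quoteCount % 2 == 1) ? lastQuote : -1;

        StringBuilder out = new StringBuilder();
        boolean inPhrase = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '"' && i == unmatchedQuote) {
                out.append("\\\"");            // dangling quote becomes a literal
            } else if (c == '"') {
                inPhrase = !inPhrase;          // balanced quote opens or closes a phrase
                out.append(c);
            } else if (c == '\\') {
                out.append("\\\\");            // a backslash is always made literal
            } else if (!inPhrase && SPECIAL.indexOf(c) >= 0) {
                out.append('\\').append(c);    // neutralize operators outside phrases
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(sanitize("c++ (beta"));   // prints c\+\+ \(beta
        System.out.println(sanitize("say \"hello")); // prints say \"hello
    }
}
```

Lucene's own query parser also ships a static escape helper, so a production version would more likely build on that than on a hand-maintained character list.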