Searching for Information with PHP, Java and Apache Lucene

Posted by AntPhill on October 28th, 2008.

Search is a fascinating and highly active area in computing. Finding relevant information is, of course, more than just a web browser problem. It affects businesses and organizations the world over. One such business is ZSL Inc.

ZSL were finding it hard to share information between development teams. In particular, teams consistently struggled to locate existing software assets within the company and consequently kept re-inventing the wheel, with the associated time and cost implications!

To solve this problem ZSL developed, in just a few weeks, a sMash application to catalog, index and search their source code and documentation library (PDF, PowerPoint, Word, Excel, and many others). The technology behind this solution is sMash PHP and the Apache Lucene search libraries.
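To give a feel for what "catalog, index and search" means in practice, here is a minimal sketch of an inverted index in plain Java. This is not ZSL's code and it does not use the Lucene API; the class name `TinyIndex`, its methods and the sample document names are all made up for illustration. Lucene builds on the same basic idea (a map from terms to the documents containing them) and adds ranking, analyzers, on-disk storage and much more.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy index: illustrates the idea only, not the Lucene API.
class TinyIndex {
    // term -> ids of the documents that contain the term
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Index one document: record its id under every term in its text.
    public void add(String docId, String text) {
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    // Return the ids of documents containing ALL terms of the query.
    public Set<String> search(String query) {
        Set<String> result = null;
        for (String term : tokenize(query)) {
            Set<String> docs = postings.getOrDefault(term, Collections.emptySet());
            if (result == null) {
                result = new TreeSet<>(docs);   // first term seeds the result
            } else {
                result.retainAll(docs);         // later terms narrow it down
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    // Lower-case the text and split on anything that is not a letter or digit.
    private static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        // Invented sample assets standing in for a company's code and documents.
        index.add("billing-lib/README", "Java library for invoice and billing calculations");
        index.add("report-tool/design.doc", "Design notes for the PDF report generator");
        index.add("invoice-pdf/notes", "Generating invoice PDF files from Java");
        System.out.println(index.search("invoice java"));
    }
}
```

A real document library would also need text extraction from formats such as PDF or Word before indexing, which this sketch leaves out entirely.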

There are several search libraries written in PHP. However, search is a highly compute-intensive task, and implementations in dynamic languages such as PHP typically perform poorly in comparison to those in languages like Java. ZSL opted to maximize performance and used the Java Lucene libraries.

The following developerWorks article shows how to use the PHP-to-Java bridge, with the very same Apache Lucene Java libraries as its case study.
