As you know, a search has two components: indexing and searching. webhelpindexer takes care of indexing the contents. If you are looking for how to invoke the webhelp indexer, have a look at the "index" target in the build.xml file of the DocBook webhelp transform (i.e. xsl/webhelp/build.xml). I hope you are familiar with Ant targets.
Do note that webhelpindexer is for XHTML transforms. It should work on HTML transforms too if your HTML files are XML-compatible, though I haven't tested that.
You can follow the whole process in the Ant build.xml file. But to give a brief description of how to invoke the indexer from the command line:
- You need to have the following in your CLASSPATH:
  - webhelpindexer.jar, lucene-analyzers-3.0.0.jar, lucene-core-3.0.0.jar - These three are available in the extensions/ directory of docbook-xsl-1.76.1. If you can, use an XSL snapshot, which contains the latest version: http://docbook.sourceforge.net/snapshot/
  - xercesImpl.jar, xml-apis.jar - These two are available in the /usr/share/java directory on Linux distributions. Otherwise, you can download them.
- The main class is com.nexwave.nquindexer.IndexerMain
- Give two parameters as command-line arguments:
  - The folder containing the files that need to be indexed
  - (Optional) The language. Defaults to "en". See build.properties for details.
- You need to wrap the HTML content that needs to be indexed in a <div> tag with the id "content", i.e. <div id="content"> ... all the HTML content except the TOC, index, etc. </div>
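Before indexing, it may help to check which files are still missing that wrapper. The following is only a rough sketch: list_unwrapped is a made-up helper name, it assumes your output files end in .html, and it only recognises the double-quoted form id="content".

```shell
# List the HTML files in a directory that lack the id="content" wrapper.
# Heuristic only: id='content' (single quotes) would be reported as missing.
list_unwrapped() {
    dir="$1"
    # grep -L prints the names of files that contain no match.
    grep -L 'id="content"' "$dir"/*.html
}
```

For example, list_unwrapped /path/to/content prints one file name per line; empty output means every file matched.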
The following is the full command to run the indexer:
java -cp webhelpindexer.jar:lucene-analyzers-3.0.0.jar:lucene-core-3.0.0.jar:/usr/share/java/xercesImpl.jar:/usr/share/java/xml-apis.jar com.nexwave.nquindexer.IndexerMain "/home/kasun/docbook/repository/trunk/xsl/webhelp/docs/content" "en"
That's all for the indexing part. This will create a directory, search/, which will contain the index.
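If you run the indexer often, the same invocation could be wrapped in a small script along these lines. This is just a sketch: EXT_DIR, JAVA_LIB_DIR, build_classpath and run_indexer are names I made up, and the default paths are assumptions you will need to adjust to wherever the jars live on your machine.

```shell
# Assumed locations of the jars; override these to match your installation.
EXT_DIR="${EXT_DIR:-docbook-xsl-1.76.1/extensions}"
JAVA_LIB_DIR="${JAVA_LIB_DIR:-/usr/share/java}"

# Join the five required jars into a colon-separated classpath
# (the separator is ';' instead of ':' on Windows).
build_classpath() {
    printf '%s:%s:%s:%s:%s\n' \
        "$EXT_DIR/webhelpindexer.jar" \
        "$EXT_DIR/lucene-analyzers-3.0.0.jar" \
        "$EXT_DIR/lucene-core-3.0.0.jar" \
        "$JAVA_LIB_DIR/xercesImpl.jar" \
        "$JAVA_LIB_DIR/xml-apis.jar"
}

# Run the indexer: $1 is the content folder, $2 the optional language.
run_indexer() {
    content_dir="$1"
    lang="${2:-en}"   # same default as the indexer itself
    java -cp "$(build_classpath)" com.nexwave.nquindexer.IndexerMain \
        "$content_dir" "$lang"
}
```

After sourcing the file, run_indexer /path/to/docs/content en does the same thing as the long command above.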