SearchConfig.xml

The format of the search configuration is shown below:

<config>
  <analyzer>org.apache.lucene.analysis.standard.StandardAnalyzer</analyzer>
  <maxFieldLength>25000</maxFieldLength>
  <directory>file:/path/to/some/directory/</directory>
  <LuceneVersion>LUCENE_34</LuceneVersion>
</config>

This shows the default settings for analyzer and maxFieldLength.

analyzer is the name of the class of the Lucene Analyzer to use for analysis of resources to be indexed. The default is the StandardAnalyzer.

This module also provides a Porter stemming analyzer class for English-language text: org.netkernel.text.search.endpoint.PorterStemmingAnalyzer
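As a sketch, a configuration that selects the stemming analyzer instead of the default might look like the following. The element names and the class name are the ones given above; the directory path is a placeholder, not a real location.

```xml
<config>
  <!-- Use the Porter stemming analyzer from this module rather than the default StandardAnalyzer -->
  <analyzer>org.netkernel.text.search.endpoint.PorterStemmingAnalyzer</analyzer>
  <maxFieldLength>25000</maxFieldLength>
  <!-- Placeholder path: substitute your own index directory -->
  <directory>file:/path/to/some/directory/</directory>
</config>
```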

maxFieldLength is the maximum number of terms that will be indexed for a single field in a document. This limits the amount of memory required for indexing, so that collections with very large files will not crash the indexing process by running out of memory. The limit applies to the number of running terms, not to the number of distinct terms: a term that occurs ten times in a field counts ten times toward the limit.

directory is optional; it may also be specified as an argument when invoking a text search service. If specified, it must be a file: URI that refers to a directory (i.e. it must end with a trailing slash).

If the directory does not exist it will be created automatically when a textIndex or textIndexBatch is performed.
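The trailing-slash requirement is easy to miss, so here is a minimal sketch of a configuration with a directory set. The path file:/var/data/search-index/ is hypothetical and used only for illustration.

```xml
<config>
  <analyzer>org.apache.lucene.analysis.standard.StandardAnalyzer</analyzer>
  <maxFieldLength>25000</maxFieldLength>
  <!-- Hypothetical location. Note the file: scheme and the trailing slash.
       Leave this element out if the directory is passed as an argument instead. -->
  <directory>file:/var/data/search-index/</directory>
</config>
```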

LuceneVersion is optional and defaults to LUCENE_34. Its value must be the name of one of the enum constants defined by the Lucene Version class.
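For illustration, a configuration that pins the version to an earlier release might look like the sketch below. It assumes that the Lucene library bundled with the module defines the LUCENE_33 constant in its Version class; check the constants available in your installation before using it.

```xml
<config>
  <analyzer>org.apache.lucene.analysis.standard.StandardAnalyzer</analyzer>
  <maxFieldLength>25000</maxFieldLength>
  <!-- Assumed constant: must match a name in the Lucene Version enum -->
  <LuceneVersion>LUCENE_33</LuceneVersion>
</config>
```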