org.ceryle.util.search
Class Indexer

java.lang.Object
  extended by java.lang.Thread
      extended by org.ceryle.util.search.Indexer
All Implemented Interfaces:
Runnable
Direct Known Subclasses:
FSIndexer, XNodeIndexer

public abstract class Indexer
extends Thread

An abstract class that extends Thread to build a search index. This acts as a base class to be extended for various index types. The list of document fields may also be extended as necessary.

Note that this class is not reentrant.

Since:
JDK1.4
Version:
$Id: Indexer.java,v 1.14 2007-06-20 04:41:59 altheim Exp $
Author:
Murray Altheim

Nested Class Summary
 
Nested classes/interfaces inherited from class java.lang.Thread
Thread.State, Thread.UncaughtExceptionHandler
 
Field Summary
static int ANALYZER_SIMPLE
          An identifier for the SimpleAnalyzer used by the search engine.
static int ANALYZER_STANDARD
          An identifier for the StandardAnalyzer used by the search engine.
static int ANALYZER_TYPE
          The Analyzer type to be used by the search engine.
protected  boolean create
          When true, indexing operations will create a new index from scratch rather than modifying an existing index.
protected static org.apache.lucene.analysis.Analyzer m_analyzer
          The Analyzer used by the search engine.
protected  File m_indexDirectory
          The directory used by the search engine for storing its index.
protected  org.apache.lucene.index.IndexWriter m_writer
          The IndexWriter used by the search engine.
protected  MessageHandler mh
           
protected  MessageWriter mw
           
protected  ProgressBar progress
           
static String SEARCH_INDEX_PATH
          The identifier of the property containing the search index path.
protected  Services srvs
           
static int THREAD_PRIORITY
          The default Thread priority (Thread.MAX_PRIORITY).
 
Fields inherited from class java.lang.Thread
MAX_PRIORITY, MIN_PRIORITY, NORM_PRIORITY
 
Constructor Summary
Indexer(String name, MessageWriter msgwriter)
          Constructor with a name identifier (used by the thread group manager) and the MessageWriter to receive indexing and search result messages.
 
Method Summary
 void closeIndex()
          Close any open index writer.
 void deleteIndex()
          Delete the built index of this Indexer.
 org.apache.lucene.analysis.Analyzer getAnalyzer()
          Return the Analyzer to be used for indexing content.
 File getIndexDirectory()
          Returns the index directory.
 org.apache.lucene.index.IndexReader getIndexReader()
          Returns an IndexReader used to restore the index.
 org.apache.lucene.index.IndexWriter getIndexWriter()
          Return the IndexWriter used by this Indexer.
 MessageHandler getMessageHandler()
          Returns the MessageHandler, which receives general informational, log, warning and error messages.
 MessageWriter getMessageWriter()
          Returns the MessageWriter, which displays index- and search-related messages directly to the user.
 ProgressBar getProgressBar()
          Returns the optional ProgressBar, or null if it is not set.
static String[] getStopWords()
          Return the stop words being used by the analyzer.
 long getVersion()
          Returns the index version number, -1 if it has not been set.
 boolean hasProgressBar()
          Returns true if the optional ProgressBar has been set.
 void index(boolean createIndex)
          Begins an indexing process.
 boolean isCreateIndex()
          Returns true if the index should be created/recreated when the index is built/rebuilt, rather than just modified.
 boolean isIndexed()
          Returns true if Indexer indicates an index completed state.
 void mergeIndex(Indexer indexer)
          Merge the index provided by indexer with this index.
abstract  boolean restoreIndex()
          Restore the existing index after opening a new session, returning true if successful.
abstract  void run()
           
 void setCreateIndex(boolean createIndex)
          Sets the property that, when true, indicates that indexes should be built from scratch rather than just modified.
 void setIndexDirectory(File directory)
          Sets the directory used to store the index.
protected  void setIsIndexed(boolean indexed)
          Set the boolean value indicating an index completed/ready state.
 void setProgressBar(ProgressBar progressBar)
          Sets an optional ProgressBar to indicate process status.
 
Methods inherited from class java.lang.Thread
activeCount, checkAccess, countStackFrames, currentThread, destroy, dumpStack, enumerate, getAllStackTraces, getContextClassLoader, getDefaultUncaughtExceptionHandler, getId, getName, getPriority, getStackTrace, getState, getThreadGroup, getUncaughtExceptionHandler, holdsLock, interrupt, interrupted, isAlive, isDaemon, isInterrupted, join, join, join, resume, setContextClassLoader, setDaemon, setDefaultUncaughtExceptionHandler, setName, setPriority, setUncaughtExceptionHandler, sleep, sleep, start, stop, stop, suspend, toString, yield
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

SEARCH_INDEX_PATH

public static final String SEARCH_INDEX_PATH
The identifier of the property containing the search index path.

See Also:
Constant Field Values

ANALYZER_SIMPLE

public static final int ANALYZER_SIMPLE
An identifier for the SimpleAnalyzer used by the search engine.

See Also:
Constant Field Values

ANALYZER_STANDARD

public static final int ANALYZER_STANDARD
An identifier for the StandardAnalyzer used by the search engine.

See Also:
Constant Field Values

ANALYZER_TYPE

public static int ANALYZER_TYPE
The Analyzer type to be used by the search engine. The default value is ANALYZER_STANDARD.


m_writer

protected org.apache.lucene.index.IndexWriter m_writer
The IndexWriter used by the search engine.


m_analyzer

protected static org.apache.lucene.analysis.Analyzer m_analyzer
The Analyzer used by the search engine.


m_indexDirectory

protected File m_indexDirectory
The directory used by the search engine for storing its index.


THREAD_PRIORITY

public static int THREAD_PRIORITY
The default Thread priority (Thread.MAX_PRIORITY).


create

protected boolean create
When true, indexing operations will create a new index from scratch rather than modifying an existing index. Default is true.


progress

protected ProgressBar progress

srvs

protected Services srvs

mw

protected MessageWriter mw

mh

protected MessageHandler mh

Constructor Detail

Indexer

public Indexer(String name,
               MessageWriter msgwriter)
Constructor with a name identifier (used by the thread group manager) and the MessageWriter to receive indexing and search result messages.

Method Detail

index

public void index(boolean createIndex)
Begins an indexing process. If createIndex is true, any existing index will be overwritten. This causes the VM to call Thread's run() method, and should be called in lieu of calling run() directly.
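
The relationship between index() and run() described above can be sketched as follows. This is a minimal illustration, not the Ceryle source: StartableIndexer and IndexDemo are hypothetical stand-ins, and the body of index() is an assumption based only on the description (record the flag, then start the thread so that the VM calls run()).

```java
// Hypothetical stand-in for Indexer; the real implementation is not shown on this page.
abstract class StartableIndexer extends Thread {

    // Mirrors the documented 'create' field, whose default is true.
    protected boolean create = true;

    StartableIndexer(String name) {
        super(name);
    }

    // Assumed behavior: store the flag, then call start() so that the VM
    // invokes run() on the new thread. Callers never call run() directly.
    public void index(boolean createIndex) {
        this.create = createIndex;
        start();
    }
}

class IndexDemo {

    // Starts a trivial indexer and returns the name of the thread that executed run().
    static String runOnce() {
        final String[] ranOn = new String[1];
        StartableIndexer indexer = new StartableIndexer("demo-indexer") {
            @Override
            public void run() {
                ranOn[0] = Thread.currentThread().getName();
            }
        };
        indexer.index(true);
        try {
            indexer.join(); // wait for the indexing thread to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ranOn[0];
    }
}
```

Calling run() directly would execute the indexing work on the caller's thread, which is why the sketch routes everything through index().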


run

public abstract void run()
Specified by:
run in interface Runnable
Overrides:
run in class Thread

getIndexReader

public org.apache.lucene.index.IndexReader getIndexReader()
                                                   throws IOException
Returns an IndexReader used to restore the index.

Throws:
IOException

setIsIndexed

protected final void setIsIndexed(boolean indexed)
Set the boolean value indicating an index completed/ready state. Setting this false sets the index version number to -1.


isIndexed

public final boolean isIndexed()
Returns true if Indexer indicates an index completed state. This should be set false upon starting an index operation, and true by the index Thread upon completion. Subclasses are responsible for setting this member variable.
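
The state handling described for setIsIndexed, isIndexed and getVersion might look roughly like the sketch below. The class names and the setVersion helper are hypothetical, and the version value 1 is a placeholder; only the rule that setting the state to false resets the version to -1 comes from this page.

```java
// Hypothetical stand-in for the state-related members of Indexer.
abstract class StatefulIndexer extends Thread {

    private volatile boolean indexed = false;
    private volatile long version = -1L; // -1 means "not set", as documented

    protected final void setIsIndexed(boolean indexed) {
        this.indexed = indexed;
        if (!indexed) {
            version = -1L; // documented: setting false resets the version
        }
    }

    // Assumption: some way for subclasses to record a version; not part of the documented API.
    protected final void setVersion(long version) {
        this.version = version;
    }

    public final boolean isIndexed() {
        return indexed;
    }

    public final long getVersion() {
        return version;
    }
}

// A subclass is responsible for flipping the state around its indexing work.
class LifecycleIndexer extends StatefulIndexer {
    @Override
    public void run() {
        setIsIndexed(false);   // starting: the index is not ready
        // ... indexing work would happen here ...
        setVersion(1L);        // placeholder version value
        setIsIndexed(true);    // finished: the index is ready
    }
}
```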


getVersion

public final long getVersion()
Returns the index version number, -1 if it has not been set.


hasProgressBar

public final boolean hasProgressBar()
Returns true if the optional ProgressBar has been set.


setProgressBar

public final void setProgressBar(ProgressBar progressBar)
Sets an optional ProgressBar to indicate process status.


getProgressBar

public final ProgressBar getProgressBar()
Returns the optional ProgressBar, or null if it is not set.


getMessageWriter

public final MessageWriter getMessageWriter()
Returns the MessageWriter, which displays index- and search-related messages directly to the user.


getMessageHandler

public final MessageHandler getMessageHandler()
Returns the MessageHandler, which receives general informational, log, warning and error messages. This can also be accessed by referring directly to the member variable mh, as in mh.error("an error has occurred").


setIndexDirectory

public void setIndexDirectory(File directory)
                       throws IOException
Sets the directory used to store the index. This is an optional method, as the directory should probably be set via a property within the implementing class. Because the property may be obtained external to the implementing class, this method is provided. The method in this abstract class is empty so as not to force an implementation. Storage of the member variable must occur in the implementing class.
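
Because the base method is empty, an implementing class has to store the directory itself. The sketch below shows one way this could be done. DirIndexerBase and DirIndexer are hypothetical names, and the validation rules and the null return in the base class are assumptions, not documented behavior.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical stand-in for the directory-related methods of Indexer.
abstract class DirIndexerBase extends Thread {

    // Intentionally empty, as documented, so subclasses are not forced to implement it.
    public void setIndexDirectory(File directory) throws IOException {
    }

    // Assumption: the base class has nothing to return; subclasses override this.
    public File getIndexDirectory() throws IOException {
        return null;
    }
}

// A subclass that stores the directory in its own member variable.
class DirIndexer extends DirIndexerBase {

    private File indexDirectory;

    @Override
    public void setIndexDirectory(File directory) throws IOException {
        // Assumed validation: reject anything that is not an existing directory.
        if (directory == null || !directory.isDirectory()) {
            throw new IOException("not a directory: " + directory);
        }
        this.indexDirectory = directory;
    }

    @Override
    public File getIndexDirectory() throws IOException {
        if (indexDirectory == null) {
            throw new IOException("index directory has not been set");
        }
        return indexDirectory;
    }
}
```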

Throws:
IOException

getIndexDirectory

public File getIndexDirectory()
                       throws IOException
Returns the index directory. This method should be subclassed to return the index directory suitable for the index type.

Throws:
IOException

restoreIndex

public abstract boolean restoreIndex()
Restore the existing index after opening a new session, returning true if successful.


setCreateIndex

public final void setCreateIndex(boolean createIndex)
Sets the property that, when true, indicates that indexes should be built from scratch rather than just modified.


isCreateIndex

public final boolean isCreateIndex()
Returns true if the index should be created/recreated when the index is built/rebuilt, rather than just modified.


getIndexWriter

public org.apache.lucene.index.IndexWriter getIndexWriter()
Return the IndexWriter used by this Indexer.


deleteIndex

public void deleteIndex()
Delete the built index of this Indexer.


mergeIndex

public void mergeIndex(Indexer indexer)
Merge the index provided by indexer with this index.


closeIndex

public void closeIndex()
Close any open index writer.


getAnalyzer

public org.apache.lucene.analysis.Analyzer getAnalyzer()
Return the Analyzer to be used for indexing content. This is a static, stateless object and can be reused. This method will default to returning a SimpleAnalyzer on unrecognized values.
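
The fallback rule can be illustrated with a small selection routine. This sketch avoids the Lucene classes and returns the analyzer's name as a String. The numeric values given to the two constants are assumptions, since this page does not list them.

```java
// Hypothetical illustration of selecting an analyzer by type identifier.
class AnalyzerChoice {

    // Assumed values; the real constants are listed under "Constant Field Values".
    static final int ANALYZER_SIMPLE = 0;
    static final int ANALYZER_STANDARD = 1;

    // Returns the name of the analyzer that would be chosen for the given type.
    static String analyzerNameFor(int type) {
        switch (type) {
            case ANALYZER_STANDARD:
                return "StandardAnalyzer";
            case ANALYZER_SIMPLE:
            default:
                // Unrecognized values fall back to the SimpleAnalyzer, as documented.
                return "SimpleAnalyzer";
        }
    }
}
```

With these assumed values, analyzerNameFor(AnalyzerChoice.ANALYZER_STANDARD) selects the StandardAnalyzer, and any unrecognized identifier such as 99 selects the SimpleAnalyzer.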


getStopWords

public static String[] getStopWords()
Return the stop words being used by the analyzer. This is language-dependent, and for non-English languages must be set via the ResourceBundle (see MessageId.STOP_WORDS). For the recognized English-language Locales (ENGLISH, US, UK, and CANADA), this method returns null, since the default Analyzer constructor takes no parameters.
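
The Locale check described above could be sketched as follows. StopWordLookup is a hypothetical name, and the bundleValue parameter stands in for the ResourceBundle entry identified by MessageId.STOP_WORDS. The whitespace-separated format and the empty array returned for a missing entry are assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration of a locale-dependent stop word lookup.
class StopWordLookup {

    // The English-language Locales named in the method description.
    private static final List<Locale> ENGLISH_LOCALES =
            Arrays.asList(Locale.ENGLISH, Locale.US, Locale.UK, Locale.CANADA);

    // bundleValue stands in for the ResourceBundle entry; its format is assumed
    // to be a whitespace-separated list of words.
    static String[] stopWordsFor(Locale locale, String bundleValue) {
        if (ENGLISH_LOCALES.contains(locale)) {
            return null; // documented: English locales use the analyzer's built-in defaults
        }
        if (bundleValue == null || bundleValue.trim().isEmpty()) {
            return new String[0]; // assumption: no entry means no stop words
        }
        return bundleValue.trim().split("\\s+");
    }
}
```

A caller would then treat a null result as a signal to construct the analyzer with its no-argument constructor.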



The Ceryle Project. Copyright ©2001-2007 Murray Altheim, All Rights Reserved. See LICENSE included with distribution.