CLucene - a full-featured, c++ search engine
API Documentation


lucene::index::IndexReader Class Reference

IndexReader is an abstract class, providing an interface for accessing an index.

#include <IndexReader.h>

Inheritance diagram for lucene::index::IndexReader:

lucene::index::MultiReader

Public Types

enum  FieldOption {
  ALL = 1, INDEXED = 2, UNINDEXED = 4, INDEXED_WITH_TERMVECTOR = 8,
  INDEXED_NO_TERMVECTOR = 16, TERMVECTOR = 32, TERMVECTOR_WITH_POSITION = 64, TERMVECTOR_WITH_OFFSET = 128,
  TERMVECTOR_WITH_POSITION_OFFSET = 256, STORES_PAYLOADS = 512
}
typedef void(* CloseCallback )(IndexReader *, void *)

Public Member Functions

virtual void doCommit ()=0
 Internal use.
virtual void commit ()
 Do not call this directly; it is public only so that MultiReader can access it.
void undeleteAll ()
 Undeletes all documents currently marked as deleted in this index.
virtual void getFieldNames (FieldOption fldOption, StringArrayWithDeletor &retarray)=0
 Get a list of unique field names that exist in this index and have the specified field option information.
virtual TCHAR ** getFieldNames ()
virtual TCHAR ** getFieldNames (bool indexed)
virtual uint8_t * norms (const TCHAR *field)=0
 Returns the byte-encoded normalization factor for the named field of every document.
virtual void norms (const TCHAR *field, uint8_t *bytes)=0
 Reads the byte-encoded normalization factor for the named field of every document.
void setNorm (int32_t doc, const TCHAR *field, float_t value)
 Expert: Resets the normalization factor for the named field of the named document.
void setNorm (int32_t doc, const TCHAR *field, uint8_t value)
 Expert: Resets the normalization factor for the named field of the named document.
virtual ~IndexReader ()
 Release the write lock, if needed.
int64_t getVersion ()
 Version number when this IndexReader was opened.
bool isCurrent ()
 Check whether this IndexReader still works on a current version of the index.
virtual bool getTermFreqVectors (int32_t docNumber, lucene::util::Array< TermFreqVector * > &result)=0
 Return an array of term frequency vectors for the specified document.
virtual TermFreqVector * getTermFreqVector (int32_t docNumber, const TCHAR *field)=0
 Return a term frequency vector for the specified document and field.
virtual int32_t numDocs ()=0
 Returns the number of documents in this index.
virtual int32_t maxDoc () const =0
 Returns one greater than the largest possible document number.
virtual bool document (int32_t n, lucene::document::Document *)=0
 Gets the stored fields of the nth Document in this index.
lucene::document::Document * document (const int32_t n)
virtual bool isDeleted (const int32_t n)=0
 Returns true if document n has been deleted.
virtual bool hasDeletions () const =0
 Returns true if any documents have been deleted.
virtual bool hasNorms (const TCHAR *field)
 Returns true if there are norms stored for this field.
virtual TermEnum * terms () const =0
 Returns an enumeration of all the terms in the index.
virtual TermEnum * terms (const Term *t) const =0
 Returns an enumeration of all terms after a given term.
virtual int32_t docFreq (const Term *t) const =0
 Returns the number of documents containing the term t.
virtual TermPositions * termPositions () const =0
TermPositions * termPositions (Term *term) const
 Returns an enumeration of all the documents which contain term.
virtual TermDocs * termDocs () const =0
 Returns an unpositioned TermDocs enumerator.
TermDocs * termDocs (Term *term) const
 Returns an enumeration of all the documents which contain term.
void deleteDocument (const int32_t docNum)
 Deletes the document numbered docNum.
void deleteDoc (const int32_t docNum)
int32_t deleteDocuments (Term *term)
 Deletes all documents containing term.
int32_t deleteTerm (Term *term)
void close ()
 Closes files associated with this index and also saves any new deletions to disk.
lucene::store::Directory * getDirectory ()
 Returns the directory this index resides in.
void addCloseCallback (CloseCallback callback, void *parameter)
 Classes that need to know when the IndexReader closes (such as caches) should pass their callback function to this method.

Static Public Member Functions

static IndexReader * open (const char *path)
 Returns an IndexReader reading the index in an FSDirectory in the named path.
static IndexReader * open (lucene::store::Directory *directory, bool closeDirectory=false)
 Returns an IndexReader reading the index in the given Directory.
static uint64_t lastModified (const char *directory)
 Returns the time the index in the named directory was last modified.
static uint64_t lastModified (const lucene::store::Directory *directory)
 Returns the time the index in the named directory was last modified.
static int64_t getCurrentVersion (lucene::store::Directory *directory)
 Reads version number from segments files.
static int64_t getCurrentVersion (const char *directory)
 Reads version number from segments files.
static bool indexExists (const char *directory)
 Returns true if an index exists at the specified directory.
static bool indexExists (const lucene::store::Directory *directory)
 Returns true if an index exists at the specified directory.
static bool isLocked (lucene::store::Directory *directory)
 Checks if the index in the named directory is currently locked.
static bool isLocked (const char *directory)
 Checks if the index in the named directory is currently locked.
static void unlock (lucene::store::Directory *directory)
 Forcibly unlocks the index in the named directory.
static void unlock (const char *path)
static bool isLuceneFile (const char *filename)
 Returns true if the file is a lucene filename (based on extension or filename).

Data Fields

Internal * internal
 Internal use.

Protected Member Functions

 IndexReader (lucene::store::Directory *dir)
 Constructor used if IndexReader is not owner of its directory.
 IndexReader (lucene::store::Directory *directory, SegmentInfos *segmentInfos, bool closeDirectory)
 Constructor used if IndexReader is owner of its directory.
virtual void doClose ()=0
 Implements close.
virtual void doSetNorm (int32_t doc, const TCHAR *field, uint8_t value)=0
 Implements setNorm in subclass.
virtual void doUndeleteAll ()=0
 Implements actual undeleteAll() in subclass.
virtual void doDelete (const int32_t docNum)=0
 Implements deletion of the document numbered docNum.

Detailed Description

IndexReader is an abstract class, providing an interface for accessing an index.

Search of an index is done entirely through this abstract interface, so that any subclass which implements it is searchable.

Concrete subclasses of IndexReader are usually constructed with a call to one of the static open() methods, e.g. open(const char *).

For efficiency, in this API documents are often referred to via document numbers, non-negative integers which each name a unique document in the index. These document numbers are ephemeral--they may change as documents are added to and deleted from an index. Clients should thus not rely on a given document having the same number between sessions.

An IndexReader can be opened on a directory for which an IndexWriter is already open, but in that case it cannot be used to delete documents from the index.


Member Typedef Documentation

typedef void(* lucene::index::IndexReader::CloseCallback)(IndexReader *, void *)


Member Enumeration Documentation

enum lucene::index::IndexReader::FieldOption

Enumerator:
ALL 
INDEXED 
UNINDEXED 
INDEXED_WITH_TERMVECTOR 
INDEXED_NO_TERMVECTOR 
TERMVECTOR 
TERMVECTOR_WITH_POSITION 
TERMVECTOR_WITH_OFFSET 
TERMVECTOR_WITH_POSITION_OFFSET 
STORES_PAYLOADS 


Constructor & Destructor Documentation

lucene::index::IndexReader::IndexReader ( lucene::store::Directory *  dir  )  [protected]

Constructor used if IndexReader is not owner of its directory.

This is used for IndexReaders that are used within other IndexReaders that take care of locking directories.

Parameters:
dir Directory where IndexReader files reside.

lucene::index::IndexReader::IndexReader ( lucene::store::Directory *  directory,
SegmentInfos *  segmentInfos,
bool  closeDirectory 
) [protected]

Constructor used if IndexReader is owner of its directory.

If IndexReader is owner of its directory, it locks its directory in case of write operations.

Parameters:
directory Directory where IndexReader files reside.
segmentInfos Used for write-l
closeDirectory 

virtual lucene::index::IndexReader::~IndexReader (  )  [virtual]

Release the write lock, if needed.


Member Function Documentation

virtual void lucene::index::IndexReader::doClose (  )  [protected, pure virtual]

Implements close.

Implemented in lucene::index::MultiReader.

virtual void lucene::index::IndexReader::doSetNorm ( int32_t  doc,
const TCHAR *  field,
uint8_t  value 
) [protected, pure virtual]

Implements setNorm in subclass.

Implemented in lucene::index::MultiReader.

virtual void lucene::index::IndexReader::doUndeleteAll (  )  [protected, pure virtual]

Implements actual undeleteAll() in subclass.

Implemented in lucene::index::MultiReader.

virtual void lucene::index::IndexReader::doDelete ( const int32_t  docNum  )  [protected, pure virtual]

Implements deletion of the document numbered docNum.

Applications should call deleteDocument(int32_t) or deleteDocuments(Term*).

Implemented in lucene::index::MultiReader.

virtual void lucene::index::IndexReader::doCommit (  )  [pure virtual]

Internal use.

Implements commit.

Implemented in lucene::index::MultiReader.

virtual void lucene::index::IndexReader::commit (  )  [virtual]

Do not call this directly; it is public only so that MultiReader can access it.

void lucene::index::IndexReader::undeleteAll (  ) 

Undeletes all documents currently marked as deleted in this index.

virtual void lucene::index::IndexReader::getFieldNames ( FieldOption  fldOption,
StringArrayWithDeletor &  retarray 
) [pure virtual]

Get a list of unique field names that exist in this index and have the specified field option information.

Parameters:
fldOption specifies which field option should be available for the returned fields
retarray receives the names of the matching fields
See also:
IndexReader::FieldOption

Implemented in lucene::index::MultiReader.

virtual TCHAR** lucene::index::IndexReader::getFieldNames (  )  [virtual]

virtual TCHAR** lucene::index::IndexReader::getFieldNames ( bool  indexed  )  [virtual]

virtual uint8_t* lucene::index::IndexReader::norms ( const TCHAR *  field  )  [pure virtual]

Returns the byte-encoded normalization factor for the named field of every document.

This is used by the search code to score documents.

The number of bytes returned equals IndexReader::maxDoc().

Memory management:
The values are cached, so do not delete the returned byte array.

See also:
Field::setBoost(float_t)

Implemented in lucene::index::MultiReader.

virtual void lucene::index::IndexReader::norms ( const TCHAR *  field,
uint8_t *  bytes 
) [pure virtual]

Reads the byte-encoded normalization factor for the named field of every document.

This is used by the search code to score documents.

See also:
Field::setBoost(float_t)

Implemented in lucene::index::MultiReader.

void lucene::index::IndexReader::setNorm ( int32_t  doc,
const TCHAR *  field,
float_t  value 
)

Expert: Resets the normalization factor for the named field of the named document.

See also:
norms(TCHAR*)

Similarity::decodeNorm(uint8_t)

void lucene::index::IndexReader::setNorm ( int32_t  doc,
const TCHAR *  field,
uint8_t  value 
)

Expert: Resets the normalization factor for the named field of the named document.

The norm represents the product of the field's boost and its length normalization. Thus, to preserve the length normalization values when resetting this, one should base the new value upon the old.

See also:
norms(TCHAR*)

Similarity::decodeNorm(uint8_t)
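
The advice to base the new norm on the old one can be illustrated with plain arithmetic. The sketch below is not part of CLucene: rescaleNorm is a hypothetical helper, it works on decoded float norms rather than the stored byte, and it assumes the caller knows the boost that was in effect when the old norm was computed.

```cpp
#include <cassert>

// Hypothetical helper, not a CLucene function.
// Since norm = boost * lengthNorm, dividing out the old boost and
// multiplying in the new one keeps the length normalization unchanged.
float rescaleNorm(float oldNorm, float oldBoost, float newBoost) {
    assert(oldBoost != 0.0f);  // cannot recover lengthNorm from a zero boost
    return oldNorm * (newBoost / oldBoost);
}
```

Because the stored norm is a single byte, a value computed this way would still lose precision when it is encoded again.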

static IndexReader* lucene::index::IndexReader::open ( const char *  path  )  [static]

Returns an IndexReader reading the index in an FSDirectory in the named path.

static IndexReader* lucene::index::IndexReader::open ( lucene::store::Directory *  directory,
bool  closeDirectory = false 
) [static]

Returns an IndexReader reading the index in the given Directory.

static uint64_t lucene::index::IndexReader::lastModified ( const char *  directory  )  [static]

Returns the time the index in the named directory was last modified.

Do not use this to check whether the reader is still up-to-date; use isCurrent() instead.

static uint64_t lucene::index::IndexReader::lastModified ( const lucene::store::Directory *  directory  )  [static]

Returns the time the index in the named directory was last modified.

Do not use this to check whether the reader is still up-to-date; use isCurrent() instead.

static int64_t lucene::index::IndexReader::getCurrentVersion ( lucene::store::Directory *  directory  )  [static]

Reads version number from segments files.

The version number is initialized with a timestamp and then increased by one for each change of the index.

Parameters:
directory where the index resides.
Returns:
version number.
Exceptions:
IOException if segments file cannot be read

static int64_t lucene::index::IndexReader::getCurrentVersion ( const char *  directory  )  [static]

Reads version number from segments files.

The version number is initialized with a timestamp and then increased by one for each change of the index.

Parameters:
directory where the index resides.
Returns:
version number.
Exceptions:
IOException if segments file cannot be read

int64_t lucene::index::IndexReader::getVersion (  ) 

Version number when this IndexReader was opened.

bool lucene::index::IndexReader::isCurrent (  ) 

Check whether this IndexReader still works on a current version of the index.

If this is not the case you will need to re-open the IndexReader to make sure you see the latest changes made to the index.

Exceptions:
IOException 
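
How getVersion(), getCurrentVersion() and isCurrent() relate can be sketched with a toy model. VersionedIndex and VersionedReader below are hypothetical stand-ins, not CLucene classes; the sketch only assumes what is stated above, namely that the version starts at a timestamp and increases by one per change.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the stored index: the version starts at a
// timestamp and increases by one for each change.
struct VersionedIndex {
    int64_t version;
    explicit VersionedIndex(int64_t createdAt) : version(createdAt) {}
    void applyChange() { ++version; }
};

// Hypothetical stand-in for a reader: it remembers the version it saw
// when it was opened.
class VersionedReader {
public:
    explicit VersionedReader(const VersionedIndex& index)
        : index_(&index), openedVersion_(index.version) {}

    int64_t getVersion() const { return openedVersion_; }

    // True while the index has not changed since this reader was opened.
    bool isCurrent() const { return index_->version == openedVersion_; }

private:
    const VersionedIndex* index_;
    int64_t openedVersion_;
};
```

In this model a reader whose isCurrent() returns false keeps reporting its original version; only a newly opened reader sees the later one, which matches the advice to re-open.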

virtual bool lucene::index::IndexReader::getTermFreqVectors ( int32_t  docNumber,
lucene::util::Array< TermFreqVector * > &  result 
) [pure virtual]

Return an array of term frequency vectors for the specified document.

The array contains a vector for each vectorized field in the document. Each vector contains terms and frequencies for all terms in a given vectorized field. If no such fields existed, the method returns null. The term vectors that are returned may either be of type TermFreqVector or of type TermPositionsVector if positions or offsets have been stored.

Parameters:
docNumber document for which term frequency vectors are returned
Returns:
array of term frequency vectors. May be null if no term vectors have been stored for the specified document.
Exceptions:
IOException if index cannot be accessed
See also:
org.apache.lucene.document.Field.TermVector

Implemented in lucene::index::MultiReader.

virtual TermFreqVector* lucene::index::IndexReader::getTermFreqVector ( int32_t  docNumber,
const TCHAR *  field 
) [pure virtual]

Return a term frequency vector for the specified document and field.

The returned vector contains terms and frequencies for the terms in the specified field of this document, if the field had the storeTermVector flag set. If termvectors had been stored with positions or offsets, a TermPositionsVector is returned.

Parameters:
docNumber document for which the term frequency vector is returned
field field for which the term frequency vector is returned.
Returns:
term frequency vector. May be null if the field does not exist in the specified document or the term vector was not stored.
Exceptions:
IOException if index cannot be accessed
See also:
org.apache.lucene.document.Field.TermVector

Implemented in lucene::index::MultiReader.

static bool lucene::index::IndexReader::indexExists ( const char *  directory  )  [static]

Returns true if an index exists at the specified directory.

Returns false if the directory does not exist or if there is no index in it.

Parameters:
directory the directory to check for an index
Returns:
true if an index exists; false otherwise

static bool lucene::index::IndexReader::indexExists ( const lucene::store::Directory *  directory  )  [static]

Returns true if an index exists at the specified directory.

Returns false if the directory does not exist or if there is no index in it.

Parameters:
directory the directory to check for an index
Returns:
true if an index exists; false otherwise
Exceptions:
IOException if there is a problem with accessing the index

virtual int32_t lucene::index::IndexReader::numDocs (  )  [pure virtual]

Returns the number of documents in this index.

Implemented in lucene::index::MultiReader.

virtual int32_t lucene::index::IndexReader::maxDoc (  )  const [pure virtual]

Returns one greater than the largest possible document number.

This may be used to, e.g., determine how big to allocate an array which will have an element for every document number in an index.

Implemented in lucene::index::MultiReader.
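
The difference between numDocs() and maxDoc() can be sketched with a toy in-memory model. ToyReader below is a hypothetical stand-in, not CLucene's implementation; it assumes that a deleted document keeps its number until the index is rewritten, which is why per-document arrays are sized by maxDoc().

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an index reader: document numbers are the
// slots 0..maxDoc()-1, and a deleted document keeps its slot.
class ToyReader {
public:
    explicit ToyReader(int32_t slots) : deleted_(slots, false) {}

    int32_t maxDoc() const { return static_cast<int32_t>(deleted_.size()); }

    // Counts only the documents that are not marked as deleted.
    int32_t numDocs() const {
        int32_t live = 0;
        for (bool d : deleted_) {
            if (!d) ++live;
        }
        return live;
    }

    bool isDeleted(int32_t n) const {
        assert(n >= 0 && n < maxDoc());
        return deleted_[n];
    }

    void deleteDocument(int32_t n) {
        assert(n >= 0 && n < maxDoc());
        deleted_[n] = true;
    }

private:
    std::vector<bool> deleted_;
};

// Builds an array with one element per possible document number.
// It is sized by maxDoc() so that every document number is a valid index.
std::vector<float> perDocScores(const ToyReader& reader) {
    std::vector<float> scores(reader.maxDoc(), 0.0f);
    for (int32_t doc = 0; doc < reader.maxDoc(); ++doc) {
        if (reader.isDeleted(doc)) continue;  // deleted slots stay at 0
        scores[doc] = 1.0f;                   // placeholder score
    }
    return scores;
}
```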

virtual bool lucene::index::IndexReader::document ( int32_t  n,
lucene::document::Document *  
) [pure virtual]

Gets the stored fields of the nth Document in this index.

The fields are not cleared before retrieving the document, so the object should be new or just cleared.

Implemented in lucene::index::MultiReader.

lucene::document::Document* lucene::index::IndexReader::document ( const int32_t  n  ) 

virtual bool lucene::index::IndexReader::isDeleted ( const int32_t  n  )  [pure virtual]

Returns true if document n has been deleted.

Implemented in lucene::index::MultiReader.

virtual bool lucene::index::IndexReader::hasDeletions (  )  const [pure virtual]

Returns true if any documents have been deleted.

Implemented in lucene::index::MultiReader.

virtual bool lucene::index::IndexReader::hasNorms ( const TCHAR *  field  )  [virtual]

Returns true if there are norms stored for this field.

virtual TermEnum* lucene::index::IndexReader::terms (  )  const [pure virtual]

Returns an enumeration of all the terms in the index.

The enumeration is ordered by Term::compareTo(). Each term is greater than all that precede it in the enumeration.

Memory management:
Caller must clean up

Implemented in lucene::index::MultiReader.

virtual TermEnum* lucene::index::IndexReader::terms ( const Term *  t  )  const [pure virtual]

Returns an enumeration of all terms after a given term.

The enumeration is ordered by Term::compareTo(). Each term is greater than all that precede it in the enumeration.

Memory management:
Caller must clean up

Implemented in lucene::index::MultiReader.

virtual int32_t lucene::index::IndexReader::docFreq ( const Term *  t  )  const [pure virtual]

Returns the number of documents containing the term t.

Implemented in lucene::index::MultiReader.

virtual TermPositions* lucene::index::IndexReader::termPositions (  )  const [pure virtual]

Implemented in lucene::index::MultiReader.

TermPositions* lucene::index::IndexReader::termPositions ( Term *  term  )  const

Returns an enumeration of all the documents which contain term.

For each document, in addition to the document number and frequency of the term in that document, a list of all of the ordinal positions of the term in the document is available. Thus, this method implements the mapping:

   Term    =>    <docNum, freq, <pos_1, pos_2, ... pos_(freq-1)> >*

This positional information facilitates phrase and proximity searching.

The enumeration is ordered by document number. Each document number is greater than all that precede it in the enumeration.

Memory management:
Caller must clean up
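
The mapping from a term to its documents, frequencies and positions, and the way positions support phrase matching, can be sketched with a toy inverted index. Posting, ToyPostings, buildPostings and phraseInDoc are hypothetical names for this illustration only; the sketch splits text on whitespace and does not reflect how CLucene stores or enumerates postings.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// One entry of the mapping: document number plus the term's positions.
struct Posting {
    int32_t docNum;
    std::vector<int32_t> positions;  // ordinal positions of the term
    int32_t freq() const { return static_cast<int32_t>(positions.size()); }
};

// Term text => postings ordered by increasing document number.
using ToyPostings = std::map<std::string, std::vector<Posting>>;

// Splits each document on whitespace and records every term position.
// Documents are numbered by their index in docs.
ToyPostings buildPostings(const std::vector<std::string>& docs) {
    ToyPostings index;
    for (int32_t doc = 0; doc < static_cast<int32_t>(docs.size()); ++doc) {
        std::istringstream words(docs[doc]);
        std::string word;
        int32_t pos = 0;
        while (words >> word) {
            std::vector<Posting>& list = index[word];
            if (list.empty() || list.back().docNum != doc) {
                list.push_back(Posting{doc, {}});
            }
            list.back().positions.push_back(pos++);
        }
    }
    return index;
}

// The phrase "first second" matches in a document when second occurs at
// the position directly after an occurrence of first.
bool phraseInDoc(const ToyPostings& index, const std::string& first,
                 const std::string& second, int32_t doc) {
    auto a = index.find(first);
    auto b = index.find(second);
    if (a == index.end() || b == index.end()) return false;
    for (const Posting& pa : a->second) {
        if (pa.docNum != doc) continue;
        for (const Posting& pb : b->second) {
            if (pb.docNum != doc) continue;
            for (int32_t p : pa.positions)
                for (int32_t q : pb.positions)
                    if (q == p + 1) return true;
        }
    }
    return false;
}
```

Without the position lists, the index could only say that both words occur in a document, not that they occur next to each other.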

virtual TermDocs* lucene::index::IndexReader::termDocs (  )  const [pure virtual]

Returns an unpositioned TermDocs enumerator.

Memory management:
Caller must clean up

Implemented in lucene::index::MultiReader.

TermDocs* lucene::index::IndexReader::termDocs ( Term *  term  )  const

Returns an enumeration of all the documents which contain term.

For each document, the document number and the frequency of the term in that document are provided, for use in search scoring. Thus, this method implements the mapping:

   Term    =>    <docNum, freq>*

The enumeration is ordered by document number. Each document number is greater than all that precede it in the enumeration.

Memory management:
Caller must clean up

void lucene::index::IndexReader::deleteDocument ( const int32_t  docNum  ) 

Deletes the document numbered docNum.

Once a document is deleted it will not appear in TermDocs or TermPositions enumerations. Attempts to read its fields with the document() method will result in an error. The presence of this document may still be reflected in the docFreq statistic, though this will be corrected eventually as the index is further modified.

void lucene::index::IndexReader::deleteDoc ( const int32_t  docNum  )  [inline]

Deprecated:
Use deleteDocument instead.

int32_t lucene::index::IndexReader::deleteDocuments ( Term *  term  ) 

Deletes all documents containing term.

This is useful if one uses a document field to hold a unique ID string for the document. Then to delete such a document, one merely constructs a term with the appropriate field and the unique ID string as its text and passes it to this method. See deleteDocument(int32_t) for information about when this deletion will become effective.

Returns:
the number of documents deleted
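
The delete-by-unique-ID pattern described above can be sketched with a toy model. ToyIdIndex is a hypothetical stand-in in which each document carries a single ID string; it is not CLucene's implementation, and it assumes that documents already marked as deleted are not counted again.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for an index whose documents carry one ID field.
class ToyIdIndex {
public:
    // Adds a document and returns its document number.
    int32_t add(const std::string& id) {
        ids_.push_back(id);
        deleted_.push_back(false);
        return static_cast<int32_t>(ids_.size()) - 1;
    }

    // Marks every live document whose ID equals idText as deleted and
    // returns how many documents were marked.
    int32_t deleteDocuments(const std::string& idText) {
        int32_t count = 0;
        for (std::size_t i = 0; i < ids_.size(); ++i) {
            if (!deleted_[i] && ids_[i] == idText) {
                deleted_[i] = true;
                ++count;
            }
        }
        return count;
    }

    bool isDeleted(int32_t n) const {
        assert(n >= 0 && n < static_cast<int32_t>(ids_.size()));
        return deleted_[n];
    }

private:
    std::vector<std::string> ids_;
    std::vector<bool> deleted_;
};
```

If the ID is truly unique the return value is 0 or 1, so a larger count can serve as a hint that duplicates were indexed.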

int32_t lucene::index::IndexReader::deleteTerm ( Term *  term  )  [inline]

Deprecated:
Use deleteDocuments instead.

void lucene::index::IndexReader::close (  ) 

Closes files associated with this index and also saves any new deletions to disk.

No other methods should be called after this has been called.

static bool lucene::index::IndexReader::isLocked ( lucene::store::Directory *  directory  )  [static]

Checks if the index in the named directory is currently locked.

static bool lucene::index::IndexReader::isLocked ( const char *  directory  )  [static]

Checks if the index in the named directory is currently locked.

static void lucene::index::IndexReader::unlock ( lucene::store::Directory *  directory  )  [static]

Forcibly unlocks the index in the named directory.

Caution: this should only be used by failure recovery code, when it is known that no other process nor thread is in fact currently accessing this index.

static void lucene::index::IndexReader::unlock ( const char *  path  )  [static]

lucene::store::Directory* lucene::index::IndexReader::getDirectory (  ) 

Returns the directory this index resides in.

static bool lucene::index::IndexReader::isLuceneFile ( const char *  filename  )  [static]

Returns true if the file is a lucene filename (based on extension or filename).

void lucene::index::IndexReader::addCloseCallback ( CloseCallback  callback,
void *  parameter 
)

Classes that need to know when the IndexReader closes (such as caches) should pass their callback function to this method.
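
The callback shape (a function pointer plus an opaque void * parameter) can be sketched with stand-in types. ToyClosable, ToyCloseCallback and ToyCache are hypothetical names; the sketch assumes callbacks fire once from close(), which this page does not specify for CLucene.

```cpp
#include <cassert>
#include <utility>
#include <vector>

class ToyClosable;  // hypothetical stand-in for IndexReader

// Same shape as the CloseCallback typedef: reader pointer plus an
// opaque user parameter.
typedef void (*ToyCloseCallback)(ToyClosable*, void*);

class ToyClosable {
public:
    void addCloseCallback(ToyCloseCallback callback, void* parameter) {
        callbacks_.push_back(std::make_pair(callback, parameter));
    }

    // Notifies every registered callback once, then forgets them.
    void close() {
        for (const auto& entry : callbacks_) {
            entry.first(this, entry.second);
        }
        callbacks_.clear();
    }

private:
    std::vector<std::pair<ToyCloseCallback, void*>> callbacks_;
};

// Hypothetical cache that must drop its entries when the reader closes.
struct ToyCache {
    int entries;

    // A static member function converts to a plain function pointer;
    // the cache instance travels through the void* parameter.
    static void onReaderClosed(ToyClosable* /*reader*/, void* parameter) {
        static_cast<ToyCache*>(parameter)->entries = 0;
    }
};
```

Passing the object through the parameter is what lets a plain function pointer reach per-instance state without globals.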


Field Documentation

Internal* lucene::index::IndexReader::internal

Internal use.


The documentation for this class was generated from the following file:

clucene.sourceforge.net