CLucene - a full-featured, c++ search engine
API Documentation


lucene::analysis::CharTokenizer Class Reference

An abstract base class for simple, character-oriented tokenizers. More...

#include <Analyzers.h>

Inheritance diagram for lucene::analysis::CharTokenizer:

lucene::analysis::Tokenizer lucene::analysis::TokenStream lucene::analysis::LetterTokenizer lucene::analysis::WhitespaceTokenizer lucene::analysis::LowerCaseTokenizer

Public Member Functions

 CharTokenizer (lucene::util::Reader *in)
virtual ~CharTokenizer ()
bool next (Token *token)
 Sets token to the next token in the stream, returns false at the EOS.

Protected Member Functions

virtual bool isTokenChar (const TCHAR c) const =0
 Returns true iff a character should be included in a token.
virtual TCHAR normalize (const TCHAR c) const
 Called on each token character to normalize it before it is added to the token.

Detailed Description

An abstract base class for simple, character-oriented tokenizers.


Constructor & Destructor Documentation

lucene::analysis::CharTokenizer::CharTokenizer ( lucene::util::Reader in  ) 

virtual lucene::analysis::CharTokenizer::~CharTokenizer (  )  [virtual]


Member Function Documentation

virtual bool lucene::analysis::CharTokenizer::isTokenChar ( const TCHAR  c  )  const [protected, pure virtual]

Returns true iff a character should be included in a token.

This tokenizer generates as tokens adjacent sequences of characters which satisfy this predicate. Characters for which this is false are used to define token boundaries and are not included in tokens.

Implemented in lucene::analysis::LetterTokenizer, and lucene::analysis::WhitespaceTokenizer.

virtual TCHAR lucene::analysis::CharTokenizer::normalize ( const TCHAR  c  )  const [protected, virtual]

Called on each token character to normalize it before it is added to the token.

The default implementation does nothing. Subclasses may use this to, e.g., lowercase tokens.

Reimplemented in lucene::analysis::LowerCaseTokenizer.

bool lucene::analysis::CharTokenizer::next ( Token token  )  [virtual]

Sets token to the next token in the stream, returns false at the EOS.

Implements lucene::analysis::TokenStream.


The documentation for this class was generated from the following file:

clucene.sourceforge.net