CLucene - a full-featured, C++ search engine
API Documentation


lucene::analysis::LowerCaseTokenizer Class Reference

LowerCaseTokenizer performs the function of LetterTokenizer and LowerCaseFilter together.

#include <Analyzers.h>

Inheritance diagram for lucene::analysis::LowerCaseTokenizer:

Base classes, nearest first: lucene::analysis::LetterTokenizer, lucene::analysis::CharTokenizer, lucene::analysis::Tokenizer, lucene::analysis::TokenStream

Public Member Functions

 LowerCaseTokenizer (lucene::util::Reader *in)
 Construct a new LowerCaseTokenizer.
virtual ~LowerCaseTokenizer ()

Protected Member Functions

TCHAR normalize (const TCHAR chr) const
 Converts chr to lower case using _totlower.

Detailed Description

LowerCaseTokenizer performs the function of LetterTokenizer and LowerCaseFilter together.

It divides text at non-letters and converts the resulting tokens to lower case. While it is functionally equivalent to the combination of LetterTokenizer and LowerCaseFilter, doing both tasks in a single pass is faster, hence this (redundant) implementation.

Note: this does a decent job for most European languages, but a poor job for some Asian languages, where words are not separated by spaces.
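
The equivalence described above can be illustrated with a small standalone sketch. It does not use CLucene at all: the function names lowerCaseTokenize and splitThenLowerCase are hypothetical, and plain char with std::isalpha and std::tolower stands in for TCHAR and _totlower. The first function splits and lower-cases in one pass; the second does the same work in two passes, as a tokenizer followed by a filter would.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper (not a CLucene function): splits the text at
// non-letters and lower-cases each letter in the same pass.
std::vector<std::string> lowerCaseTokenize(const std::string& text) {
    std::vector<std::string> tokens;
    std::string current;
    for (unsigned char ch : text) {
        if (std::isalpha(ch)) {
            current.push_back(static_cast<char>(std::tolower(ch)));
        } else if (!current.empty()) {
            tokens.push_back(current);  // a non-letter ends the token
            current.clear();
        }
    }
    if (!current.empty()) tokens.push_back(current);
    return tokens;
}

// Hypothetical two-pass equivalent: split first, then lower-case.
std::vector<std::string> splitThenLowerCase(const std::string& text) {
    // Pass 1: split at non-letters, keeping the original case.
    std::vector<std::string> tokens;
    std::string current;
    for (unsigned char ch : text) {
        if (std::isalpha(ch)) {
            current.push_back(static_cast<char>(ch));
        } else if (!current.empty()) {
            tokens.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) tokens.push_back(current);
    // Pass 2: lower-case every character of every token.
    for (std::string& token : tokens) {
        for (char& c : token) {
            c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        }
    }
    return tokens;
}
```

Both functions return the same tokens for the same input; the single-pass version simply avoids revisiting each token, which is the performance argument made above.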


Constructor & Destructor Documentation

lucene::analysis::LowerCaseTokenizer::LowerCaseTokenizer ( lucene::util::Reader * in )

Construct a new LowerCaseTokenizer.

virtual lucene::analysis::LowerCaseTokenizer::~LowerCaseTokenizer (  )  [virtual]


Member Function Documentation

TCHAR lucene::analysis::LowerCaseTokenizer::normalize ( const TCHAR  chr  )  const [protected, virtual]

Converts chr to lower case using _totlower.

Reimplemented from lucene::analysis::CharTokenizer.
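
As a rough illustration of how a reimplemented normalize hook can change a character tokenizer's output, here is a self-contained sketch. The classes SketchCharTokenizer, SketchLetterTokenizer and SketchLowerCaseTokenizer are hypothetical stand-ins, not the CLucene classes; the isTokenChar hook and the tokenize method are assumptions made for the example, and char replaces TCHAR.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical stand-in for a character-based tokenizer base class.
class SketchCharTokenizer {
public:
    virtual ~SketchCharTokenizer() = default;

    // Collects runs of token characters, passing each one through
    // normalize() before it is stored.
    std::vector<std::string> tokenize(const std::string& text) const {
        std::vector<std::string> tokens;
        std::string current;
        for (char ch : text) {
            if (isTokenChar(ch)) {
                current.push_back(normalize(ch));
            } else if (!current.empty()) {
                tokens.push_back(current);
                current.clear();
            }
        }
        if (!current.empty()) tokens.push_back(current);
        return tokens;
    }

protected:
    // Assumed hook: decides which characters belong to a token.
    virtual bool isTokenChar(char ch) const = 0;
    // Default normalization leaves the character unchanged.
    virtual char normalize(char ch) const { return ch; }
};

// Keeps letters only, in their original case.
class SketchLetterTokenizer : public SketchCharTokenizer {
protected:
    bool isTokenChar(char ch) const override {
        return std::isalpha(static_cast<unsigned char>(ch)) != 0;
    }
};

// Same splitting rule, but normalize() lower-cases each character.
class SketchLowerCaseTokenizer : public SketchLetterTokenizer {
protected:
    char normalize(char ch) const override {
        return static_cast<char>(std::tolower(static_cast<unsigned char>(ch)));
    }
};
```

In this sketch the derived class changes only the per-character conversion; the splitting rule is inherited unchanged, which mirrors the relationship to LetterTokenizer described on this page.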


The documentation for this class was generated from the following file:

clucene.sourceforge.net