org.apache.lucene.analysis
public abstract class Analyzer extends java.lang.Object
Typical implementations first build a Tokenizer, which breaks the stream of characters from the Reader into raw Tokens. One or more TokenFilters may then be applied to the output of the Tokenizer.
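The tokenizer-then-filters arrangement described above can be sketched in plain Java. This is only a model of the pattern, not Lucene code: `AnalysisChainSketch`, `tokenize`, `lowerCase`, `removeStopWords`, and `analyze` are hypothetical names, and whitespace splitting and the stop-word list are assumptions chosen for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Plain-Java model of the pattern; none of these names are Lucene classes.
final class AnalysisChainSketch {

    // "Tokenizer" stage: read all characters from the Reader and split them
    // into raw tokens on whitespace (an assumed, simplistic rule).
    static List<String> tokenize(Reader reader) throws IOException {
        StringBuilder text = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = reader.read(buf)) != -1) {
            text.append(buf, 0, n);
        }
        List<String> tokens = new ArrayList<>();
        for (String t : text.toString().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // First "TokenFilter" stage: normalize each token to lower case.
    static List<String> lowerCase(List<String> in) {
        List<String> out = new ArrayList<>();
        for (String t : in) {
            out.add(t.toLowerCase(Locale.ROOT));
        }
        return out;
    }

    // Second "TokenFilter" stage: drop tokens found in the stop-word set.
    static List<String> removeStopWords(List<String> in, Set<String> stopWords) {
        List<String> out = new ArrayList<>();
        for (String t : in) {
            if (!stopWords.contains(t)) {
                out.add(t);
            }
        }
        return out;
    }

    // The "Analyzer": one tokenizer followed by the filters, applied in order.
    static List<String> analyze(String text) {
        Set<String> stopWords = new HashSet<>(Arrays.asList("the", "a", "an"));
        try {
            return removeStopWords(lowerCase(tokenize(new StringReader(text))), stopWords);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this model, `AnalysisChainSketch.analyze("The Quick brown FOX")` yields the tokens `quick`, `brown`, `fox`. The order of the filters matters: lower-casing runs first so that the stop-word check sees `the` rather than `The`.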
Modifier and Type | Field and Description |
---|---|
protected boolean | overridesTokenStreamMethod |
Constructor and Description |
---|
Analyzer() |
Modifier and Type | Method and Description |
---|---|
void | close() - Frees persistent resources used by this Analyzer. |
int | getOffsetGap(Fieldable field) - Just like getPositionIncrementGap(java.lang.String), except for Token offsets instead. |
int | getPositionIncrementGap(java.lang.String fieldName) - Invoked before indexing a Fieldable instance if terms have already been added to that field. |
protected java.lang.Object | getPreviousTokenStream() - Used by Analyzers that implement reusableTokenStream to retrieve previously saved TokenStreams for re-use by the same thread. |
TokenStream | reusableTokenStream(java.lang.String fieldName, java.io.Reader reader) - Creates a TokenStream that is allowed to be re-used from the previous time that the same thread called this method. |
protected void | setOverridesTokenStreamMethod(java.lang.Class baseClass) - Deprecated. This is only present to preserve back-compat of classes that subclass a core analyzer and override tokenStream but not reusableTokenStream. |
protected void | setPreviousTokenStream(java.lang.Object obj) - Used by Analyzers that implement reusableTokenStream to save a TokenStream for later re-use by the same thread. |
abstract TokenStream | tokenStream(java.lang.String fieldName, java.io.Reader reader) - Creates a TokenStream which tokenizes all the text in the provided Reader. |
public abstract TokenStream tokenStream(java.lang.String fieldName, java.io.Reader reader)
public TokenStream reusableTokenStream(java.lang.String fieldName, java.io.Reader reader) throws java.io.IOException
Throws: java.io.IOException
protected java.lang.Object getPreviousTokenStream()
protected void setPreviousTokenStream(java.lang.Object obj)
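The save-and-retrieve pair is meant for re-use "by the same thread", which suggests per-thread storage. The following plain-Java sketch models that idea with `java.lang.ThreadLocal`. It is an assumption about the pattern, not the library's implementation: `PerThreadReuseSketch`, `reusableBuffer`, and `createdCount` are hypothetical names, and a `StringBuilder` stands in for the saved TokenStream.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of per-thread reuse; not the Lucene implementation.
final class PerThreadReuseSketch {

    // One saved object per thread, playing the role of the
    // getPreviousTokenStream / setPreviousTokenStream pair.
    private final ThreadLocal<StringBuilder> previous = new ThreadLocal<>();

    // Counts how many objects were actually constructed.
    private final AtomicInteger created = new AtomicInteger();

    // Analogue of reusableTokenStream: build on the first call from a thread,
    // then reset and hand back the same object on later calls from that thread.
    StringBuilder reusableBuffer(String text) {
        StringBuilder buf = previous.get();   // "get previous"
        if (buf == null) {
            buf = new StringBuilder();
            created.incrementAndGet();
            previous.set(buf);                // "set previous"
        } else {
            buf.setLength(0);                 // reset state before re-use
        }
        buf.append(text);
        return buf;
    }

    int createdCount() {
        return created.get();
    }
}
```

Because the returned object is re-used, a caller in this model must finish with one result before asking for the next from the same thread; the second call overwrites the contents of the first.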
protected void setOverridesTokenStreamMethod(java.lang.Class baseClass)
public int getPositionIncrementGap(java.lang.String fieldName)
Parameters: fieldName - Fieldable name being indexed.
Returns: position increment gap, added to the next token emitted from tokenStream(String,Reader)
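To make the effect of the gap concrete, here is a small plain-Java sketch of how token positions might be assigned when a field has several values. It is a simplified model under stated assumptions, not the indexer's code: `PositionGapSketch` and `positions` are hypothetical names, every token is assumed to carry a position increment of 1, and the gap is applied only when terms have already been added to the field.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of position assignment across field values.
final class PositionGapSketch {

    // values: the tokens of each value of one field, in indexing order.
    // gap:    what getPositionIncrementGap would return for that field.
    static List<Integer> positions(List<List<String>> values, int gap) {
        List<Integer> out = new ArrayList<>();
        int position = -1;           // so the first token lands at position 0
        boolean termsAdded = false;  // the gap applies only after earlier terms
        for (List<String> tokens : values) {
            if (termsAdded) {
                position += gap;
            }
            for (int i = 0; i < tokens.size(); i++) {
                position += 1;       // assumed position increment of 1 per token
                out.add(position);
                termsAdded = true;
            }
        }
        return out;
    }
}
```

With the values `[a, b]` and `[c]`, a gap of 0 gives positions 0, 1, 2 in this model, while a gap of 100 gives 0, 1, 102, so the last token of one value and the first token of the next are no longer adjacent.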
public int getOffsetGap(Fieldable field)
Just like getPositionIncrementGap(java.lang.String), except for Token offsets instead. By default this returns 1 for tokenized fields, as if the fields were joined with an extra space character, and 0 for un-tokenized fields. This method is only called if the field produced at least one token for indexing.
Parameters: field - the field just indexed
Returns: offset gap, added to the next token emitted from tokenStream(String,Reader)
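The "joined with an extra space character" description can be checked with a little arithmetic. The sketch below is a plain-Java model, not the indexer's code: `OffsetGapSketch` and `valueBaseOffsets` are hypothetical names, and it assumes every value produces at least one token and that each value's final offset equals its character length.

```java
// Hypothetical model of how an offset gap shifts later values of a field.
final class OffsetGapSketch {

    // Returns, for each value, the offset at which its characters start
    // in the field's running offset space.
    static int[] valueBaseOffsets(String[] values, int gap) {
        int[] bases = new int[values.length];
        int offset = 0;
        for (int i = 0; i < values.length; i++) {
            bases[i] = offset;
            // Advance past this value, then add the gap before the next one.
            offset += values[i].length() + gap;
        }
        return bases;
    }
}
```

For the values `"foo bar"` and `"baz"` with a gap of 1, the second value starts at offset 8 in this model, which is where `baz` begins in the joined string `"foo bar baz"`. With a gap of 0 it would start at offset 7, directly after the first value.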
public void close()
Copyright © 2000-2014 Apache Software Foundation. All Rights Reserved.