Package | Description |
---|---|
org.apache.lucene.analysis | API and code to convert text into indexable/searchable tokens. |
org.apache.lucene.analysis.standard | A fast grammar-based tokenizer constructed with JFlex. |
org.apache.lucene.document | The logical representation of a Document for indexing and searching. |
Modifier and Type | Class and Description |
---|---|
class | ASCIIFoldingFilter: This class converts alphabetic, numeric, and symbolic Unicode characters that are not in the first 127 ASCII characters (the "Basic Latin" Unicode block) into their ASCII equivalents, if they exist. |
class | CachingTokenFilter: This class can be used if the token attributes of a TokenStream are intended to be consumed more than once. |
class | CharTokenizer: An abstract base class for simple, character-oriented tokenizers. |
class | ISOLatin1AccentFilter: Deprecated. Use ASCIIFoldingFilter instead, which covers a superset of Latin 1. This class will be removed in Lucene 3.0. |
class | KeywordTokenizer: Emits the entire input as a single token. |
class | LengthFilter: Removes words that are too long or too short from the stream. |
class | LetterTokenizer: A LetterTokenizer is a tokenizer that divides text at non-letters. |
class | LowerCaseFilter: Normalizes token text to lower case. |
class | LowerCaseTokenizer: LowerCaseTokenizer performs the function of LetterTokenizer and LowerCaseFilter together. |
class | NumericTokenStream: Expert: This class provides a TokenStream for indexing numeric values that can be used by NumericRangeQuery or NumericRangeFilter. |
class | PorterStemFilter: Transforms the token stream as per the Porter stemming algorithm. |
class | SinkTokenizer: Deprecated. Use TeeSinkTokenFilter instead. |
class | StopFilter: Removes stop words from a token stream. |
class | TeeSinkTokenFilter: This TokenFilter provides the ability to set aside attribute states that have already been analyzed. |
static class | TeeSinkTokenFilter.SinkTokenStream |
class | TeeTokenFilter: Deprecated. Use TeeSinkTokenFilter instead. |
class | TokenFilter: A TokenFilter is a TokenStream whose input is another TokenStream. |
class | Tokenizer: A Tokenizer is a TokenStream whose input is a Reader. |
class | WhitespaceTokenizer: A WhitespaceTokenizer is a tokenizer that divides text at whitespace. |
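
The Tokenizer and TokenFilter rows above describe a chain: a source stage reads from a Reader, and each filter wraps another stream. The following is a minimal sketch of that wrapping idea using plain-Java stand-ins so that it runs without the Lucene jar. The names SketchStream, SketchWhitespaceTokenizer, SketchLowerCaseFilter, SketchLengthFilter and FilterChainSketch are hypothetical and are not Lucene classes; real TokenStreams expose attributes rather than returning strings.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for a token stream: returns the next token, or null at the end.
interface SketchStream {
    String next() throws IOException;
}

// Source stage (plays the role of a Tokenizer): input is a Reader, tokens are split at whitespace.
final class SketchWhitespaceTokenizer implements SketchStream {
    private final Reader input;

    SketchWhitespaceTokenizer(Reader input) {
        this.input = input;
    }

    @Override
    public String next() throws IOException {
        StringBuilder token = new StringBuilder();
        int c;
        while ((c = input.read()) != -1) {
            if (Character.isWhitespace(c)) {
                if (token.length() > 0) {
                    break; // whitespace after some text ends the token
                }
            } else {
                token.append((char) c);
            }
        }
        return token.length() > 0 ? token.toString() : null;
    }
}

// Filter stage (plays the role of a TokenFilter): input is another stream.
final class SketchLowerCaseFilter implements SketchStream {
    private final SketchStream input;

    SketchLowerCaseFilter(SketchStream input) {
        this.input = input;
    }

    @Override
    public String next() throws IOException {
        String token = input.next();
        return token == null ? null : token.toLowerCase(Locale.ROOT);
    }
}

// Filter stage that drops tokens shorter than min or longer than max.
final class SketchLengthFilter implements SketchStream {
    private final SketchStream input;
    private final int min;
    private final int max;

    SketchLengthFilter(SketchStream input, int min, int max) {
        this.input = input;
        this.min = min;
        this.max = max;
    }

    @Override
    public String next() throws IOException {
        String token;
        while ((token = input.next()) != null) {
            if (token.length() >= min && token.length() <= max) {
                return token;
            }
        }
        return null;
    }
}

final class FilterChainSketch {
    // Builds tokenizer -> lower-case filter -> length filter and drains the chain.
    static List<String> analyze(String text, int min, int max) {
        SketchStream stream = new SketchLengthFilter(
                new SketchLowerCaseFilter(
                        new SketchWhitespaceTokenizer(new StringReader(text))), min, max);
        List<String> tokens = new ArrayList<>();
        try {
            for (String t = stream.next(); t != null; t = stream.next()) {
                tokens.add(t);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(analyze("The Quick brown FOX a", 2, 5)); // prints [the, quick, brown, fox]
    }
}
```

Because every stage implements the same interface, filters can be stacked in any order, which is why the constructors listed on this page all accept a TokenStream argument.
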
Modifier and Type | Field and Description |
---|---|
protected TokenStream | TokenFilter.input: The source of tokens for this filter. |
Modifier and Type | Method and Description |
---|---|
TokenStream | SimpleAnalyzer.reusableTokenStream(java.lang.String fieldName, java.io.Reader reader) |
TokenStream | PerFieldAnalyzerWrapper.reusableTokenStream(java.lang.String fieldName, java.io.Reader reader) |
TokenStream | StopAnalyzer.reusableTokenStream(java.lang.String fieldName, java.io.Reader reader) |
TokenStream | Analyzer.reusableTokenStream(java.lang.String fieldName, java.io.Reader reader): Creates a TokenStream that is allowed to be re-used from the previous time that the same thread called this method. |
TokenStream | WhitespaceAnalyzer.reusableTokenStream(java.lang.String fieldName, java.io.Reader reader) |
TokenStream | KeywordAnalyzer.reusableTokenStream(java.lang.String fieldName, java.io.Reader reader) |
TokenStream | SimpleAnalyzer.tokenStream(java.lang.String fieldName, java.io.Reader reader) |
TokenStream | PerFieldAnalyzerWrapper.tokenStream(java.lang.String fieldName, java.io.Reader reader) |
TokenStream | StopAnalyzer.tokenStream(java.lang.String fieldName, java.io.Reader reader): Filters LowerCaseTokenizer with StopFilter. |
abstract TokenStream | Analyzer.tokenStream(java.lang.String fieldName, java.io.Reader reader): Creates a TokenStream which tokenizes all the text in the provided Reader. |
TokenStream | WhitespaceAnalyzer.tokenStream(java.lang.String fieldName, java.io.Reader reader) |
TokenStream | KeywordAnalyzer.tokenStream(java.lang.String fieldName, java.io.Reader reader) |
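
The description of Analyzer.reusableTokenStream above says the returned stream may be re-used from the previous call made by the same thread. One way to picture that contract is a per-thread cache that is reset to the new Reader instead of being rebuilt. The sketch below uses plain-Java stand-ins; ResettableSource and ReuseSketchAnalyzer are hypothetical names, and the ThreadLocal is an assumption about how such reuse could be done, not a statement of how Lucene implements it.

```java
import java.io.Reader;
import java.io.StringReader;

// Hypothetical stand-in for a tokenizer that can be pointed at a new Reader.
final class ResettableSource {
    private Reader input;

    ResettableSource(Reader input) {
        this.input = input;
    }

    void reset(Reader input) {
        this.input = input;
    }

    Reader input() {
        return input;
    }
}

// Hypothetical analyzer stand-in showing the two method contracts side by side.
final class ReuseSketchAnalyzer {
    // One saved stream per calling thread (an assumption made for this sketch).
    private final ThreadLocal<ResettableSource> previous = new ThreadLocal<>();

    // Always builds a new stream for the given reader.
    ResettableSource tokenStream(String fieldName, Reader reader) {
        return new ResettableSource(reader);
    }

    // Returns the stream saved by this thread's previous call, reset to the new reader.
    ResettableSource reusableTokenStream(String fieldName, Reader reader) {
        ResettableSource saved = previous.get();
        if (saved == null) {
            saved = tokenStream(fieldName, reader);
            previous.set(saved);
        } else {
            saved.reset(reader);
        }
        return saved;
    }

    public static void main(String[] args) {
        ReuseSketchAnalyzer analyzer = new ReuseSketchAnalyzer();
        ResettableSource first = analyzer.reusableTokenStream("body", new StringReader("one"));
        ResettableSource second = analyzer.reusableTokenStream("body", new StringReader("two"));
        System.out.println(first == second); // prints true: same thread, same instance
    }
}
```

The practical consequence for callers is that a stream obtained this way should be fully consumed before the same thread asks for another one, since the second call takes over the same object.
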
Constructor and Description |
---|
ASCIIFoldingFilter(TokenStream input) |
CachingTokenFilter(TokenStream input) |
ISOLatin1AccentFilter(TokenStream input): Deprecated. |
LengthFilter(TokenStream in, int min, int max): Build a filter that removes words that are too long or too short from the text. |
LowerCaseFilter(TokenStream in) |
PorterStemFilter(TokenStream in) |
StopFilter(boolean enablePositionIncrements, TokenStream in, java.util.Set stopWords): Constructs a filter which removes words from the input TokenStream that are named in the Set. |
StopFilter(boolean enablePositionIncrements, TokenStream input, java.util.Set stopWords, boolean ignoreCase): Construct a token stream filtering the given input. |
StopFilter(boolean enablePositionIncrements, TokenStream input, java.lang.String[] stopWords): Deprecated. Use StopFilter.StopFilter(boolean, TokenStream, Set) instead. |
StopFilter(boolean enablePositionIncrements, TokenStream in, java.lang.String[] stopWords, boolean ignoreCase): Deprecated. |
StopFilter(TokenStream in, java.util.Set stopWords): Deprecated. Use StopFilter.StopFilter(boolean, TokenStream, Set) instead. |
StopFilter(TokenStream input, java.util.Set stopWords, boolean ignoreCase): Deprecated. |
StopFilter(TokenStream input, java.lang.String[] stopWords): Deprecated. |
StopFilter(TokenStream in, java.lang.String[] stopWords, boolean ignoreCase): Deprecated. |
TeeSinkTokenFilter(TokenStream input): Instantiates a new TeeSinkTokenFilter. |
TeeTokenFilter(TokenStream input, SinkTokenizer sink): Deprecated. |
TokenFilter(TokenStream input): Construct a token stream filtering the given input. |
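
The non-deprecated StopFilter constructors above take an enablePositionIncrements flag and, in one overload, an ignoreCase flag. The sketch below illustrates what those two flags mean, using a plain-Java stand-in that runs without the Lucene jar. StopFilterSketch and its "term+increment" output format are hypothetical, and the sketch assumes the stop set already holds lower-case entries when ignoreCase is true.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in illustrating stop-word removal; not Lucene's StopFilter.
final class StopFilterSketch {

    // Returns surviving tokens as "term+positionIncrement".
    static List<String> filter(List<String> tokens, Set<String> stopWords,
                               boolean enablePositionIncrements, boolean ignoreCase) {
        List<String> kept = new ArrayList<>();
        int skipped = 0; // stop words removed since the last kept token
        for (String token : tokens) {
            // Assumption: stopWords holds lower-case entries when ignoreCase is true.
            String key = ignoreCase ? token.toLowerCase(Locale.ROOT) : token;
            if (stopWords.contains(key)) {
                skipped++;
                continue;
            }
            // With the flag on, the gap left by removed words is added to the increment.
            int increment = enablePositionIncrements ? 1 + skipped : 1;
            kept.add(token + "+" + increment);
            skipped = 0;
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("the", "quick", "and", "The", "fox");
        Set<String> stopWords = new HashSet<>(Arrays.asList("the", "and"));
        System.out.println(filter(tokens, stopWords, true, true)); // prints [quick+2, fox+3]
    }
}
```

Keeping the gaps matters mostly for phrase and span queries, where the distance between terms is part of the match; with the flag off, "quick" and "fox" would look adjacent.
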
Modifier and Type | Class and Description |
---|---|
class | StandardFilter: Normalizes tokens extracted with StandardTokenizer. |
class | StandardTokenizer: A grammar-based tokenizer constructed with JFlex. |
Modifier and Type | Method and Description |
---|---|
TokenStream | StandardAnalyzer.reusableTokenStream(java.lang.String fieldName, java.io.Reader reader): Deprecated. |
TokenStream | StandardAnalyzer.tokenStream(java.lang.String fieldName, java.io.Reader reader) |
Constructor and Description |
---|
StandardFilter(TokenStream in): Constructs a filter over the input stream in. |
Modifier and Type | Field and Description |
---|---|
protected TokenStream | AbstractField.tokenStream |
Modifier and Type | Method and Description |
---|---|
TokenStream | Field.tokenStreamValue(): The TokenStream for this field to be used when indexing, or null. |
TokenStream | Fieldable.tokenStreamValue(): The TokenStream for this field to be used when indexing, or null. |
TokenStream | NumericField.tokenStreamValue(): Returns a NumericTokenStream for indexing the numeric value. |
Modifier and Type | Method and Description |
---|---|
void | Field.setTokenStream(TokenStream tokenStream): Expert: sets the token stream to be used for indexing and causes isIndexed() and isTokenized() to return true. |
void | Field.setValue(TokenStream value): Deprecated. |
Constructor and Description |
---|
Field(java.lang.String name, TokenStream tokenStream): Create a tokenized and indexed field that is not stored. |
Field(java.lang.String name, TokenStream tokenStream, Field.TermVector termVector): Create a tokenized and indexed field that is not stored, optionally storing term vectors. |
Copyright © 2000-2014 Apache Software Foundation. All Rights Reserved.