Implementations of the SinkTokenizer that might be useful.
- packValues(String) - Method in class org.apache.lucene.analysis.compound.hyphenation.HyphenationTree
-
Packs the values by storing them in 4 bits, two values per byte. The value
range is from 0 to 9.
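The nibble packing described above can be sketched in plain Java. This is only an illustration, not the HyphenationTree implementation: the class `NibblePackSketch`, its `pack` and `unpack` methods, and the high-nibble-first layout are assumptions made for the example.

```java
final class NibblePackSketch {
    /** Packs a string of digits '0'..'9' into bytes, two digits per byte. */
    static byte[] pack(String digits) {
        byte[] out = new byte[(digits.length() + 1) / 2];
        for (int i = 0; i < digits.length(); i++) {
            int v = digits.charAt(i) - '0';
            if (v < 0 || v > 9) {
                throw new IllegalArgumentException("not a digit: " + digits.charAt(i));
            }
            if ((i & 1) == 0) {
                out[i >> 1] = (byte) (v << 4); // even index: high nibble (assumed layout)
            } else {
                out[i >> 1] |= (byte) v;       // odd index: low nibble
            }
        }
        return out;
    }

    /** Reverses pack(); count is the number of digits that were packed. */
    static String unpack(byte[] packed, int count) {
        StringBuilder sb = new StringBuilder(count);
        for (int i = 0; i < count; i++) {
            int b = packed[i >> 1] & 0xFF;
            int v = (i & 1) == 0 ? (b >> 4) : (b & 0x0F);
            sb.append((char) ('0' + v));
        }
        return sb.toString();
    }
}
```

Because each value is at most 9, it always fits in the 4 bits reserved for it.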
- parse(String) - Method in class org.apache.lucene.analysis.compound.hyphenation.PatternParser
-
Parses a hyphenation pattern file.
- parse(File) - Method in class org.apache.lucene.analysis.compound.hyphenation.PatternParser
-
Parses a hyphenation pattern file.
- parse(InputSource) - Method in class org.apache.lucene.analysis.compound.hyphenation.PatternParser
-
Parses a hyphenation pattern file.
- PatternConsumer - Interface in org.apache.lucene.analysis.compound.hyphenation
-
This interface is used to connect the XML pattern file parser to the
hyphenation tree.
- PatternParser - Class in org.apache.lucene.analysis.compound.hyphenation
-
A SAX document handler to read and parse hyphenation patterns from an XML
file.
- PatternParser() - Constructor for class org.apache.lucene.analysis.compound.hyphenation.PatternParser
-
- PatternParser(PatternConsumer) - Constructor for class org.apache.lucene.analysis.compound.hyphenation.PatternParser
-
- payAtt - Variable in class org.apache.lucene.analysis.payloads.DelimitedPayloadTokenFilter
-
- payAtt - Variable in class org.apache.lucene.analysis.payloads.TokenOffsetPayloadTokenFilter
-
- PayloadEncoder - Interface in org.apache.lucene.analysis.payloads
-
Mainly for use with the DelimitedPayloadTokenFilter, converts char buffers to Payload.
- PayloadHelper - Class in org.apache.lucene.analysis.payloads
-
Utility methods for encoding payloads.
- PayloadHelper() - Constructor for class org.apache.lucene.analysis.payloads.PayloadHelper
-
- permutationIterator() - Method in class org.apache.lucene.analysis.shingle.ShingleMatrixFilter.Matrix
-
- PersianAnalyzer - Class in org.apache.lucene.analysis.fa
-
Analyzer for Persian.
- PersianAnalyzer() - Constructor for class org.apache.lucene.analysis.fa.PersianAnalyzer
-
- PersianAnalyzer(Version) - Constructor for class org.apache.lucene.analysis.fa.PersianAnalyzer
-
- PersianAnalyzer(String[]) - Constructor for class org.apache.lucene.analysis.fa.PersianAnalyzer
-
- PersianAnalyzer(Version, String[]) - Constructor for class org.apache.lucene.analysis.fa.PersianAnalyzer
-
Builds an analyzer with the given stop words.
- PersianAnalyzer(Hashtable) - Constructor for class org.apache.lucene.analysis.fa.PersianAnalyzer
-
- PersianAnalyzer(Version, Hashtable) - Constructor for class org.apache.lucene.analysis.fa.PersianAnalyzer
-
Builds an analyzer with the given stop words.
- PersianAnalyzer(File) - Constructor for class org.apache.lucene.analysis.fa.PersianAnalyzer
-
- PersianAnalyzer(Version, File) - Constructor for class org.apache.lucene.analysis.fa.PersianAnalyzer
-
Builds an analyzer with the given stop words.
- PersianNormalizationFilter - Class in org.apache.lucene.analysis.fa
-
- PersianNormalizationFilter(TokenStream) - Constructor for class org.apache.lucene.analysis.fa.PersianNormalizationFilter
-
- PersianNormalizer - Class in org.apache.lucene.analysis.fa
-
Normalizer for Persian.
- PersianNormalizer() - Constructor for class org.apache.lucene.analysis.fa.PersianNormalizer
-
- PositionFilter - Class in org.apache.lucene.analysis.position
-
Sets the positionIncrement of all tokens to the configured "positionIncrement",
except the first returned token, which retains its original positionIncrement value.
- PositionFilter(TokenStream) - Constructor for class org.apache.lucene.analysis.position.PositionFilter
-
Constructs a PositionFilter that assigns a position increment of zero to
all but the first token from the given input stream.
- PositionFilter(TokenStream, int) - Constructor for class org.apache.lucene.analysis.position.PositionFilter
-
Constructs a PositionFilter that assigns the given position increment to
all but the first token from the given input stream.
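The effect described for PositionFilter can be illustrated without the Lucene classes. The sketch below works on a plain array of position increments; the class `PositionIncrementSketch` and its `apply` method are hypothetical names, not part of the library.

```java
final class PositionIncrementSketch {
    /**
     * Returns a copy of the increments in which every element except the
     * first is replaced by positionIncrement; the first keeps its value.
     */
    static int[] apply(int[] increments, int positionIncrement) {
        int[] out = increments.clone();
        for (int i = 1; i < out.length; i++) {
            out[i] = positionIncrement;
        }
        return out;
    }
}
```

With an increment of zero, all tokens after the first end up stacked on the position of the first token.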
- postBreak - Variable in class org.apache.lucene.analysis.compound.hyphenation.Hyphen
-
- preBreak - Variable in class org.apache.lucene.analysis.compound.hyphenation.Hyphen
-
- PrefixAndSuffixAwareTokenFilter - Class in org.apache.lucene.analysis.miscellaneous
-
- PrefixAndSuffixAwareTokenFilter(TokenStream, TokenStream, TokenStream) - Constructor for class org.apache.lucene.analysis.miscellaneous.PrefixAndSuffixAwareTokenFilter
-
- PrefixAwareTokenFilter - Class in org.apache.lucene.analysis.miscellaneous
-
Joins two token streams and leaves the last token of the first stream available
to be used when updating the token values in the second stream based on that token.
- PrefixAwareTokenFilter(TokenStream, TokenStream) - Constructor for class org.apache.lucene.analysis.miscellaneous.PrefixAwareTokenFilter
-
- prefixes - Static variable in class org.apache.lucene.analysis.ar.ArabicStemmer
-
- printStats() - Method in class org.apache.lucene.analysis.compound.hyphenation.HyphenationTree
-
- printStats() - Method in class org.apache.lucene.analysis.compound.hyphenation.TernaryTree
-
- PUA_EC00_MARKER - Static variable in class org.apache.lucene.analysis.reverse.ReverseStringFilter
-
Example marker character: U+EC00 (PRIVATE USE AREA: EC00)
- put(int, byte) - Method in class org.apache.lucene.analysis.compound.hyphenation.ByteVector
-
- put(int, char) - Method in class org.apache.lucene.analysis.compound.hyphenation.CharVector
-
- readToken(StringBuffer) - Method in class org.apache.lucene.analysis.compound.hyphenation.PatternParser
-
- reset() - Method in class org.apache.lucene.analysis.cjk.CJKTokenizer
-
- reset(Reader) - Method in class org.apache.lucene.analysis.cjk.CJKTokenizer
-
- reset() - Method in class org.apache.lucene.analysis.cn.ChineseTokenizer
-
- reset(Reader) - Method in class org.apache.lucene.analysis.cn.ChineseTokenizer
-
- reset() - Method in class org.apache.lucene.analysis.compound.CompoundWordTokenFilterBase
-
- reset() - Method in class org.apache.lucene.analysis.miscellaneous.PrefixAndSuffixAwareTokenFilter
-
- reset() - Method in class org.apache.lucene.analysis.miscellaneous.PrefixAwareTokenFilter
-
- reset() - Method in class org.apache.lucene.analysis.miscellaneous.SingleTokenTokenStream
-
- reset() - Method in class org.apache.lucene.analysis.ngram.EdgeNGramTokenFilter
-
- reset(Reader) - Method in class org.apache.lucene.analysis.ngram.EdgeNGramTokenizer
-
- reset() - Method in class org.apache.lucene.analysis.ngram.EdgeNGramTokenizer
-
- reset() - Method in class org.apache.lucene.analysis.ngram.NGramTokenFilter
-
- reset(Reader) - Method in class org.apache.lucene.analysis.ngram.NGramTokenizer
-
- reset() - Method in class org.apache.lucene.analysis.ngram.NGramTokenizer
-
- reset() - Method in class org.apache.lucene.analysis.position.PositionFilter
-
- reset() - Method in class org.apache.lucene.analysis.shingle.ShingleFilter
-
- reset() - Method in class org.apache.lucene.analysis.shingle.ShingleMatrixFilter
-
- reset() - Method in class org.apache.lucene.analysis.sinks.TokenRangeSinkFilter
-
- reset() - Method in class org.apache.lucene.analysis.sinks.TokenRangeSinkTokenizer
-
Deprecated.
- reset() - Method in class org.apache.lucene.analysis.th.ThaiWordFilter
-
- resolveEntity(String, String) - Method in class org.apache.lucene.analysis.compound.hyphenation.PatternParser
-
- reusableTokenStream(String, Reader) - Method in class org.apache.lucene.analysis.ar.ArabicAnalyzer
-
Returns a (possibly reused) TokenStream which tokenizes all the text in the provided Reader.
- reusableTokenStream(String, Reader) - Method in class org.apache.lucene.analysis.br.BrazilianAnalyzer
-
Returns a (possibly reused) TokenStream which tokenizes all the text in the provided Reader.
- reusableTokenStream(String, Reader) - Method in class org.apache.lucene.analysis.cjk.CJKAnalyzer
-
Returns a (possibly reused) TokenStream which tokenizes all the text in the provided Reader.
- reusableTokenStream(String, Reader) - Method in class org.apache.lucene.analysis.cn.ChineseAnalyzer
-
Returns a (possibly reused) TokenStream which tokenizes all the text in the provided Reader.
- reusableTokenStream(String, Reader) - Method in class org.apache.lucene.analysis.cz.CzechAnalyzer
-
Returns a (possibly reused) TokenStream which tokenizes all the text in the provided Reader.
- reusableTokenStream(String, Reader) - Method in class org.apache.lucene.analysis.de.GermanAnalyzer
-
Returns a (possibly reused) TokenStream which tokenizes all the text in the provided Reader.
- reusableTokenStream(String, Reader) - Method in class org.apache.lucene.analysis.el.GreekAnalyzer
-
Returns a (possibly reused) TokenStream which tokenizes all the text in the provided Reader.
- reusableTokenStream(String, Reader) - Method in class org.apache.lucene.analysis.fa.PersianAnalyzer
-
Returns a (possibly reused) TokenStream which tokenizes all the text in the provided Reader.
- reusableTokenStream(String, Reader) - Method in class org.apache.lucene.analysis.fr.FrenchAnalyzer
-
Returns a (possibly reused) TokenStream which tokenizes all the text in the provided Reader.
- reusableTokenStream(String, Reader) - Method in class org.apache.lucene.analysis.nl.DutchAnalyzer
-
Returns a (possibly reused) TokenStream which tokenizes all the text in the provided Reader.
- reusableTokenStream(String, Reader) - Method in class org.apache.lucene.analysis.query.QueryAutoStopWordAnalyzer
-
- reusableTokenStream(String, Reader) - Method in class org.apache.lucene.analysis.ru.RussianAnalyzer
-
Returns a (possibly reused) TokenStream which tokenizes all the text in the provided Reader.
- reusableTokenStream(String, Reader) - Method in class org.apache.lucene.analysis.shingle.ShingleAnalyzerWrapper
-
- reusableTokenStream(String, Reader) - Method in class org.apache.lucene.analysis.th.ThaiAnalyzer
-
- reverse(String) - Static method in class org.apache.lucene.analysis.reverse.ReverseStringFilter
-
- reverse(char[]) - Static method in class org.apache.lucene.analysis.reverse.ReverseStringFilter
-
- reverse(char[], int) - Static method in class org.apache.lucene.analysis.reverse.ReverseStringFilter
-
- reverse(char[], int, int) - Static method in class org.apache.lucene.analysis.reverse.ReverseStringFilter
-
- ReverseStringFilter - Class in org.apache.lucene.analysis.reverse
-
Reverse token string, for example "country" => "yrtnuoc".
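The "country" => "yrtnuoc" example above amounts to an in-place swap over a character buffer. The following is a minimal sketch of that idea; `ReverseSketch` is a hypothetical class, and unlike a production filter it makes no attempt to keep surrogate pairs in order.

```java
final class ReverseSketch {
    /** Reverses len characters of buf in place, starting at offset start. */
    static void reverse(char[] buf, int start, int len) {
        int i = start;
        int j = start + len - 1;
        while (i < j) {
            char tmp = buf[i];
            buf[i++] = buf[j];
            buf[j--] = tmp;
        }
    }

    /** Convenience wrapper that reverses a whole string. */
    static String reverse(String s) {
        char[] buf = s.toCharArray();
        reverse(buf, 0, buf.length);
        return new String(buf);
    }
}
```

Indexing reversed tokens is commonly used to turn leading-wildcard queries into prefix queries.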
- ReverseStringFilter(TokenStream) - Constructor for class org.apache.lucene.analysis.reverse.ReverseStringFilter
-
Create a new ReverseStringFilter that reverses all tokens in the
supplied TokenStream.
- ReverseStringFilter(TokenStream, char) - Constructor for class org.apache.lucene.analysis.reverse.ReverseStringFilter
-
Create a new ReverseStringFilter that reverses and marks all tokens in the
supplied TokenStream.
- rewind() - Method in class org.apache.lucene.analysis.compound.hyphenation.TernaryTree.Iterator
-
- root - Variable in class org.apache.lucene.analysis.compound.hyphenation.TernaryTree
-
- RTL_DIRECTION_MARKER - Static variable in class org.apache.lucene.analysis.reverse.ReverseStringFilter
-
Example marker character: U+200F (RIGHT-TO-LEFT MARK)
- RussianAnalyzer - Class in org.apache.lucene.analysis.ru
-
Analyzer for the Russian language.
- RussianAnalyzer() - Constructor for class org.apache.lucene.analysis.ru.RussianAnalyzer
-
- RussianAnalyzer(Version) - Constructor for class org.apache.lucene.analysis.ru.RussianAnalyzer
-
- RussianAnalyzer(char[]) - Constructor for class org.apache.lucene.analysis.ru.RussianAnalyzer
-
- RussianAnalyzer(char[], String[]) - Constructor for class org.apache.lucene.analysis.ru.RussianAnalyzer
-
- RussianAnalyzer(String[]) - Constructor for class org.apache.lucene.analysis.ru.RussianAnalyzer
-
- RussianAnalyzer(Version, String[]) - Constructor for class org.apache.lucene.analysis.ru.RussianAnalyzer
-
Builds an analyzer with the given stop words.
- RussianAnalyzer(char[], Map) - Constructor for class org.apache.lucene.analysis.ru.RussianAnalyzer
-
- RussianAnalyzer(Map) - Constructor for class org.apache.lucene.analysis.ru.RussianAnalyzer
-
- RussianAnalyzer(Version, Map) - Constructor for class org.apache.lucene.analysis.ru.RussianAnalyzer
-
Builds an analyzer with the given stop words.
- RussianCharsets - Class in org.apache.lucene.analysis.ru
-
Deprecated.
Support for non-Unicode encodings will be removed in Lucene 3.0
- RussianCharsets() - Constructor for class org.apache.lucene.analysis.ru.RussianCharsets
-
Deprecated.
- RussianLetterTokenizer - Class in org.apache.lucene.analysis.ru
-
A RussianLetterTokenizer is a Tokenizer that extends LetterTokenizer
by additionally looking up letters in a given "russian charset".
- RussianLetterTokenizer(Reader, char[]) - Constructor for class org.apache.lucene.analysis.ru.RussianLetterTokenizer
-
- RussianLetterTokenizer(Reader) - Constructor for class org.apache.lucene.analysis.ru.RussianLetterTokenizer
-
- RussianLetterTokenizer(AttributeSource, Reader) - Constructor for class org.apache.lucene.analysis.ru.RussianLetterTokenizer
-
- RussianLetterTokenizer(AttributeSource.AttributeFactory, Reader) - Constructor for class org.apache.lucene.analysis.ru.RussianLetterTokenizer
-
- RussianLowerCaseFilter - Class in org.apache.lucene.analysis.ru
-
Normalizes token text to lower case, analyzing given ("russian") charset.
- RussianLowerCaseFilter(TokenStream, char[]) - Constructor for class org.apache.lucene.analysis.ru.RussianLowerCaseFilter
-
- RussianLowerCaseFilter(TokenStream) - Constructor for class org.apache.lucene.analysis.ru.RussianLowerCaseFilter
-
- RussianStemFilter - Class in org.apache.lucene.analysis.ru
-
A TokenFilter that stems Russian words.
- RussianStemFilter(TokenStream, char[]) - Constructor for class org.apache.lucene.analysis.ru.RussianStemFilter
-
- RussianStemFilter(TokenStream) - Constructor for class org.apache.lucene.analysis.ru.RussianStemFilter
-
- sameRow - Static variable in class org.apache.lucene.analysis.shingle.ShingleMatrixFilter.TokenPositioner
-
- sc - Variable in class org.apache.lucene.analysis.compound.hyphenation.TernaryTree
-
The character stored in this node: splitchar.
- searchPatterns(char[], int, byte[]) - Method in class org.apache.lucene.analysis.compound.hyphenation.HyphenationTree
-
Search for all possible partial matches of word starting at index and update
interletter values.
- setArticles(Set) - Method in class org.apache.lucene.analysis.fr.ElisionFilter
-
- setConsumer(PatternConsumer) - Method in class org.apache.lucene.analysis.compound.hyphenation.PatternParser
-
- setExclusionSet(Set) - Method in class org.apache.lucene.analysis.de.GermanStemFilter
-
Set an alternative exclusion list for this filter.
- setExclusionTable(Map) - Method in class org.apache.lucene.analysis.fr.FrenchStemFilter
-
Set an alternative exclusion list for this filter.
- setExclusionTable(HashSet) - Method in class org.apache.lucene.analysis.nl.DutchStemFilter
-
Set an alternative exclusion list for this filter.
- setFirst(boolean) - Method in class org.apache.lucene.analysis.shingle.ShingleMatrixFilter.Matrix.Column
-
- setIgnoringSinglePrefixOrSuffixShingle(boolean) - Method in class org.apache.lucene.analysis.shingle.ShingleMatrixFilter
-
- setLast(boolean) - Method in class org.apache.lucene.analysis.shingle.ShingleMatrixFilter.Matrix.Column
-
- setMatrix(ShingleMatrixFilter.Matrix) - Method in class org.apache.lucene.analysis.shingle.ShingleMatrixFilter
-
- setMaximumShingleSize(int) - Method in class org.apache.lucene.analysis.shingle.ShingleMatrixFilter
-
- setMaxShingleSize(int) - Method in class org.apache.lucene.analysis.shingle.ShingleAnalyzerWrapper
-
Set the maximum size of output shingles
- setMaxShingleSize(int) - Method in class org.apache.lucene.analysis.shingle.ShingleFilter
-
Set the max shingle size (default: 2)
- setMinimumShingleSize(int) - Method in class org.apache.lucene.analysis.shingle.ShingleMatrixFilter
-
- setOutputUnigrams(boolean) - Method in class org.apache.lucene.analysis.shingle.ShingleAnalyzerWrapper
-
Shall the filter pass the original tokens (the "unigrams") to the output
stream?
- setOutputUnigrams(boolean) - Method in class org.apache.lucene.analysis.shingle.ShingleFilter
-
Shall the output stream contain the input tokens (unigrams) as well as
shingles? (default: true.)
- setPrefix(TokenStream) - Method in class org.apache.lucene.analysis.miscellaneous.PrefixAwareTokenFilter
-
- setSpacerCharacter(Character) - Method in class org.apache.lucene.analysis.shingle.ShingleMatrixFilter
-
- setStemDictionary(File) - Method in class org.apache.lucene.analysis.nl.DutchAnalyzer
-
Reads a stem dictionary file that overrules the stemming algorithm.
This is a text file that contains, per line,
word\tstem, i.e. two tab-separated words.
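The word\tstem line format described above is simple to parse. The sketch below shows one way to do it in plain Java; `StemDictionarySketch` and `parse` are hypothetical names, the sample words in the usage note are made up, and skipping malformed lines is an assumption rather than documented DutchAnalyzer behaviour.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class StemDictionarySketch {
    /** Builds a word-to-stem map from lines of the form "word<TAB>stem". */
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> dict = new HashMap<String, String>();
        for (String line : lines) {
            String[] parts = line.split("\t");
            if (parts.length == 2) {          // assumption: ignore malformed lines
                dict.put(parts[0], parts[1]);
            }
        }
        return dict;
    }
}
```

The lines could come from `java.nio.file.Files.readAllLines(path)`; for example, a line `fietsen\tfiets` would map "fietsen" to "fiets".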
- setStemDictionary(HashMap) - Method in class org.apache.lucene.analysis.nl.DutchStemFilter
-
Set dictionary for stemming, this dictionary overrules the algorithm,
so you can correct for a particular unwanted word-stem pair.
- setStemExclusionTable(String[]) - Method in class org.apache.lucene.analysis.br.BrazilianAnalyzer
-
Builds an exclusionlist from an array of Strings.
- setStemExclusionTable(Map) - Method in class org.apache.lucene.analysis.br.BrazilianAnalyzer
-
Builds an exclusionlist from a Map.
- setStemExclusionTable(File) - Method in class org.apache.lucene.analysis.br.BrazilianAnalyzer
-
Builds an exclusionlist from the words contained in the given file.
- setStemExclusionTable(String[]) - Method in class org.apache.lucene.analysis.de.GermanAnalyzer
-
Builds an exclusionlist from an array of Strings.
- setStemExclusionTable(Map) - Method in class org.apache.lucene.analysis.de.GermanAnalyzer
-
Builds an exclusionlist from a Map.
- setStemExclusionTable(File) - Method in class org.apache.lucene.analysis.de.GermanAnalyzer
-
Builds an exclusionlist from the words contained in the given file.
- setStemExclusionTable(String[]) - Method in class org.apache.lucene.analysis.fr.FrenchAnalyzer
-
Builds an exclusionlist from an array of Strings.
- setStemExclusionTable(Map) - Method in class org.apache.lucene.analysis.fr.FrenchAnalyzer
-
Builds an exclusionlist from a Map.
- setStemExclusionTable(File) - Method in class org.apache.lucene.analysis.fr.FrenchAnalyzer
-
Builds an exclusionlist from the words contained in the given file.
- setStemExclusionTable(String[]) - Method in class org.apache.lucene.analysis.nl.DutchAnalyzer
-
Builds an exclusionlist from an array of Strings.
- setStemExclusionTable(HashSet) - Method in class org.apache.lucene.analysis.nl.DutchAnalyzer
-
Builds an exclusionlist from a HashSet.
- setStemExclusionTable(File) - Method in class org.apache.lucene.analysis.nl.DutchAnalyzer
-
Builds an exclusionlist from the words contained in the given file.
- setStemmer(GermanStemmer) - Method in class org.apache.lucene.analysis.de.GermanStemFilter
-
- setStemmer(FrenchStemmer) - Method in class org.apache.lucene.analysis.fr.FrenchStemFilter
-
- setStemmer(DutchStemmer) - Method in class org.apache.lucene.analysis.nl.DutchStemFilter
-
- setStemmer(RussianStemmer) - Method in class org.apache.lucene.analysis.ru.RussianStemFilter
-
Set an alternative/custom RussianStemmer for this filter.
- setSuffix(TokenStream) - Method in class org.apache.lucene.analysis.miscellaneous.PrefixAwareTokenFilter
-
- setToken(Token) - Method in class org.apache.lucene.analysis.miscellaneous.SingleTokenTokenStream
-
- setTokenPositioner(Token, ShingleMatrixFilter.TokenPositioner) - Method in class org.apache.lucene.analysis.shingle.ShingleMatrixFilter.OneDimensionalNonWeightedTokenSettingsCodec
-
- setTokenPositioner(Token, ShingleMatrixFilter.TokenPositioner) - Method in class org.apache.lucene.analysis.shingle.ShingleMatrixFilter.SimpleThreeDimensionalTokenSettingsCodec
-
Sets the TokenPositioner as token flags int value.
- setTokenPositioner(Token, ShingleMatrixFilter.TokenPositioner) - Method in class org.apache.lucene.analysis.shingle.ShingleMatrixFilter.TokenSettingsCodec
-
- setTokenPositioner(Token, ShingleMatrixFilter.TokenPositioner) - Method in class org.apache.lucene.analysis.shingle.ShingleMatrixFilter.TwoDimensionalNonWeightedSynonymTokenSettingsCodec
-
- setTokens(List) - Method in class org.apache.lucene.analysis.shingle.ShingleMatrixFilter.Matrix.Column.Row
-
- setTokenType(String) - Method in class org.apache.lucene.analysis.shingle.ShingleFilter
-
Set the type of the shingle tokens produced by this filter.
- setWeight(Token, float) - Method in class org.apache.lucene.analysis.shingle.ShingleMatrixFilter.OneDimensionalNonWeightedTokenSettingsCodec
-
- setWeight(Token, float) - Method in class org.apache.lucene.analysis.shingle.ShingleMatrixFilter.SimpleThreeDimensionalTokenSettingsCodec
-
Stores a 32-bit float in the payload, or sets it to null if the weight is 1f.
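Storing a float weight as a 4-byte payload, with null standing in for the default weight of 1f, might look like the sketch below. `WeightPayloadSketch` is a hypothetical class and the big-endian byte order is an assumption; the codec's actual encoding is not stated in this index.

```java
final class WeightPayloadSketch {
    /** Encodes a weight as 4 bytes, or returns null for the default weight 1f. */
    static byte[] encode(float weight) {
        if (weight == 1f) {
            return null; // default weight needs no payload
        }
        int bits = Float.floatToIntBits(weight);
        return new byte[] {
            (byte) (bits >>> 24), (byte) (bits >>> 16), (byte) (bits >>> 8), (byte) bits
        };
    }

    /** Decodes a payload produced by encode(); null means weight 1f. */
    static float decode(byte[] payload) {
        if (payload == null) {
            return 1f;
        }
        int bits = ((payload[0] & 0xFF) << 24) | ((payload[1] & 0xFF) << 16)
                 | ((payload[2] & 0xFF) << 8) | (payload[3] & 0xFF);
        return Float.intBitsToFloat(bits);
    }
}
```

Omitting the payload for the common weight of 1f keeps unweighted tokens free of extra bytes.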
- setWeight(Token, float) - Method in class org.apache.lucene.analysis.shingle.ShingleMatrixFilter.TokenSettingsCodec
-
Have this method do nothing in order to 'disable' weights.
- setWeight(Token, float) - Method in class org.apache.lucene.analysis.shingle.ShingleMatrixFilter.TwoDimensionalNonWeightedSynonymTokenSettingsCodec
-
- SHADDA - Static variable in class org.apache.lucene.analysis.ar.ArabicNormalizer
-
- ShingleAnalyzerWrapper - Class in org.apache.lucene.analysis.shingle
-
A ShingleAnalyzerWrapper wraps a ShingleFilter around another Analyzer.
- ShingleAnalyzerWrapper(Analyzer) - Constructor for class org.apache.lucene.analysis.shingle.ShingleAnalyzerWrapper
-
- ShingleAnalyzerWrapper(Analyzer, int) - Constructor for class org.apache.lucene.analysis.shingle.ShingleAnalyzerWrapper
-
- ShingleAnalyzerWrapper() - Constructor for class org.apache.lucene.analysis.shingle.ShingleAnalyzerWrapper
-
Wraps StandardAnalyzer.
- ShingleAnalyzerWrapper(int) - Constructor for class org.apache.lucene.analysis.shingle.ShingleAnalyzerWrapper
-
- ShingleFilter - Class in org.apache.lucene.analysis.shingle
-
A ShingleFilter constructs shingles (token n-grams) from a token stream.
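To make "shingles (token n-grams)" concrete, the sketch below builds them from a plain list of strings. It does not use the Lucene API: `ShingleSketch` is a hypothetical class, the separator is passed in by the caller, and the output order (each unigram followed by the shingles that start at it) is a choice made for this example.

```java
import java.util.ArrayList;
import java.util.List;

final class ShingleSketch {
    /**
     * Emits, for every start position, the token itself (if outputUnigrams)
     * and all shingles of 2..maxSize adjacent tokens joined by separator.
     */
    static List<String> shingles(List<String> tokens, int maxSize,
                                 boolean outputUnigrams, String separator) {
        List<String> out = new ArrayList<String>();
        for (int start = 0; start < tokens.size(); start++) {
            StringBuilder sb = new StringBuilder();
            for (int n = 1; n <= maxSize && start + n <= tokens.size(); n++) {
                if (n > 1) {
                    sb.append(separator);
                }
                sb.append(tokens.get(start + n - 1));
                if (n > 1 || outputUnigrams) {
                    out.add(sb.toString());
                }
            }
        }
        return out;
    }
}
```

For the tokens "please", "divide", "this" with a maximum size of 2, unigrams enabled, and a single space as separator, this yields "please", "please divide", "divide", "divide this", "this".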
- ShingleFilter(TokenStream, int) - Constructor for class org.apache.lucene.analysis.shingle.ShingleFilter
-
Constructs a ShingleFilter with the specified shingle size from the
TokenStream input.
- ShingleFilter(TokenStream) - Constructor for class org.apache.lucene.analysis.shingle.ShingleFilter
-
Construct a ShingleFilter with default shingle size.
- ShingleFilter(TokenStream, String) - Constructor for class org.apache.lucene.analysis.shingle.ShingleFilter
-
Construct a ShingleFilter with the specified token type for shingle tokens.
- ShingleMatrixFilter - Class in org.apache.lucene.analysis.shingle
-
A ShingleMatrixFilter constructs shingles (token n-grams) from a token stream.
- ShingleMatrixFilter(ShingleMatrixFilter.Matrix, int, int, Character, boolean, ShingleMatrixFilter.TokenSettingsCodec) - Constructor for class org.apache.lucene.analysis.shingle.ShingleMatrixFilter
-
Creates a shingle filter based on a user defined matrix.
- ShingleMatrixFilter(TokenStream, int, int) - Constructor for class org.apache.lucene.analysis.shingle.ShingleMatrixFilter
-
Creates a shingle filter using default settings.
- ShingleMatrixFilter(TokenStream, int, int, Character) - Constructor for class org.apache.lucene.analysis.shingle.ShingleMatrixFilter
-
Creates a shingle filter using default settings.
- ShingleMatrixFilter(TokenStream, int, int, Character, boolean) - Constructor for class org.apache.lucene.analysis.shingle.ShingleMatrixFilter
-
- ShingleMatrixFilter(TokenStream, int, int, Character, boolean, ShingleMatrixFilter.TokenSettingsCodec) - Constructor for class org.apache.lucene.analysis.shingle.ShingleMatrixFilter
-
Creates a shingle filter with ad hoc parameter settings.
- ShingleMatrixFilter.Matrix - Class in org.apache.lucene.analysis.shingle
-
A column focused matrix in three dimensions:
- ShingleMatrixFilter.Matrix() - Constructor for class org.apache.lucene.analysis.shingle.ShingleMatrixFilter.Matrix
-
- ShingleMatrixFilter.Matrix.Column - Class in org.apache.lucene.analysis.shingle
-
- ShingleMatrixFilter.Matrix.Column(Token) - Constructor for class org.apache.lucene.analysis.shingle.ShingleMatrixFilter.Matrix.Column
-
- ShingleMatrixFilter.Matrix.Column() - Constructor for class org.apache.lucene.analysis.shingle.ShingleMatrixFilter.Matrix.Column
-
- ShingleMatrixFilter.Matrix.Column.Row - Class in org.apache.lucene.analysis.shingle
-
- ShingleMatrixFilter.Matrix.Column.Row() - Constructor for class org.apache.lucene.analysis.shingle.ShingleMatrixFilter.Matrix.Column.Row
-
- ShingleMatrixFilter.OneDimensionalNonWeightedTokenSettingsCodec - Class in org.apache.lucene.analysis.shingle
-
- ShingleMatrixFilter.OneDimensionalNonWeightedTokenSettingsCodec() - Constructor for class org.apache.lucene.analysis.shingle.ShingleMatrixFilter.OneDimensionalNonWeightedTokenSettingsCodec
-
- ShingleMatrixFilter.SimpleThreeDimensionalTokenSettingsCodec - Class in org.apache.lucene.analysis.shingle
-
A full featured codec not to be used for something serious.
- ShingleMatrixFilter.SimpleThreeDimensionalTokenSettingsCodec() - Constructor for class org.apache.lucene.analysis.shingle.ShingleMatrixFilter.SimpleThreeDimensionalTokenSettingsCodec
-
- ShingleMatrixFilter.TokenPositioner - Class in org.apache.lucene.analysis.shingle
-
- ShingleMatrixFilter.TokenSettingsCodec - Class in org.apache.lucene.analysis.shingle
-
Strategy used to code and decode meta data of the tokens from the input stream
regarding how to position the tokens in the matrix, set and retrieve weight, etc.
- ShingleMatrixFilter.TokenSettingsCodec() - Constructor for class org.apache.lucene.analysis.shingle.ShingleMatrixFilter.TokenSettingsCodec
-
- ShingleMatrixFilter.TwoDimensionalNonWeightedSynonymTokenSettingsCodec - Class in org.apache.lucene.analysis.shingle
-
A codec that creates a two dimensional matrix
by treating tokens from the input stream with 0 position increment
as new rows to the current column.
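The column and row rule described above can be sketched with plain collections: a token with a position increment of 0 joins the current column as another row, while any other token opens a new column. `SynonymColumnsSketch` and `toColumns` are hypothetical names, and starting a column for a leading zero-increment token is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

final class SynonymColumnsSketch {
    /** Groups terms into columns; zero-increment terms become rows of the current column. */
    static List<List<String>> toColumns(String[] terms, int[] positionIncrements) {
        List<List<String>> columns = new ArrayList<List<String>>();
        for (int i = 0; i < terms.length; i++) {
            if (columns.isEmpty() || positionIncrements[i] > 0) {
                columns.add(new ArrayList<String>()); // new position: new column
            }
            columns.get(columns.size() - 1).add(terms[i]);
        }
        return columns;
    }
}
```

Synonyms injected at the same position as the original token therefore end up side by side in one column.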
- ShingleMatrixFilter.TwoDimensionalNonWeightedSynonymTokenSettingsCodec() - Constructor for class org.apache.lucene.analysis.shingle.ShingleMatrixFilter.TwoDimensionalNonWeightedSynonymTokenSettingsCodec
-
- SingleTokenTokenStream - Class in org.apache.lucene.analysis.miscellaneous
-
A TokenStream containing a single token.
- SingleTokenTokenStream(Token) - Constructor for class org.apache.lucene.analysis.miscellaneous.SingleTokenTokenStream
-
- size() - Method in class org.apache.lucene.analysis.compound.hyphenation.TernaryTree
-
- START_OF_HEADING_MARKER - Static variable in class org.apache.lucene.analysis.reverse.ReverseStringFilter
-
Example marker character: U+0001 (START OF HEADING)
- startElement(String, String, String, Attributes) - Method in class org.apache.lucene.analysis.compound.hyphenation.PatternParser
-
- stem(char[], int) - Method in class org.apache.lucene.analysis.ar.ArabicStemmer
-
Stem an input buffer of Arabic text.
- stem(String) - Method in class org.apache.lucene.analysis.br.BrazilianStemmer
-
Stems the given term to a unique discriminator.
- stem(String) - Method in class org.apache.lucene.analysis.de.GermanStemmer
-
Stems the given term to a unique discriminator.
- stem(String) - Method in class org.apache.lucene.analysis.fr.FrenchStemmer
-
Stems the given term to a unique discriminator.
- stem(String) - Method in class org.apache.lucene.analysis.nl.DutchStemmer
-
- stemmer - Variable in class org.apache.lucene.analysis.ar.ArabicStemFilter
-
- stemPrefix(char[], int) - Method in class org.apache.lucene.analysis.ar.ArabicStemmer
-
Stem a prefix off an Arabic word.
- stemSuffix(char[], int) - Method in class org.apache.lucene.analysis.ar.ArabicStemmer
-
Stem suffix(es) off an Arabic word.
- STOP_WORDS - Static variable in class org.apache.lucene.analysis.cjk.CJKAnalyzer
-
An array containing some common English words that are not usually
useful for searching and some double-byte punctuation marks.
- STOP_WORDS - Static variable in class org.apache.lucene.analysis.cn.ChineseFilter
-
- stoplist - Variable in class org.apache.lucene.analysis.compound.hyphenation.HyphenationTree
-
This map stores hyphenation exceptions
- STOPWORDS_COMMENT - Static variable in class org.apache.lucene.analysis.ar.ArabicAnalyzer
-
The comment character in the stopwords file.
- STOPWORDS_COMMENT - Static variable in class org.apache.lucene.analysis.fa.PersianAnalyzer
-
The comment character in the stopwords file.
- strcmp(char[], int, char[], int) - Static method in class org.apache.lucene.analysis.compound.hyphenation.TernaryTree
-
Compares two null-terminated char arrays.
- strcmp(String, char[], int) - Static method in class org.apache.lucene.analysis.compound.hyphenation.TernaryTree
-
Compares a string with a null-terminated char array.
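Comparing null-terminated char arrays follows the C `strcmp` convention: walk both arrays until the characters differ or a 0 terminator is reached. The sketch below illustrates the idea under the assumption that both arrays really contain a terminator; `NullTerminatedSketch` is a hypothetical class, not the TernaryTree code.

```java
final class NullTerminatedSketch {
    /**
     * Compares the null-terminated sequences starting at a[startA] and b[startB].
     * Returns 0 if equal, a negative value if a sorts first, positive otherwise.
     */
    static int strcmp(char[] a, int startA, char[] b, int startB) {
        while (a[startA] == b[startB]) {
            if (a[startA] == 0) {
                return 0; // both reached the terminator together
            }
            startA++;
            startB++;
        }
        return a[startA] - b[startB];
    }
}
```

A shorter key sorts before any longer key that it prefixes, because the terminator 0 is smaller than every other character.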
- strcpy(char[], int, char[], int) - Static method in class org.apache.lucene.analysis.compound.hyphenation.TernaryTree
-
- strlen(char[], int) - Static method in class org.apache.lucene.analysis.compound.hyphenation.TernaryTree
-
- strlen(char[]) - Static method in class org.apache.lucene.analysis.compound.hyphenation.TernaryTree
-
- suffixes - Static variable in class org.apache.lucene.analysis.ar.ArabicStemmer
-
- SUKUN - Static variable in class org.apache.lucene.analysis.ar.ArabicNormalizer
-
- TATWEEL - Static variable in class org.apache.lucene.analysis.ar.ArabicNormalizer
-
- TEH - Static variable in class org.apache.lucene.analysis.ar.ArabicStemmer
-
- TEH_MARBUTA - Static variable in class org.apache.lucene.analysis.ar.ArabicNormalizer
-
- TEH_MARBUTA - Static variable in class org.apache.lucene.analysis.ar.ArabicStemmer
-
- termAtt - Variable in class org.apache.lucene.analysis.payloads.DelimitedPayloadTokenFilter
-
- termAtt - Variable in class org.apache.lucene.analysis.sinks.DateRecognizerSinkFilter
-
- TernaryTree - Class in org.apache.lucene.analysis.compound.hyphenation
-
Ternary Search Tree.
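For readers unfamiliar with the data structure, a ternary search tree stores one character (the split character) per node, with a low, equal, and high child. The sketch below is a minimal object-based version; `TernarySearchTreeSketch` is hypothetical, and its node layout and method names are illustrative only, not those of this TernaryTree class.

```java
final class TernarySearchTreeSketch {
    private static final class Node {
        final char splitChar;
        Node lo, eq, hi;
        boolean endOfKey;
        Node(char c) { splitChar = c; }
    }

    private Node root;

    /** Inserts a non-empty key; empty keys are ignored. */
    void insert(String key) {
        if (!key.isEmpty()) {
            root = insert(root, key, 0);
        }
    }

    private Node insert(Node n, String key, int i) {
        char c = key.charAt(i);
        if (n == null) {
            n = new Node(c);
        }
        if (c < n.splitChar) {
            n.lo = insert(n.lo, key, i);
        } else if (c > n.splitChar) {
            n.hi = insert(n.hi, key, i);
        } else if (i + 1 < key.length()) {
            n.eq = insert(n.eq, key, i + 1); // matched: advance to next character
        } else {
            n.endOfKey = true;
        }
        return n;
    }

    /** Returns true only if key was inserted as a complete key. */
    boolean contains(String key) {
        if (key.isEmpty()) {
            return false;
        }
        Node n = root;
        int i = 0;
        while (n != null) {
            char c = key.charAt(i);
            if (c < n.splitChar) {
                n = n.lo;
            } else if (c > n.splitChar) {
                n = n.hi;
            } else if (++i == key.length()) {
                return n.endOfKey;
            } else {
                n = n.eq;
            }
        }
        return false;
    }
}
```

Keys that share a prefix share the nodes along the equal-child path, which keeps pattern sets with many common prefixes compact.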
- TernaryTree.Iterator - Class in org.apache.lucene.analysis.compound.hyphenation
-
- TernaryTree.Iterator() - Constructor for class org.apache.lucene.analysis.compound.hyphenation.TernaryTree.Iterator
-
- ThaiAnalyzer - Class in org.apache.lucene.analysis.th
-
Analyzer for the Thai language.
- ThaiAnalyzer() - Constructor for class org.apache.lucene.analysis.th.ThaiAnalyzer
-
- ThaiAnalyzer(Version) - Constructor for class org.apache.lucene.analysis.th.ThaiAnalyzer
-
- ThaiWordFilter - Class in org.apache.lucene.analysis.th
-
TokenFilter that uses BreakIterator to break each
Token that is Thai into separate Token(s) for each Thai word.
- ThaiWordFilter(TokenStream) - Constructor for class org.apache.lucene.analysis.th.ThaiWordFilter
-
- TOKEN_SEPARATOR - Static variable in class org.apache.lucene.analysis.shingle.ShingleFilter
-
The string to use when joining adjacent tokens to form a shingle
- TokenOffsetPayloadTokenFilter - Class in org.apache.lucene.analysis.payloads
-
Adds the Token.setStartOffset(int) and Token.setEndOffset(int)
First 4 bytes are the start
- TokenOffsetPayloadTokenFilter(TokenStream) - Constructor for class org.apache.lucene.analysis.payloads.TokenOffsetPayloadTokenFilter
-
- TokenRangeSinkFilter - Class in org.apache.lucene.analysis.sinks
-
Counts the tokens as they go by and saves to the internal list those in the range from lower to upper, exclusive of upper.
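The counting rule above (keep tokens from lower up to, but not including, upper) can be sketched with a simple loop. `TokenRangeSketch` and `collect` are hypothetical names, and counting from zero is an assumption of this example.

```java
import java.util.ArrayList;
import java.util.List;

final class TokenRangeSketch {
    /** Collects the tokens whose running count c satisfies lower <= c < upper. */
    static List<String> collect(Iterable<String> tokens, int lower, int upper) {
        List<String> sink = new ArrayList<String>();
        int count = 0; // assumption: the first token has count 0
        for (String token : tokens) {
            if (count >= lower && count < upper) {
                sink.add(token);
            }
            count++;
        }
        return sink;
    }
}
```

A half-open range means that adjacent ranges such as [0, 2) and [2, 4) never capture the same token twice.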
- TokenRangeSinkFilter(int, int) - Constructor for class org.apache.lucene.analysis.sinks.TokenRangeSinkFilter
-
- TokenRangeSinkTokenizer - Class in org.apache.lucene.analysis.sinks
-
- TokenRangeSinkTokenizer(int, int) - Constructor for class org.apache.lucene.analysis.sinks.TokenRangeSinkTokenizer
-
Deprecated.
- TokenRangeSinkTokenizer(int, int, int) - Constructor for class org.apache.lucene.analysis.sinks.TokenRangeSinkTokenizer
-
Deprecated.
- tokens - Variable in class org.apache.lucene.analysis.compound.CompoundWordTokenFilterBase
-
- tokenStream(String, Reader) - Method in class org.apache.lucene.analysis.ar.ArabicAnalyzer
-
Creates a TokenStream which tokenizes all the text in the provided Reader.
- tokenStream(String, Reader) - Method in class org.apache.lucene.analysis.br.BrazilianAnalyzer
-
Creates a TokenStream which tokenizes all the text in the provided Reader.
- tokenStream(String, Reader) - Method in class org.apache.lucene.analysis.cjk.CJKAnalyzer
-
Creates a TokenStream which tokenizes all the text in the provided Reader.
- tokenStream(String, Reader) - Method in class org.apache.lucene.analysis.cn.ChineseAnalyzer
-
Creates a TokenStream which tokenizes all the text in the provided Reader.
- tokenStream(String, Reader) - Method in class org.apache.lucene.analysis.cz.CzechAnalyzer
-
Creates a TokenStream which tokenizes all the text in the provided Reader.
- tokenStream(String, Reader) - Method in class org.apache.lucene.analysis.de.GermanAnalyzer
-
Creates a TokenStream which tokenizes all the text in the provided Reader.
- tokenStream(String, Reader) - Method in class org.apache.lucene.analysis.el.GreekAnalyzer
-
Creates a TokenStream which tokenizes all the text in the provided Reader.
- tokenStream(String, Reader) - Method in class org.apache.lucene.analysis.fa.PersianAnalyzer
-
Creates a TokenStream which tokenizes all the text in the provided Reader.
- tokenStream(String, Reader) - Method in class org.apache.lucene.analysis.fr.FrenchAnalyzer
-
Creates a TokenStream which tokenizes all the text in the provided Reader.
- tokenStream(String, Reader) - Method in class org.apache.lucene.analysis.nl.DutchAnalyzer
-
Creates a TokenStream which tokenizes all the text in the provided Reader.
- tokenStream(String, Reader) - Method in class org.apache.lucene.analysis.query.QueryAutoStopWordAnalyzer
-
- tokenStream(String, Reader) - Method in class org.apache.lucene.analysis.ru.RussianAnalyzer
-
Creates a TokenStream which tokenizes all the text in the provided Reader.
- tokenStream(String, Reader) - Method in class org.apache.lucene.analysis.shingle.ShingleAnalyzerWrapper
-
- tokenStream(String, Reader) - Method in class org.apache.lucene.analysis.th.ThaiAnalyzer
-
- TokenTypeSinkFilter - Class in org.apache.lucene.analysis.sinks
-
- TokenTypeSinkFilter(String) - Constructor for class org.apache.lucene.analysis.sinks.TokenTypeSinkFilter
-
- TokenTypeSinkTokenizer - Class in org.apache.lucene.analysis.sinks
-
- TokenTypeSinkTokenizer(String) - Constructor for class org.apache.lucene.analysis.sinks.TokenTypeSinkTokenizer
-
Deprecated.
- TokenTypeSinkTokenizer(int, String) - Constructor for class org.apache.lucene.analysis.sinks.TokenTypeSinkTokenizer
-
Deprecated.
- TokenTypeSinkTokenizer(List, String) - Constructor for class org.apache.lucene.analysis.sinks.TokenTypeSinkTokenizer
-
Deprecated.
- toLowerCase(char, char[]) - Static method in class org.apache.lucene.analysis.el.GreekCharsets
-
Deprecated.
- toLowerCase(char, char[]) - Static method in class org.apache.lucene.analysis.ru.RussianCharsets
-
Deprecated.
- toString() - Method in class org.apache.lucene.analysis.compound.hyphenation.Hyphen
-
- toString() - Method in class org.apache.lucene.analysis.shingle.ShingleMatrixFilter.Matrix.Column.Row
-
- toString() - Method in class org.apache.lucene.analysis.shingle.ShingleMatrixFilter.Matrix.Column
-
- toString() - Method in class org.apache.lucene.analysis.shingle.ShingleMatrixFilter.Matrix
-
- trimToSize() - Method in class org.apache.lucene.analysis.compound.hyphenation.ByteVector
-
- trimToSize() - Method in class org.apache.lucene.analysis.compound.hyphenation.CharVector
-
- trimToSize() - Method in class org.apache.lucene.analysis.compound.hyphenation.TernaryTree
-
Each node stores a character (splitchar) which is part of some key(s).
- TypeAsPayloadTokenFilter - Class in org.apache.lucene.analysis.payloads
-
Makes the Token.type() a payload.
- TypeAsPayloadTokenFilter(TokenStream) - Constructor for class org.apache.lucene.analysis.payloads.TypeAsPayloadTokenFilter
-