Review TokenFilterFactory.getSynonymFilter() implementations

#33702 introduced a new method on the `TokenFilterFactory` interface that lets a filter return a specialized version of itself for synonym parsing. Currently only the `multiplexer` filter implements it, so that the original token is returned. We should review all the filter factories shipped with Elasticsearch to see whether any others need changing.
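For anyone reviewing the list, here is a minimal sketch of the pattern, not the real Elasticsearch code. `SketchTokenFilterFactory`, `LowercaseSketch` and `MultiplexerSketch` are hypothetical stand-ins that work on plain strings instead of Lucene `TokenStream`s; the only thing taken from #33702 is the idea that `getSynonymFilter()` returns the filter to use while synonym rules are being parsed, and that the multiplexer returns the original token there.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class SynonymFilterSketch {

    /** Hypothetical, string-based stand-in for TokenFilterFactory. */
    interface SketchTokenFilterFactory {
        /** Tokens emitted for a single input token. */
        List<String> filter(String token);

        /** Filter to apply while parsing synonym rules; assumed to default to the filter itself. */
        default SketchTokenFilterFactory getSynonymFilter() {
            return this;
        }
    }

    /** Emits exactly one token per input, so the default is good enough. */
    static final class LowercaseSketch implements SketchTokenFilterFactory {
        @Override
        public List<String> filter(String token) {
            return List.of(token.toLowerCase(Locale.ROOT));
        }
    }

    /** Multiplexer-like filter: emits the output of every wrapped filter. */
    static final class MultiplexerSketch implements SketchTokenFilterFactory {
        private final List<SketchTokenFilterFactory> branches;

        MultiplexerSketch(List<SketchTokenFilterFactory> branches) {
            this.branches = List.copyOf(branches);
        }

        @Override
        public List<String> filter(String token) {
            List<String> out = new ArrayList<>();
            for (SketchTokenFilterFactory branch : branches) {
                out.addAll(branch.filter(token));
            }
            return out;
        }

        /** For synonym parsing, pass the original token through untouched. */
        @Override
        public SketchTokenFilterFactory getSynonymFilter() {
            return token -> List.of(token);
        }
    }
}
```

The point of the override is that a synonym rule gets parsed against one token per position rather than against every branch the multiplexer would emit at index time.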

- [ ] AsciiFoldingFilter -> should return only the folded token
- [ ] CJKBigramFilter -> ignore if emitUnigrams = true
- [ ] CommonGramsTokenFilter -> ignore
- [ ] CompoundWordTokenFilterBase -> ??
- [ ] EdgeNGramTokenFilter -> ??
- [ ] Fingerprint & MinHash -> shouldn't be used with synonyms anyway...
- [ ] Keyword repeat -> ignore
- [ ] NGramTokenFilter -> ??
- [ ] Shingle -> shouldn't output unigrams
- [ ] SynonymGraph & Synonym -> should we allow multiple synonym chains?
- [ ] Phonetic -> ignore
- [ ] WordDelimiterGraph & WordDelimiter -> ??


Review TokenFilterFactory.getSynonymFilter() implementations #34298
