
Conversation

@cbuescher (Member):

The whitespace tokenizer splits tokens longer than 255 characters into multiple tokens,
which can lead to confusing search matches like the one observed in #26601. This adds
a note to the documentation to make this clearer.

Closes #26641
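
For anyone running into this, a quick way to see the splitting described above is to push a long token through the `_analyze` API with the whitespace tokenizer. This is only an illustrative sketch (it assumes a local cluster on localhost:9200, which is not part of this PR):

```sh
# Build one 300-character "word" containing no whitespace.
TOKEN=$(printf 'a%.0s' {1..300})

# Analyze it with the whitespace tokenizer. Because the tokenizer's
# maximum token length defaults to 255, the response contains two
# tokens: the first 255 characters and the remaining 45 characters.
curl -s -H 'Content-Type: application/json' \
  -X POST 'localhost:9200/_analyze' \
  -d "{\"tokenizer\": \"whitespace\", \"text\": \"$TOKEN\"}"
```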

@cbuescher changed the title from "[Docs] Add not about maximum token length for whitespace tokenizer" to "[Docs] Add note about maximum token length for whitespace tokenizer" Sep 20, 2017
@colings86 added v5.6.3 and removed v5.6.2 labels Sep 21, 2017
@cbuescher changed the base branch from master to 6.0 September 25, 2017 21:45
@cbuescher force-pushed the docs-addNote-WhitespaceTokenizer branch from 4a2d047 to 5d1627c September 25, 2017 21:49
@cbuescher (Member, Author):

This clarifies the docs for 5.6.x and 6.0; starting with 6.1, overriding the "max_token_length" parameter will be supported via #26643.
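
Once that override is available, configuring it on a custom whitespace tokenizer could look roughly like the sketch below (the index, tokenizer, and analyzer names are placeholders, and the setting applies only from 6.1 onwards per #26643):

```sh
# Hypothetical index using a whitespace tokenizer with a custom
# max_token_length; all names here are examples only.
curl -s -H 'Content-Type: application/json' \
  -X PUT 'localhost:9200/my_index' \
  -d '{
    "settings": {
      "analysis": {
        "tokenizer": {
          "my_whitespace": { "type": "whitespace", "max_token_length": 512 }
        },
        "analyzer": {
          "my_analyzer": { "type": "custom", "tokenizer": "my_whitespace" }
        }
      }
    }
  }'
```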

@javanna added v5.6.4 and removed v5.6.3 labels Oct 6, 2017
@cbuescher merged commit f098553 into elastic:6.0 Oct 7, 2017
@javanna added v5.6.3 and removed v5.6.4 labels Oct 9, 2017
@lcawl added v6.0.0-rc2 and removed v6.0.0 labels Oct 30, 2017

Labels

>docs (General docs changes), :Search Relevance/Analysis (How text is split into tokens), v5.6.3, v6.0.0-rc2
