Description
Elasticsearch version:
5.2.2 (I reproduced this issue on the 5.2.2 docker image provided by elastic.co)
Plugins installed: [x-pack:5.2.2]
(is installed by default in the docker image)
JVM version:
"version": "1.8.0_92-internal",
"vm_name": "OpenJDK 64-Bit Server VM",
"vm_vendor": "Oracle Corporation",
"vm_version": "25.92-b14"
OS version:
The OS of the docker image is Linux; it runs on Docker for Mac 17.03.0-ce-mac2 (15657) on macOS Sierra.
Description of the problem including expected versus actual behavior:
Running the following request on a very small index (see reproduction steps) is consistently very slow:
GET season/season/_search
{
  "query" : {
    "bool" : {
      "must" : [
        {
          "match" : {
            "name.classic" : {
              "query" : "RENAULT Talisman Talisman Estate 1.6 dCi 130 225/55 R17 101W winter",
              "operator" : "OR",
              "fuzziness" : "AUTO",
              "prefix_length" : 0,
              "max_expansions" : 50,
              "fuzzy_transpositions" : true,
              "lenient" : false,
              "zero_terms_query" : "NONE",
              "boost" : 1.0
            }
          }
        }
      ],
      "disable_coord" : false,
      "adjust_pure_negative" : true,
      "boost" : 1.0
    }
  }
}
The previous query takes over 3 seconds to return, for example:
"took": 4360,
With fuzziness disabled, the query runs orders of magnitude faster:
"took": 87,
With shingling disabled in the query analysis, it also runs orders of magnitude faster:
"took": 11,
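To make the role of "fuzziness": "AUTO" concrete, here is a minimal Python sketch of how AUTO picks the maximum edit distance from the term length, assuming the documented default thresholds (0 to 2 characters: exact match, 3 to 5: one edit, more than 5: two edits). The helper name auto_fuzziness is hypothetical and not part of any Elasticsearch client; it only illustrates that long shingled terms always get the largest edit distance.

```python
def auto_fuzziness(term: str) -> int:
    """Maximum edit distance for `term` under fuzziness AUTO.

    Assumes the default length thresholds (3 and 6); hypothetical helper,
    not an Elasticsearch API.
    """
    length = len(term)
    if length < 3:
        return 0  # very short terms must match exactly
    if length < 6:
        return 1  # 3 to 5 characters: one edit allowed
    return 2      # 6 or more characters: two edits allowed


# A short unigram, a longer unigram, and a three-word shingle from the query.
for term in ["r17", "winter", "renault talisman talisman"]:
    print(repr(term), "->", auto_fuzziness(term))
```

Under these assumptions, every two- or three-word shingle produced from the query is longer than five characters, so each one is searched with an edit distance of 2.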
Steps to reproduce:
DELETE season
PUT season
{
  "settings": {
    "index": {
      "analysis": {
        "filter": {
          "search_shingler": {
            "max_shingle_size": "3",
            "min_shingle_size": "2",
            "token_separator": " ",
            "output_unigrams": "true",
            "filler_token": "_",
            "output_unigrams_if_no_shingles": "false",
            "type": "shingle"
          }
        },
        "analyzer": {
          "seasons-searchAnalyzer": {
            "filter": [
              "asciifolding",
              "lowercase",
              "search_shingler"
            ],
            "type": "custom",
            "tokenizer": "whitespace"
          },
          "seasons-indexAnalyzer": {
            "filter": [
              "asciifolding",
              "lowercase"
            ],
            "type": "custom",
            "tokenizer": "keyword"
          }
        }
      }
    }
  }
}
PUT season/_mapping/season
{
  "properties": {
    "name": {
      "type": "text",
      "fields": {
        "classic": {
          "type": "text",
          "analyzer": "seasons-indexAnalyzer",
          "search_analyzer": "seasons-searchAnalyzer"
        },
        "raw": {
          "type": "keyword"
        }
      }
    },
    "value": {
      "type": "keyword"
    }
  }
}
POST season/season
{ "name": "winter", "value": "Winter"}
POST season/season
{ "name": "winteur", "value": "Winter"}
POST season/season
{ "name": "summer", "value": "Summer"}
POST season/season
{ "name": "all seasons", "value": "AllSeasons"}
POST season/season
{ "name": "nordic", "value": "Nordic" }
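To give a sense of how many terms the search analyzer hands to the fuzzy match, the following Python sketch approximates seasons-searchAnalyzer from the settings above: whitespace tokenization, lowercasing, then shingles of 2 to 3 words with unigrams kept. It is an illustration only, not Lucene's implementation: the function name search_tokens is hypothetical, asciifolding is skipped because the query is plain ASCII, and the tokens are not emitted in the order the real shingle filter would use.

```python
def search_tokens(text, min_size=2, max_size=3, separator=" "):
    """Approximate the tokens produced by the search analyzer.

    Hypothetical helper: whitespace tokenizer + lowercase + shingles,
    with unigrams kept (output_unigrams is true in the settings).
    """
    words = text.lower().split()
    tokens = list(words)  # the unigrams
    for size in range(min_size, max_size + 1):
        for start in range(len(words) - size + 1):
            tokens.append(separator.join(words[start:start + size]))
    return tokens


QUERY = "RENAULT Talisman Talisman Estate 1.6 dCi 130 225/55 R17 101W winter"
tokens = search_tokens(QUERY)

# 11 unigrams + 10 two-word shingles + 9 three-word shingles
print(len(tokens))       # 30
print(len(set(tokens)))  # 29, because "talisman" appears twice as a unigram
```

Under this approximation the single match clause expands to about 30 terms, each of which is searched fuzzily when "fuzziness" is set, whereas without shingling only the 11 unigrams remain.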
Provide logs (if relevant):
I couldn't find anything relevant in the logs (I only see entries for deleting the index, creating the index, adding the mapping, etc.).
Comparison with previous versions:
I ran the exact same scenario on our previous configuration (ES 1.7): the query with both fuzziness and shingling ran in about 150 ms. We never tried intermediate versions, as our upgrade path was blocked by the removal of FLT until #9103 got merged.