Closed
Labels: :Search Relevance/Highlighting, >bug, Team:Search Relevance
Description
Elasticsearch version:
elasticsearch-6.7.1
JVM version:
java version "1.8.0_201"
OS version:
17.7.0 Darwin Kernel Version 17.7.0
Expected / 5.x behaviour:
As per the documentation:
If the number of fragments is set to 0, no fragments are returned. Instead, the entire field contents are highlighted and returned.
This was the behaviour on 5.x: the highlight entry for the field contained the full field contents (up to no_match_size).
On 6.x, only part of the field is returned: the first sentence.
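One diagnostic that might help narrow this down: the default highlighter changed from plain to unified in 6.0, so pinning the highlighter type explicitly could show whether the difference comes from that switch. The sketch below is an assumption on my part and has not been run against 6.7.1. ES_URL is a hypothetical variable, and the pre/post tags from the reproduction query are left out for brevity.

```shell
# Hypothetical base URL; adjust to the node under test.
ES_URL="${ES_URL:-http://localhost:9200}"

# Same highlight settings as the reproduction, plus an explicit "type".
# Assumption: forcing the plain highlighter isolates the unified highlighter
# as the source of the changed behaviour. Not verified in this report.
body='{
  "query": { "match_all": {} },
  "highlight": {
    "fields": {
      "someDescription": {
        "type": "plain",
        "no_match_size": 500,
        "number_of_fragments": 0
      }
    }
  }
}'

# Print the request body so it can be inspected before sending.
printf '%s\n' "$body"

# To send it against a running node, uncomment:
# curl -H 'Content-Type: application/json' -XGET "$ES_URL/test-index/services/_search?pretty" -d "$body"
```

If the plain highlighter returns the whole field on 6.x, that would point at the unified highlighter's handling of no_match_size when number_of_fragments is 0.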
Steps to reproduce:
The search request below returns different responses on ES5 and ES6:
curl -H 'Content-Type: application/json' -XPUT 'http://localhost:9200/test-index?pretty' -d '{"mappings": {
"services": {
"dynamic": "strict",
"properties": {
"someDescription": {
"type": "text"
}
}
}
}
}'
curl -H 'Content-Type: application/json' -XPUT 'http://localhost:9200/test-index/services/1?pretty' -d '{
"someDescription": "Here is a description that is quite long and describes the problem that we have had with the new elasticsearch highlighting engine. The length of this field is 500 characters and the problem. Is that the whole description is not returned as a single fragment. Rather the description is split into its constituent sentances. And the first sentence is returned. Those thereafter are not. This was not the behaviour in elasticsearch 5 where the whole field would be returned but has been introduced in es6."
}'
curl -H 'Content-Type: application/json' -XGET 'http://localhost:9200/test-index/services/_search?pretty&search_type=dfs_query_then_fetch' -d '{
"highlight": {
"encoder": "html",
"fields": {
"someDescription": {
"no_match_size": 500,
"number_of_fragments": 0
}
},
"post_tags": [
"</mark>"
],
"pre_tags": [
"<mark class=\u0027search-result-highlighted-text\u0027>"
]
},
"query": {
"match_all": {}
},
"size": 100
}'
curl -XDELETE 'http://localhost:9200/test-*?pretty' -d ''
The response on ES5:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 1.0,
"hits" : [
{
"_index" : "test-index",
"_type" : "services",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"someDescription" : "Here is a description that is quite long and describes the problem that we have had with the new elasticsearch highlighting engine. The length of this field is 500 characters and the problem. Is that the whole description is not returned as a single fragment. Rather the description is split into its constituent sentances. And the first sentence is returned. Those thereafter are not. This was not the behaviour in elasticsearch 5 where the whole field would be returned but has been introduced in es6."
},
"highlight" : {
"someDescription" : [
"Here is a description that is quite long and describes the problem that we have had with the new elasticsearch highlighting engine. The length of this field is 500 characters and the problem. Is that the whole description is not returned as a single fragment. Rather the description is split into its constituent sentances. And the first sentence is returned. Those thereafter are not. This was not the behaviour in elasticsearch 5 where the whole field would be returned but has been introduced in"
]
}
}
]
}
}
The response on ES6:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 1.0,
"hits" : [
{
"_index" : "test-index",
"_type" : "services",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"someDescription" : "Here is a description that is quite long and describes the problem that we have had with the new elasticsearch highlighting engine. The length of this field is 500 characters and the problem. Is that the whole description is not returned as a single fragment. Rather the description is split into its constituent sentances. And the first sentence is returned. Those thereafter are not. This was not the behaviour in elasticsearch 5 where the whole field would be returned but has been introduced in es6."
},
"highlight" : {
"someDescription" : [
"Here is a description that is quite long and describes the problem that we have had with the new elasticsearch highlighting engine."
]
}
}
]
}
}
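To make the difference between the two responses easier to see, the fragments can be compared by length. This is only a rough sketch: the two fragments are derived from the source string with shell parameter expansion rather than fetched from a node, on the assumption that they match the highlight values shown above (the ES5 fragment is the source minus the trailing " es6.", the ES6 fragment is the first sentence).

```shell
# Source field value from the reproduction, copied verbatim (typo included).
full='Here is a description that is quite long and describes the problem that we have had with the new elasticsearch highlighting engine. The length of this field is 500 characters and the problem. Is that the whole description is not returned as a single fragment. Rather the description is split into its constituent sentances. And the first sentence is returned. Those thereafter are not. This was not the behaviour in elasticsearch 5 where the whole field would be returned but has been introduced in es6.'

# ES5 highlight: the source with the final " es6." cut off.
es5="${full% es6.}"

# ES6 highlight: everything before the second sentence, i.e. the first sentence only.
es6="${full%% The length*}"

no_match_size=500

printf 'no_match_size: %d\n' "$no_match_size"
printf 'source:        %d chars\n' "${#full}"
printf 'ES5 fragment:  %d chars\n' "${#es5}"
printf 'ES6 fragment:  %d chars\n' "${#es6}"
```

The ES5 fragment is only slightly shorter than the source, which is consistent with truncation near no_match_size, while the ES6 fragment stops at the first sentence boundary.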