Pattern ". - " breaks highlighted result returned by search #118251

@Yolo-plop

Description

Elasticsearch Version

8.16.1

Installed Plugins

No response

Java Version

bundled

OS Version

Debian 11.11

Problem Description

I have found a specific pattern that breaks the highlighted fields returned by my search engine. After some research, I expected that upgrading from Elasticsearch 7 to 8 would fix it because of #96068, but it does not.

The pattern is {something1}. - {something2} (for example, My item R. - 1000x - 4 gr). When searching for a term that appears after the hyphen (1000x), everything before the dot is missing from the highlight. The result I get is ". - 1000x - 4 gr".

Steps to Reproduce

Creating the index:

curl -X PUT "http://localhost:9200/test-highlight" -H 'Content-Type: application/json' -d'
{
  "mappings": {
    "properties": {
      "label": {
        "type": "text"
      }
    }
  }
}
'

Inserting document:

curl -X POST "http://localhost:9200/test-highlight/_doc" -H 'Content-Type: application/json' -d'
{
  "label": "My item R. - 1000x - 4 gr"
}
'

Searching document with highlights:

curl -X POST "http://localhost:9200/test-highlight/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "match": {
      "label": "1000x"
    }
  },
  "highlight": {
    "fields": {
      "label": {}
    }
  }
}
'

Result is:

{
  "took": 17,
  "timed_out": false,
  "_shards": { "total": 1, "successful": 1, "skipped": 0, "failed": 0 },
  "hits": {
    "total": { "value": 1, "relation": "eq" },
    "max_score": 0.71566814,
    "hits": [
      {
        "_index": "test-highlight",
        "_id": "DOb1qpMBZyt-o2ZAR9r9",
        "_score": 0.71566814,
        "_source": { "label": "My item R. - 1000x - 4 gr" },
        "highlight": { "label": [". - <em>1000x</em> - 4 gr"] }
      }
    ]
  }
}
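
A possible way to narrow this down, as a sketch that has not been run against this index: by default the unified highlighter splits the field into fragments using a sentence boundary scanner, so the "R. " may be treated as the end of a sentence (that cause is an assumption, not a confirmed diagnosis). Setting number_of_fragments to 0 asks the highlighter to return the whole field content instead of fragments, which would show whether the fragment splitting is what drops the text. The variable names ES_URL and BODY are only illustrative.

```shell
# Diagnostic variant of the search above; assumes the same local node and index.
ES_URL="${ES_URL:-http://localhost:9200}"   # assumption: node used in the repro steps

# number_of_fragments: 0 returns the entire field content with highlights
# instead of splitting it into fragments.
BODY='{
  "query": { "match": { "label": "1000x" } },
  "highlight": {
    "fields": {
      "label": { "number_of_fragments": 0 }
    }
  }
}'

curl -s --max-time 5 -X POST "$ES_URL/test-highlight/_search" \
  -H 'Content-Type: application/json' -d "$BODY"
```

If the full label comes back highlighted with this request, the truncation is likely tied to how fragments are cut rather than to the match itself.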

Logs (if relevant)

No response

Metadata

Labels

:Search Relevance/Highlighting (How a query matched a document)
>bug
Team:Search Relevance (Meta label for the Search Relevance team in Elasticsearch)
needs:risk (Requires assignment of a risk label: low, medium, blocker)
