Highlighter breaks phrases #29561

@jacool

Elasticsearch version (bin/elasticsearch --version):
6.2.3

Plugins installed: []

JVM version (java -version):
openjdk version "1.8.0_161"

OS version (uname -a if on a Unix-like system):
Linux 5137c3a21142 4.9.87-linuxkit-aufs

Description of the problem including expected versus actual behavior:
The unified highlighter breaks a searched phrase into separate highlights, one per term, and in this case even splits the phrase across two fragments. This makes the highlighter results hard for a user to read. In the example below, the expected highlight would look like this:
shuffled off <em>this mortal coil</em>, must give us
Note that while the unified highlighter has this issue, the FVH highlighter behaves as expected.

Steps to reproduce:

PUT /test
{
  "mappings": {
    "t": {
      "properties": {
        "message": {
          "type": "text",
          "term_vector": "with_positions_offsets"
        }
      }
    }
  }
}

POST /test/t/1
{
  "message": "What dreams may come, when we have shuffled off this mortal coil, must give us pause."
}

GET /test/_search
{
  "version": true,
  "query": {
    "match_phrase": {
      "message": {
        "query": "this mortal coil"
      }
    }
  },
  "highlight": {
    "fields": {
      "message": {
        "type": "unified",
        "fragment_size": 40
      }
    }
  }
}

This results in the following highlighting, which is practically unusable:

"highlight": {
  "message": [
     "dreams may come, when we have shuffled off <em>this</em>",
     "<em>mortal</em> <em>coil</em>, must give us pause."
   ]
}
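
For comparison, the FVH behavior mentioned above can be checked against the same index by changing only the highlighter type. This is a sketch of that request, not a verified run: it assumes the `test` index and document from the steps above, and relies on the `term_vector: with_positions_offsets` setting already present in the mapping, which the FVH highlighter requires. No response is shown here because none was captured in this report.

```console
GET /test/_search
{
  "query": {
    "match_phrase": {
      "message": {
        "query": "this mortal coil"
      }
    }
  },
  "highlight": {
    "fields": {
      "message": {
        "type": "fvh",
        "fragment_size": 40
      }
    }
  }
}
```

Keeping `fragment_size` at 40 in both requests means any difference in the highlighted fragments comes from the highlighter type alone.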
