Unified highlighter: include additional context outside of highlighted sentence to reach target fragment_size #28089

@marshalium

Describe the feature

Currently, the unified highlighter can only provide context by including the sentence that contains the highlighted word. This sometimes produces a very short highlight. For example, given a field containing this text:

Some leading context. A short sentence. Some more content. And even more context around that sentence.

Running a query for the term sentence using the unified highlighter with fragment_size set to 300 results in a highlight that includes the word we're looking for but provides little context and is nowhere near the requested target size:

A short <em>sentence</em>.

In contrast, running the same query with the plain highlighter results in a highlight with much more useful context (and, in this case, another highlighted word!):

Some leading context. A short <em>sentence</em>. Some more content. And even more context around that <em>sentence</em>.
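For reference, the two requests being compared could be built roughly like this. This is a minimal sketch: the field name body and the helper highlight_request are assumptions made for illustration, while type and fragment_size are the standard highlight options referred to above.

```python
import json


def highlight_request(term, field, highlighter_type, fragment_size):
    """Build a search body that highlights `term` in `field`.

    `field` is a hypothetical field name; the issue does not say which
    field the sample text was indexed into.
    """
    return {
        "query": {"match": {field: term}},
        "highlight": {
            "fields": {
                field: {
                    "type": highlighter_type,
                    "fragment_size": fragment_size,
                }
            }
        },
    }


# Same query and target size, differing only in the highlighter type.
unified_body = highlight_request("sentence", "body", "unified", 300)
plain_body = highlight_request("sentence", "body", "plain", 300)

print(json.dumps(unified_body, indent=2))
```

Only the type value differs between the two bodies, so any difference in the returned fragments comes from the highlighter itself.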

The unified highlighter should include as much context as possible without going over the target fragment size. This will result in more consistently sized highlights (which is nice for visual consistency) and will provide more useful context in cases where the highlight occurs in a short sentence.
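To make the requested behavior concrete, here is a rough sketch of one way the expansion could work. It is not how the unified highlighter is implemented: the regex sentence splitter stands in for a real sentence break iterator, and the names split_sentences and expand_fragment are hypothetical. It starts from the sentence containing the match and greedily adds neighboring sentences while the fragment stays within fragment_size.

```python
import re


def split_sentences(text):
    # Naive splitter (an assumption): break after '.', '!' or '?' followed
    # by whitespace. A real highlighter would use a proper break iterator.
    return re.split(r"(?<=[.!?])\s+", text.strip())


def expand_fragment(text, term, fragment_size):
    """Return a highlighted fragment grown around the first matching sentence.

    Neighboring sentences are added (following one first, then preceding one)
    as long as the fragment length stays at or below fragment_size. The
    matching sentence is always returned, even if it alone exceeds the size.
    """
    sentences = split_sentences(text)
    pattern = re.compile(r"\b%s\b" % re.escape(term), re.IGNORECASE)

    hit = next((i for i, s in enumerate(sentences) if pattern.search(s)), None)
    if hit is None:
        return None

    start = end = hit  # inclusive sentence range of the fragment
    length = len(sentences[hit])
    grew = True
    while grew:
        grew = False
        # +1 accounts for the space that joins adjacent sentences.
        if end + 1 < len(sentences) and length + 1 + len(sentences[end + 1]) <= fragment_size:
            end += 1
            length += 1 + len(sentences[end])
            grew = True
        if start > 0 and length + 1 + len(sentences[start - 1]) <= fragment_size:
            start -= 1
            length += 1 + len(sentences[start])
            grew = True

    fragment = " ".join(sentences[start:end + 1])
    # Wrap every occurrence of the term inside the final fragment.
    return pattern.sub(lambda m: "<em>%s</em>" % m.group(0), fragment)


text = (
    "Some leading context. A short sentence. Some more content. "
    "And even more context around that sentence."
)
print(expand_fragment(text, "sentence", 300))
print(expand_fragment(text, "sentence", 40))
```

With a target of 300 the whole sample text fits, so the result matches the plain highlighter output shown earlier. With a target of 40 only the following sentence is added, because adding the preceding one would exceed the limit.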

cc: @colings86
