Aggregations bug: Significant_text fails on arrays of text #25029

@markharwood

Description

Given a document with a field that holds an array of free-text values, the significant_text aggregation treats each array element as a separate document when counting the document frequencies of terms. This can lead to the following error from the significance heuristic:

      "type": "illegal_argument_exception",
      "reason": "subsetFreq > subsetSize, in JLHScore"

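The cause is a bug in where the set of previously seen tokens is allocated and destroyed. The set presumably needs to live for the whole document, so that a term repeated across array elements is counted once per document rather than once per element.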
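To illustrate how the scope of the seen-token set can push a term's frequency above the number of documents, here is a minimal Python sketch. It is not the Elasticsearch code: the function names, the whitespace tokenizer, and the sample document are all hypothetical, and the "buggy" variant is only an assumption about what the misplaced allocation looks like.

```python
from collections import Counter


def doc_freqs_per_element(docs):
    """Assumed buggy variant: the seen-set is recreated for every array element."""
    freqs = Counter()
    for values in docs:  # each document is a list of free-text values
        for text in values:
            seen = set()  # scoped to one array element only
            for token in text.lower().split():  # stand-in for a real analyzer
                if token not in seen:
                    seen.add(token)
                    freqs[token] += 1
    return freqs


def doc_freqs_per_document(docs):
    """Intended behaviour: one seen-set per document, shared by all its values."""
    freqs = Counter()
    for values in docs:
        seen = set()  # scoped to the whole document
        for text in values:
            for token in text.lower().split():
                if token not in seen:
                    seen.add(token)
                    freqs[token] += 1
    return freqs


# One hypothetical document whose field holds three text values.
docs = [["quick brown fox", "quick red fox", "quick fox"]]
subset_size = len(docs)

per_element = doc_freqs_per_element(docs)
per_document = doc_freqs_per_document(docs)

print(per_element["fox"], per_document["fox"], subset_size)
```

With the per-element set, the count for "fox" exceeds the number of documents, which is the kind of condition (subsetFreq > subsetSize) that the heuristic rejects. With the per-document set, the count can never exceed the document count.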
