Enrich policy execution can be a heavy burden on the master #70436

@DaveCTurner

Today, when an enrich policy is executed to build a new copy of the enrich index, we use the elected master to coordinate the underlying reindex task, which operates in batches of 10,000 documents by default. A user reported that a particular enrich policy execution would reliably cause their master to fail with an OutOfMemoryError. The policy in question was of geo_match type and the documents, containing geoshapes, could be quite large. Their master had a 1GB heap, which is appropriate for their small and well-run cluster, but it appears it simply could not hold 10,000 geoshapes in memory at once.

As a workaround they reduced the (undocumented?) enrich.fetch_size to 5000, which was enough to avoid the OOM, but I'm still concerned about the strain this puts on the master.
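To make the arithmetic behind this workaround concrete, here is a rough back-of-the-envelope sketch. The average document size and the in-memory overhead factor are hypothetical values chosen for illustration; neither comes from the user's report, and the function is not part of Elasticsearch.

```python
def estimated_batch_bytes(fetch_size: int, avg_doc_bytes: int,
                          overhead_factor: float = 2.0) -> int:
    """Rough estimate of the heap held by one in-flight reindex batch.

    overhead_factor is an assumed multiplier for the in-memory
    representation of a document relative to its source size.
    """
    return int(fetch_size * avg_doc_bytes * overhead_factor)


HEAP_BYTES = 1024**3          # 1GB master heap, as in the report
AVG_DOC_BYTES = 60 * 1024     # hypothetical average geoshape document size

for fetch_size in (10_000, 5_000):
    needed = estimated_batch_bytes(fetch_size, AVG_DOC_BYTES)
    print(f"fetch_size={fetch_size}: ~{needed / 1024**2:.0f} MiB "
          f"({needed / HEAP_BYTES:.0%} of heap)")
```

Under these assumptions the estimate scales linearly with the batch size, so halving enrich.fetch_size halves the peak memory of a batch. That is consistent with the workaround avoiding the OOM, but it still leaves a large share of a small heap tied up on the master.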

A few ideas for possible improvements:

  • Move the coordination of the reindexing job onto a different node.
  • Strengthen a circuit-breaker to prevent an OOM like this one.
  • Adapt the reindex batch size to the resources available.

I think we should definitely do the first; the other two are harder.
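For the third idea, a minimal sketch of what adapting the batch size to available resources might look like. The function name, the heap_fraction budget, and the way the average document size is obtained are all assumptions for illustration; this is not an existing Elasticsearch API, and a real implementation would need to sample document sizes from the source index.

```python
def adaptive_batch_size(heap_bytes: int, avg_doc_bytes: int,
                        heap_fraction: float = 0.1,
                        max_batch: int = 10_000,
                        min_batch: int = 1) -> int:
    """Pick a batch size so one batch stays within a fraction of the heap.

    heap_fraction is an assumed budget: the share of the coordinating
    node's heap that a single batch is allowed to occupy.
    """
    if avg_doc_bytes <= 0:
        raise ValueError("avg_doc_bytes must be positive")
    budget = heap_bytes * heap_fraction
    # Clamp between min_batch and the current default of 10,000.
    return max(min_batch, min(max_batch, int(budget // avg_doc_bytes)))


one_gb = 1024**3
print(adaptive_batch_size(one_gb, 1024))        # small documents
print(adaptive_batch_size(one_gb, 1024**2))     # ~1 MiB documents
```

With small documents the default batch size is kept, while large documents shrink the batch automatically. Part of what makes this harder than moving the coordination is that the average document size is not known up front and a few outliers can still blow the budget.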

See https://discuss.elastic.co/t/why-does-an-enrich-policy-get-executed-on-the-master-node/263241 for more details.

/cc @consulthys
