Fork tdigest library #95903

@martijnvg

We plan to fork the tdigest library.

There are two main reasons behind this choice:

  1. We would like to control semantic versioning and backward compatibility according to our own definition. Right now, for instance, TDigest does not match our usage of semantic versioning when the library code changes, which makes upgrading quite challenging because it exposes us to backward compatibility issues.
  2. We would like to change the library to use specific Elasticsearch libraries/tools/frameworks such as BigArrays. Right now, when running some aggregations (percentiles, boxplot, ...), we experience OOMs due to large memory usage. Using BigArrays, for instance, would allow us to deal with OOMs using circuit breakers.
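
To illustrate the second point, here is a minimal sketch of the general idea: account for memory against a limit before allocating, so that an oversized request fails fast instead of causing an OOM. `SimpleBreaker` and `TrackedDoubleArray` are hypothetical names made up for this example; they are not the Elasticsearch `BigArrays` or circuit breaker APIs, and the real integration may look quite different.

```java
import java.util.Arrays;

// Hypothetical stand-in for a circuit breaker: tracks reserved bytes against a fixed limit.
final class SimpleBreaker {
    private final long limitBytes;
    private long usedBytes;

    SimpleBreaker(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    // Reserve (positive delta) or release (negative delta) bytes.
    // Throws instead of letting the caller allocate past the limit.
    void addBytes(long delta) {
        long next = usedBytes + delta;
        if (delta > 0 && next > limitBytes) {
            throw new IllegalStateException(
                "would use " + next + " bytes, limit is " + limitBytes);
        }
        usedBytes = next;
    }

    long usedBytes() {
        return usedBytes;
    }
}

// Hypothetical growable double array that reserves memory before allocating it.
final class TrackedDoubleArray implements AutoCloseable {
    private final SimpleBreaker breaker;
    private double[] data;
    private int size;

    TrackedDoubleArray(SimpleBreaker breaker, int initialCapacity) {
        this.breaker = breaker;
        breaker.addBytes((long) initialCapacity * Double.BYTES);
        this.data = new double[initialCapacity];
    }

    void add(double value) {
        if (size == data.length) {
            int newCapacity = Math.max(1, data.length * 2);
            // Reserve the extra bytes first; this throws before any allocation happens.
            // Simplification: the temporary copy made while growing is not accounted for.
            breaker.addBytes((long) (newCapacity - data.length) * Double.BYTES);
            data = Arrays.copyOf(data, newCapacity);
        }
        data[size++] = value;
    }

    int size() {
        return size;
    }

    // Release everything this array reserved.
    @Override
    public void close() {
        breaker.addBytes(-(long) data.length * Double.BYTES);
        data = new double[0];
        size = 0;
    }
}
```

With this shape, a centroid buffer that grows too large would surface as an exception that the caller can turn into a failed request, rather than taking down the node.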

The immediate goal of this issue is to fork the library and then, at a later stage, enhance the forked library to make use of the BigArrays infrastructure.

Currently, t-digest version 3.2 is used. The latest version is 3.3. We have been locked to version 3.2 because of at least one breaking change (how p50 is computed). The plan is to fork from the latest commit and change the forked library so that the results it produces are similar to those of version 3.2.
