Labels
:Delivery/Packaging, >enhancement, Team:Delivery, discuss
Description
As discussed in #81208, using Cloudflare's zlib implementation improves performance in the following cases:
- Compression of stored fields with `index.codec: best_compression`, which we use by default for observability and security integrations' data.
- Request / response compression.
- (untested, but presumed) Transport compression when using `transport.compression_scheme: deflate` (not the default for local transport, but the default for remote clusters when compression is enabled, as of "Remote compression scheme default to deflate" #76580).
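All three cases above come down to DEFLATE throughput in whichever zlib the process loads. As a rough way to compare zlib builds outside of Elasticsearch, here is a minimal sketch using Python's standard `zlib` module. The payload, the compression level of 6, and the function name `measure_deflate` are assumptions for illustration; this measures the library only, not Elasticsearch itself.

```python
import time
import zlib


def measure_deflate(payload: bytes, level: int = 6, rounds: int = 5) -> dict:
    """Compress `payload` `rounds` times; report ratio and best-case throughput."""
    best = float("inf")
    compressed = b""
    for _ in range(rounds):
        start = time.perf_counter()
        compressed = zlib.compress(payload, level)
        best = min(best, time.perf_counter() - start)
    # Round-trip check so a broken library build is caught early.
    assert zlib.decompress(compressed) == payload
    best = max(best, 1e-9)  # guard against a zero reading from the timer
    return {
        # Version string reported by the zlib actually loaded at runtime.
        "zlib_version": zlib.ZLIB_RUNTIME_VERSION,
        "ratio": len(compressed) / len(payload),
        "mb_per_s": len(payload) / best / 1e6,
    }


if __name__ == "__main__":
    # Synthetic, log-like payload (an assumption); real stored fields will differ.
    line = b'{"@timestamp":"2022-01-01T00:00:00Z","level":"INFO","message":"request ok"}\n'
    print(measure_deflate(line * 20000))
```

Running it once against the system zlib and once with an alternate build on the library search path should give a first-order comparison, provided the Python interpreter links zlib dynamically. It is no substitute for the nightly benchmarks.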
Whereas #81208 scoped the initial approach to our Docker image, where we control both the distribution and the environment, this issue asks if we can go one step further and bundle it in our tarballs. Some general notes about why this is worth considering:
- When "Use Cloudflare's zlib in Docker images" #81245 was implemented, it moved the concern of linking the alternate zlib into the `elasticsearch` script, removing the need to control environment variables externally.
- #81245 also added a build step to the Dockerfile for the cloudflare-zlib library. If we could instead host or fetch the desired library, we could remove that build step.
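To make the "link the alternate zlib from the launcher" idea concrete, here is a hedged sketch of the logic a tarball launcher might apply. It is not the code from #81245: the use of `LD_LIBRARY_PATH`, the library file name `libz.so.1`, the directory in the usage example, and the function name `with_bundled_zlib` are all assumptions for illustration.

```python
import os
from pathlib import Path


def with_bundled_zlib(env: dict, lib_dir: str, lib_name: str = "libz.so.1") -> dict:
    """Return a copy of `env` whose LD_LIBRARY_PATH prefers `lib_dir`.

    If the bundled library file is missing, the environment is returned
    unchanged, so the process keeps using the system zlib.
    """
    new_env = dict(env)
    if not (Path(lib_dir) / lib_name).is_file():
        return new_env
    current = new_env.get("LD_LIBRARY_PATH", "")
    # Prepend so the bundled library is found before the system one.
    new_env["LD_LIBRARY_PATH"] = f"{lib_dir}:{current}" if current else lib_dir
    return new_env


if __name__ == "__main__":
    # Hypothetical install location; adjust to wherever the library is shipped.
    env = with_bundled_zlib(dict(os.environ), "/usr/share/elasticsearch/lib/zlib")
    print(env.get("LD_LIBRARY_PATH", "<unset, using system zlib>"))
```

Returning the environment unchanged when the file is absent keeps the system zlib as the fallback, which matters on platforms where no optimized build is shipped.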
If instead we packaged an improved zlib in the Elasticsearch distribution, we could reap the following advantages:
- Potential for improved performance for more users/use-cases
- New-normal performance for `best_compression`, which could potentially be considered for more uses if it comes with a lower expected overhead.
- The change would be applicable to our nightly benchmarking environment, which currently runs the default Linux distributions (this may change to a Docker-based environment at some point, but planning has not begun on that).
- We would not need to re-build the library every time we build the Docker image (or maybe just every time we update an earlier layer? I don't know much about what our build environment has cached typically).
Disadvantages:
- The library isn't hosted anywhere; we would have to build and host it.
- In order to mitigate security vulnerabilities, we would need to keep it up-to-date, either by monitoring versions as we currently do for JVM releases (if we choose a library with binary releases) or by periodically/automatically building new versions ourselves to bundle with our distributions.
- Likewise, CVEs which affect the zlib implementation we choose are inextricably linked to our releases, meaning that instead of patching the system independently we would need to produce patched releases for any supported version of Elasticsearch (alternatively, we could provide mitigation functionality or notices so that users fall back to the system zlib).
- amd64/aarch64 support is implied to not be optimized (see: this blog)
We recently brought up this issue in a Fix-It meeting, and open questions included:
- What implementation should we prefer? There are a multitude of choices. A nice option is `zlib-ng`, which is actively maintained and spells out a few of its own advantages versus Cloudflare nicely here: "2.0.0 Benchmark comparisons" zlib-ng/zlib-ng#871 (comment).
- Do we need it in the tarball if it's already in the Docker image? Currently the Cloudflare zlib will be activated for Cloud customers, as well as users of ECE, ECK, and the Docker image. We also have a broad install base of self-managed Elasticsearch, and the default tarball distribution is the one that currently gets targeted in our benchmarks (such as the nightly/release results published at https://elasticsearch-benchmarks.elastic.co).