Skip to content

Conversation

@joegallo
Copy link
Contributor

@joegallo joegallo commented Sep 16, 2021

In practice, the wrapped innerMap is either empty or a HashMap, so we can get much better performance here by delegating to it for these methods -- as opposed to using the default implementations from AbstractMap (javadoc link) which do a linear scan through via an iterator() of the entrySet().

Totally unscientific (because I'm using it for the first time) JMH results from before:

Benchmark                                           Mode  Cnt     Score    Error  Units
LifecycleExecutionStateBenchmark.fromIndexMetadata  avgt   30  1358.084 ± 34.200  ns/op

and after:

Benchmark                                           Mode  Cnt    Score   Error  Units
LifecycleExecutionStateBenchmark.fromIndexMetadata  avgt   30  332.168 ± 5.678  ns/op

This ends up mattering because LifecycleExecutionState#fromIndexMetadata ends up loading all the keys it's interested in by way of LifecycleExecutionState#fromCustomMetadata and there's a lot of them. The customData map there is a DiffableStringMap, so we're doing a scan through all the keys in fromCustomMetadata invoking get each time, and get itself was doing a further scan through all the keys (well, half the keys, on average) internally, so fromIndexMetadata was accidentally quadratic.

In practice, the wrapped innerMap is either empty or a HashMap, so we
can get much better performance here by delegating to it for these
methods -- as opposed to using the default implementations from
AbstractMap which do a linear scan through via an iterator() of the
entrySet().
@joegallo joegallo added >bug :Data Management/ILM+SLM Index and Snapshot lifecycle management v8.0.0 v7.16.0 labels Sep 16, 2021
@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label Sep 16, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

Copy link
Contributor

@probakowski probakowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @joegallo!

@joegallo joegallo merged commit 6c4c5e2 into elastic:master Sep 17, 2021
@joegallo joegallo deleted the accidentally-quadratic branch September 17, 2021 10:54
wjp719 added a commit to wjp719/elasticsearch that referenced this pull request Sep 18, 2021
* master: (185 commits)
  Implement get and containsKey in terms of the wrapped innerMap (elastic#77965)
  Adjust Lucene version and enable BWC tests (elastic#77933)
  Disable BWC to upgrade to Lucene-8.10-snapshot
  Reenable MlDistributedFailureIT
  [DOCS] Fix typo for `script.painless.regex.enabled` setting value (elastic#77853)
  Upgrade to Lucene-8.10.0-snapshot-bf2fcb53079 (elastic#77801)
  [DOCS] Fix ESS install lead-in (elastic#77887)
  Resolve thirdparty gradle plugin artifacts from mavencentral (elastic#77865)
  Reduce the number of times that `LifecycleExecutionState` is parsed when running a policy. (elastic#77863)
  Utility methods to add and remove backing indices from data streams (elastic#77778)
  Use Objects.equals() instead of == to compare strings (elastic#77840)
  [ML] prefer least allocated model when a new node is added to the cluster (elastic#77756)
  Deprecate ignore_throttled parameter (elastic#77479)
  Improve LifecycleExecutionState parsing. (elastic#77855)
  [DOCS] Removes deprecated word from HLRC title. (elastic#77851)
  Remove legacy geo code from AggregationResultUtils (elastic#77702)
  Adjust SearchableSnapshotsBlobStoreCacheIntegTests.testBlobStoreCache (elastic#77758)
  Laxify SecureSM to allow creation of the JDK's innocuous threads (elastic#77789)
  [Test] Reduce concurrency when testing creation of security index (elastic#75293)
  Refactor metric PipelineAggregation integration test (elastic#77548)
  ...

# Conflicts:
#	server/src/main/java/org/elasticsearch/cluster/metadata/DataStream.java
@martijnvg
Copy link
Member

Thanks @joegallo, this is a great fix! Maybe also backport this change to the 7.15 branch?

@joegallo
Copy link
Contributor Author

Thanks @joegallo, this is a great fix! Maybe also backport this change to the 7.15 branch?

✅, 2fdd5dc

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>bug :Data Management/ILM+SLM Index and Snapshot lifecycle management Team:Data Management Meta label for data/management team v7.15.1 v7.16.0 v8.0.0-beta1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants