Skip to content

Conversation

@jpountz
Copy link
Contributor

@jpountz jpountz commented Aug 8, 2016

At index time Elasticsearch needs to look up the version associated with the
_uid of the document that is being indexed, which is often the bottleneck for
indexing.

While reviewing the output of the jfr telemetry from a Rally benchmark, I saw
that significant time was spent in ConcurrentHashMap#get and ThreadLocal#get.
The reason is that we cache lookup objects per thread and segment, and for every
indexed document, we first need to look up the cache associated with this
segment (ConcurrentHashMap#get) and then get a state that is local to the
current thread (ThreadLocal#get). So if you are indexing N documents per
second and have S segments, both these methods will be called N*S times per
second.

This commit changes version lookup to use a cache per index reader rather than
per segment. While this makes cache entries live for less long, we now only need
to do one call to ConcurrentHashMap#get and ThreadLocal#get per indexed
document.

NOTE: I had to remove IndexReader#getCombinedCoreAndDeletesKey from the
forbidden APIs in order to make it work.

Here are some screenshots of the hot methods as reported when indexing 100M empty documents with rally using 1 shard, 0 replicas and otherwise default settings.
Before:
before
After:
after
ThreadLocal#get is no longer among the hottest methods after the change, and while ConcurrentHashMap#get is still there, its main callers are now the live version map and the circuit breaker, not the state we maintain for version lookups in the index.

Here is the report from rally compare on two races that have been run without telemetry (so that it does not affect the benchmark results). The speedup looks real.

                             Metric    Baseline    Contender                Diff
-----------------------------------  ----------  -----------  ------------------
   Min Indexing Throughput [docs/s]       67415        73665  +6250.00000
Median Indexing Throughput [docs/s]       73608      80540.3  +6932.33333
   Max Indexing Throughput [docs/s]       75151        81799  +6648.00000
                Indexing time [min]     113.854      94.9646    -18.88890
                   Merge time [min]     15.0901      11.0337     -4.05640
                 Refresh time [min]     1.52247      1.57633     +0.05387
                   Flush time [min]    0.901483     0.874733     -0.02675
          Merge throttle time [min]     5.95868       2.5531     -3.40558
       Median CPU usage (index) [%]     566.462      568.981     +2.51969
             Total Young Gen GC [s]     113.929      116.784     +2.85500
               Total Old Gen GC [s]      18.633       19.979     +1.34600
                    Index size [GB]     1.88654      1.87796     -0.00858
               Totally written [GB]     20.4271      19.9763     -0.45073
                      Segment count          22           34    +12.00000

@jpountz
Copy link
Contributor Author

jpountz commented Aug 9, 2016

This is basically a revert of #14070. I'm stalling this on making cache keys / closed listeners less trappy.

@jpountz jpountz added stalled and removed review labels Aug 9, 2016
@clintongormley clintongormley added the :Core/Infra/Core Core issues without another label label Aug 11, 2016
@elasticmachine
Copy link
Collaborator

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

@rjernst
Copy link
Member

rjernst commented Jun 9, 2017

@jpountz Is this PR worth reviving?

@jpountz
Copy link
Contributor Author

jpountz commented Jun 9, 2017

Absolutely!

At index time Elasticsearch needs to look up the version associated with the
`_id` of the document that is being indexed, which is often the bottleneck for
indexing.

While reviewing the output of the `jfr` telemetry from a Rally benchmark, I saw
that significant time was spent in `ConcurrentHashMap#get` and `ThreadLocal#get`.
The reason is that we cache lookup objects per thread and segment, and for every
indexed document, we first need to look up the cache associated with this
segment (`ConcurrentHashMap#get`) and then get a state that is local to the
current thread (`ThreadLocal#get`). So if you are indexing N documents per
second and have S segments, both these methods will be called N*S times per
second.

This commit changes version lookup to use a cache per index reader rather than
per segment. While this makes cache entries live for less long, we now only need
to do one call to `ConcurrentHashMap#get` and `ThreadLocal#get` per indexed
document.
@jpountz jpountz force-pushed the enhancement/faster_version_lookups branch from 896fa1f to fc1db9e Compare June 14, 2017 18:14
@jpountz jpountz removed the stalled label Jun 14, 2017
@jpountz jpountz requested a review from s1monw June 14, 2017 18:16
@jpountz
Copy link
Contributor Author

jpountz commented Jun 14, 2017

I rebased this PR!

Copy link
Contributor

@s1monw s1monw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left a small comment LGTM otherwise

IndexReader.CacheHelper cacheHelper = reader.getCoreCacheHelper();
CloseableThreadLocal<PerThreadIDVersionAndSeqNoLookup> ctl = lookupStates.get(cacheHelper.getKey());
private static PerThreadIDVersionAndSeqNoLookup[] getLookupState(IndexReader reader, String uidField) throws IOException {
// We cache on the top level
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you leave a comment why we moved to this? you can also reference this PR for documentation purposes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 I had more comments initially but they apparently got lost with the rebase

@jpountz jpountz merged commit 5a6fa62 into elastic:master Jun 15, 2017
@jpountz jpountz deleted the enhancement/faster_version_lookups branch June 15, 2017 08:17
@jpountz jpountz added the v6.0.0 label Jun 15, 2017
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Jun 15, 2017
* master:
  Upgrade icu4j for the ICU analysis plugin to 59.1 (elastic#25243)
  move assertBusy to use CheckException (elastic#25246)
  Use SPI in High Level Rest Client to load XContent parsers (elastic#25098)
  [TEST] test that low level REST client leaves path untouched (elastic#25193)
  Speed up PK lookups at index time. (elastic#19856)
  [Docs] Fix documentation for percentiles bucket aggregation (elastic#25229)
  Upgrade to lucene-7.0.0-snapshot-92b1783. (elastic#25222)
  Build: Add master flag for disabling bwc tests (elastic#25230)
  Scripting: Rename SearchScript.needsScores to needs_score (elastic#25235)
  Support script context stateful factory in Painless. (elastic#25233)
  FastVectorHighlighter should not cache the field query globally (elastic#25197)
  Remove QUERY_AND_FETCH BWC for pre-5.3.0 nodes (elastic#25223)
  Add more missing AggregationBuilder getters (elastic#25198)
  Extract the snapshot/restore full cluster restart tests from the translog full cluster restart tests (elastic#25204)
jasontedor added a commit to glefloch/elasticsearch that referenced this pull request Jun 15, 2017
* master: (44 commits)
  Upgrade icu4j for the ICU analysis plugin to 59.1 (elastic#25243)
  move assertBusy to use CheckException (elastic#25246)
  Use SPI in High Level Rest Client to load XContent parsers (elastic#25098)
  [TEST] test that low level REST client leaves path untouched (elastic#25193)
  Speed up PK lookups at index time. (elastic#19856)
  [Docs] Fix documentation for percentiles bucket aggregation (elastic#25229)
  Upgrade to lucene-7.0.0-snapshot-92b1783. (elastic#25222)
  Build: Add master flag for disabling bwc tests (elastic#25230)
  Scripting: Rename SearchScript.needsScores to needs_score (elastic#25235)
  Support script context stateful factory in Painless. (elastic#25233)
  FastVectorHighlighter should not cache the field query globally (elastic#25197)
  Remove QUERY_AND_FETCH BWC for pre-5.3.0 nodes (elastic#25223)
  Add more missing AggregationBuilder getters (elastic#25198)
  Extract the snapshot/restore full cluster restart tests from the translog full cluster restart tests (elastic#25204)
  Refactor TransportShardBulkAction.executeUpdateRequest and add tests
  Make sure range queries are correctly profiled. (elastic#25108)
  Test: allow setting socket timeout for rest client (elastic#25221)
  Migration docs for elastic#25080 (elastic#25218)
  Remove `discovery.type` BWC layer from the EC2/Azure/GCE plugins elastic#25080
  When stopping via systemd only kill the JVM, not its control group (elastic#25195)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Core/Infra/Core Core issues without another label >enhancement v6.0.0-beta1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants