From 86971d431831dde0b18bfc339a6531a77707abfd Mon Sep 17 00:00:00 2001
From: Tim Brooks <tim@uncontended.net>
Date: Mon, 13 Jul 2020 13:33:28 -0600
Subject: [PATCH 01/11] Add indexing pressure documentation

This commit adds documentation about the new indexing pressure memory
limit setting and the exposure of these metrics in node stats.
---
 docs/reference/cluster/nodes-stats.asciidoc   | 60 +++++++++++++++++++
 .../index-modules/back-pressure.asciidoc      | 55 +++++++++++++++++
 2 files changed, 115 insertions(+)
 create mode 100644 docs/reference/index-modules/back-pressure.asciidoc
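
Reviewer note: the rejection rules described in the new "Memory Limits"
section can be sketched as below. This is only an illustration of the
documented thresholds, not Elasticsearch code. The function names, the
single-node view, and the idea of passing byte counters in directly are all
assumptions made for the example.

```python
REPLICA_LIMIT_FACTOR = 1.5  # multiplier quoted in the Memory Limits section


def memory_limit_bytes(heap_bytes: int, limit_percent: int = 10) -> int:
    """Byte budget implied by indexing_pressure.memory.limit (default 10% of heap)."""
    return heap_bytes * limit_percent // 100


def rejects_new_work(stage: str, coordinating_and_primary_bytes: int,
                     replica_bytes: int, limit_bytes: int) -> bool:
    """Return True if new work at `stage` would be rejected (hypothetical helper)."""
    if stage in ("coordinating", "primary"):
        # Coordinating and primary work is checked against all outstanding bytes.
        return coordinating_and_primary_bytes + replica_bytes > limit_bytes
    if stage == "replica":
        # Replica work is only checked against replica bytes, with 1.5x headroom.
        return replica_bytes > REPLICA_LIMIT_FACTOR * limit_bytes
    raise ValueError(f"unknown stage: {stage}")
```

With a 1 GiB heap, `memory_limit_bytes(1 << 30)` gives a budget of about
102 MiB, so under these assumptions replica work would only be rejected once
outstanding replica bytes pass roughly 154 MiB. This is why a loaded node
sheds coordinating and primary work first.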

diff --git a/docs/reference/cluster/nodes-stats.asciidoc b/docs/reference/cluster/nodes-stats.asciidoc
index 34bfdffc89655..93310b136b95b 100644
--- a/docs/reference/cluster/nodes-stats.asciidoc
+++ b/docs/reference/cluster/nodes-stats.asciidoc
@@ -58,6 +58,10 @@ using metrics.
   `http`::
       HTTP connection information.
 
+  `indexing_pressure`::
+        Indexing pressure statistics about total and current indexing load and
+        indexing rejections.
+
   `indices`::
       Indices stats about size, document count, indexing and deletion times,
       search times, field cache size, merges and flushes.
@@ -2099,6 +2103,62 @@ Number of failed operations for the processor.
 =======
 ======
 
+[[cluster-nodes-stats-api-response-body-indexing-pressure]]
+`indexing_pressure`::
+(object)
+Contains indexing pressure statistics for the node.
++
+.Properties of `indexing_pressure`
+[%collapsible%open]
+======
+`total`::
+(object)
+Contains statistics for cumulative indexing load since the node started.
++
+.Properties of `<circuit_breaker_name>`
+[%collapsible%open]
+=======
+`coordinating_and_primary_bytes`::
+(integer)
+Bytes consumed by indexing requests in the coordinating or primary stage.
+
+`replica_bytes`::
+(integer)
+Bytes consumed by indexing requests in the replica stage.
+
+`all_bytes`::
+(integer)
+Bytes consumed by indexing requests in the coordinating, primary, or replica stage.
+
+`coordinating_and_primary_memory_limit_rejections`::
+(integer)
+Rejections of indexing requests in the coordinating or primary stage.
+
+`replica_memory_limit_rejections`::
+(integer)
+Rejections of indexing requests in the replica stage.
+=======
+`current`::
+(object)
+Contains statistics for current indexing load.
++
+.Properties of `<circuit_breaker_name>`
+[%collapsible%open]
+=======
+`coordinating_and_primary_bytes`::
+(integer)
+Bytes consumed by indexing requests in the coordinating or primary stage.
+
+`replica_bytes`::
+(integer)
+Bytes consumed by indexing requests in the replica stage.
+
+`all_bytes`::
+(integer)
+Bytes consumed by indexing requests in the coordinating, primary, or replica stage.
+=======
+======
+
 [[cluster-nodes-stats-api-response-body-adaptive-selection]]
 `adaptive_selection`::
 (object)
diff --git a/docs/reference/index-modules/back-pressure.asciidoc b/docs/reference/index-modules/back-pressure.asciidoc
new file mode 100644
index 0000000000000..5a6fbe3eb943e
--- /dev/null
+++ b/docs/reference/index-modules/back-pressure.asciidoc
@@ -0,0 +1,55 @@
+[[index-modules-back-pressure]]
+== Indexing Pressure
+
+Indexing documents into Elasticsearch introduces system load in the form of
+memory and CPU load. Each indexing operation includes coordinating, primary, and
+replica steps. These steps can be performed across multiple nodes in the
+cluster. If too much indexing work is introduced into the system, the cluster
+can become saturated.
+
+Indexing pressure is primarily generated by external operations such as indexing
+requests or CCR follow tasks. However, there is some internal indexing pressure
+during a shard recovery or primary failover.
+
+Elasticsearch internally monitors indexing load. When the load exceeds certain
+limits, new indexing work will be rejected.
+
+[float]
+=== Indexing Steps
+
+External indexing operations go through three steps. The node receiving the
+indexing is the coordinating node. In the coordinating step the node performs
+any configured ingest pipelines, separates the request into individual shard
+requests, and dispatches them to the primary shards.
+
+In the primary step, the primary shard node ensures that it is still the primary
+(otherwise it reroutes the request to the new primary), waits on active shards,
+indexes the documents into its translog and Lucene, and dispatches to the
+replicas.
+
+Finally, in the replica step, every available replica node indexes the documents
+into its translog and Lucene.
+
+
+[float]
+=== Memory Limits
+
+Elasticsearch exposes a node setting `indexing_pressure.memory.limit` which
+restricts the number of bytes for outstanding indexing requests. Be default,
+this setting is configured to be 10% of the heap.
+
+A node will start rejecting new indexing work at the coordinating or primary
+step when the number of outstanding coordinating, primary and replica indexing
+requests is greater than the configured limit.
+
+A node will start rejecting new indexing work at the replica step when the
+number of outstanding replica indexing requests is greater than 1.5x the
+configured limit. This design means that as indexing pressure builds on nodes,
+they will naturally stop accepting coordinating and primary work in favor of
+outstanding replica work.
+
+
+[float]
+=== Monitoring
+
+Indexing pressure metrics are exposed by the `GET /_nodes/stats` API.

From 54bc0ffee857dab5c0fc0a1c4f51e54fb50914c6 Mon Sep 17 00:00:00 2001
From: Tim Brooks <tim@uncontended.net>
Date: Mon, 13 Jul 2020 13:34:58 -0600
Subject: [PATCH 02/11] Changes

---
 .../{back-pressure.asciidoc => indexing-pressure.asciidoc}  | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
 rename docs/reference/index-modules/{back-pressure.asciidoc => indexing-pressure.asciidoc} (95%)

diff --git a/docs/reference/index-modules/back-pressure.asciidoc b/docs/reference/index-modules/indexing-pressure.asciidoc
similarity index 95%
rename from docs/reference/index-modules/back-pressure.asciidoc
rename to docs/reference/index-modules/indexing-pressure.asciidoc
index 5a6fbe3eb943e..a0c40253b5f6f 100644
--- a/docs/reference/index-modules/back-pressure.asciidoc
+++ b/docs/reference/index-modules/indexing-pressure.asciidoc
@@ -1,4 +1,4 @@
-[[index-modules-back-pressure]]
+[[index-modules-indexing-pressure]]
 == Indexing Pressure
 
 Indexing documents into Elasticsearch introduces system load in the form of
@@ -11,8 +11,8 @@ Indexing pressure is primarily generated by external operations such as indexing
 requests or CCR follow tasks. However, there is some internal indexing pressure
 during a shard recovery or primary failover.
 
-Elasticsearch internally monitors indexing load. When the load exceeds certain
-limits, new indexing work will be rejected.
+Elasticsearch internally monitors indexing load. When the load exceeds
+certain limits, new indexing work will be rejected.
 
 [float]
 === Indexing Steps

From e6a66373b1ec8d5a820ff05b33558f2db1aef005 Mon Sep 17 00:00:00 2001
From: Tim Brooks <tim@uncontended.net>
Date: Mon, 13 Jul 2020 13:57:11 -0600
Subject: [PATCH 03/11] Fixes

---
 docs/reference/cluster/nodes-stats.asciidoc   |  4 +--
 .../index-modules/indexing-pressure.asciidoc  | 29 ++++++++++---------
 2 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/docs/reference/cluster/nodes-stats.asciidoc b/docs/reference/cluster/nodes-stats.asciidoc
index 93310b136b95b..79c6dadb2cd51 100644
--- a/docs/reference/cluster/nodes-stats.asciidoc
+++ b/docs/reference/cluster/nodes-stats.asciidoc
@@ -2115,7 +2115,7 @@ Contains indexing pressure statistics for the node.
 (object)
 Contains statistics for cumulative indexing load since the node started.
 +
-.Properties of `<circuit_breaker_name>`
+.Properties of `<total>`
 [%collapsible%open]
 =======
 `coordinating_and_primary_bytes`::
@@ -2142,7 +2142,7 @@ Rejections of indexing requests in the replica stage.
 (object)
 Contains statistics for current indexing load.
 +
-.Properties of `<circuit_breaker_name>`
+.Properties of `<current>`
 [%collapsible%open]
 =======
 `coordinating_and_primary_bytes`::
diff --git a/docs/reference/index-modules/indexing-pressure.asciidoc b/docs/reference/index-modules/indexing-pressure.asciidoc
index a0c40253b5f6f..ba217211de3ed 100644
--- a/docs/reference/index-modules/indexing-pressure.asciidoc
+++ b/docs/reference/index-modules/indexing-pressure.asciidoc
@@ -3,7 +3,7 @@
 
 Indexing documents into Elasticsearch introduces system load in the form of
 memory and CPU load. Each indexing operation includes coordinating, primary, and
-replica steps. These steps can be performed across multiple nodes in the
+replica stages. These stages can be performed across multiple nodes in the
 cluster. If too much indexing work is introduced into the system, the cluster
 can become saturated.
 
@@ -15,20 +15,20 @@ Elasticsearch internally monitors indexing load. When the load exceeds
 certain limits, new indexing work will be rejected.
 
 [float]
-=== Indexing Steps
+=== Indexing Stages
 
-External indexing operations go through three steps. The node receiving the
-indexing is the coordinating node. In the coordinating step the node performs
-any configured ingest pipelines, separates the request into individual shard
-requests, and dispatches them to the primary shards.
+External indexing operations go through three stages. The node receiving the
+indexing request is the coordinating node. In the coordinating stage the node
+performs any configured ingest pipelines, separates the request into individual
+shard requests, and dispatches the requests to the primary shards.
 
-In the primary step, the primary shard node ensures that it is still the primary
+In the primary stage, the primary shard node ensures that it is still the primary
 (otherwise it reroutes the request to the new primary), waits on active shards,
-indexes the documents into its translog and Lucene, and dispatches to the
-replicas.
+indexes the documents into its <<index-modules-translog,Translog>> and Lucene, and
+dispatches to the replicas.
 
-Finally, in the replica step, every available replica node indexes the documents
-into its translog and Lucene.
+Finally, in the replica stage, every available replica node indexes the documents
+into its <<index-modules-translog,Translog>> and Lucene.
 
 
 [float]
@@ -39,10 +39,10 @@ restricts the number of bytes for outstanding indexing requests. Be default,
 this setting is configured to be 10% of the heap.
 
 A node will start rejecting new indexing work at the coordinating or primary
-step when the number of outstanding coordinating, primary and replica indexing
+stage when the number of outstanding coordinating, primary and replica indexing
 requests is greater than the configured limit.
 
-A node will start rejecting new indexing work at the replica step when the
+A node will start rejecting new indexing work at the replica stage when the
 number of outstanding replica indexing requests is greater than 1.5x the
 configured limit. This design means that as indexing pressure builds on nodes,
 they will naturally stop accepting coordinating and primary work in favor of
@@ -52,4 +52,5 @@ outstanding replica work.
 [float]
 === Monitoring
 
-Indexing pressure metrics are exposed by the `GET /_nodes/stats` API.
+Indexing pressure metrics are exposed by the
+<<cluster-nodes-stats-api-response-body-indexing-pressure,Node Stats API>>.

From fdfa56fb773cbc99e23fa23d851d055b8ddb384b Mon Sep 17 00:00:00 2001
From: Tim Brooks <tim@uncontended.net>
Date: Wed, 15 Jul 2020 17:52:33 -0600
Subject: [PATCH 04/11] WIP

---
 docs/reference/cluster/nodes-stats.asciidoc   |  4 ++--
 .../index-modules/indexing-pressure.asciidoc  | 19 +++++++++++++------
 2 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/docs/reference/cluster/nodes-stats.asciidoc b/docs/reference/cluster/nodes-stats.asciidoc
index 79c6dadb2cd51..132840c9844f0 100644
--- a/docs/reference/cluster/nodes-stats.asciidoc
+++ b/docs/reference/cluster/nodes-stats.asciidoc
@@ -59,7 +59,7 @@ using metrics.
       HTTP connection information.
 
   `indexing_pressure`::
-        Indexing pressure statistics about total and current indexing load and
+        Indexing pressure statistics about current and total indexing load and
         indexing rejections.
 
   `indices`::
@@ -2106,7 +2106,7 @@ Number of failed operations for the processor.
 [[cluster-nodes-stats-api-response-body-indexing-pressure]]
 `indexing_pressure`::
 (object)
-Contains indexing pressure statistics for the node.
+Contains <<index-modules-indexing-pressure,indexing pressure>> statistics for the node.
 +
 .Properties of `indexing_pressure`
 [%collapsible%open]
diff --git a/docs/reference/index-modules/indexing-pressure.asciidoc b/docs/reference/index-modules/indexing-pressure.asciidoc
index ba217211de3ed..246b349200cda 100644
--- a/docs/reference/index-modules/indexing-pressure.asciidoc
+++ b/docs/reference/index-modules/indexing-pressure.asciidoc
@@ -5,11 +5,12 @@ Indexing documents into Elasticsearch introduces system load in the form of
 memory and CPU load. Each indexing operation includes coordinating, primary, and
 replica stages. These stages can be performed across multiple nodes in the
 cluster. If too much indexing work is introduced into the system, the cluster
-can become saturated.
+can become saturated. This can adversely impact other components of the system
+such as searches, cluster coordination, background processing, etc.
 
 Indexing pressure is primarily generated by external operations such as indexing
-requests or CCR follow tasks. However, there is some internal indexing pressure
-during a shard recovery or primary failover.
+requests or internally by mechanisms such as recoveries and cross-cluster
+replication.
 
 Elasticsearch internally monitors indexing load. When the load exceeds
 certain limits, new indexing work will be rejected.
@@ -24,11 +25,11 @@ shard requests, and dispatches the requests to the primary shards.
 
 In the primary stage, the primary shard node ensures that it is still the primary
 (otherwise it reroutes the request to the new primary), waits on active shards,
-indexes the documents into its <<index-modules-translog,Translog>> and Lucene, and
-dispatches to the replicas.
+indexes the documents into Lucene and the <<index-modules-translog,Translog>>, and
+replicates the writes to the replica shards.
 
 Finally, in the replica stage, every available replica node indexes the documents
-into its <<index-modules-translog,Translog>> and Lucene.
+into Lucene and the <<index-modules-translog,Translog>>.
 
 
 [float]
@@ -48,6 +49,12 @@ configured limit. This design means that as indexing pressure builds on nodes,
 they will naturally stop accepting coordinating and primary work in favor of
 outstanding replica work.
 
+The default limit `indexing_pressure.memory.limit` (10%) is generously sized and
+should only be modified after careful consideration. Only indexing requests
+contribute to this limit, meaning that additional indexing overhead (buffers,
+listeners, etc.) also requires heap space. Finally, other components of
+Elasticsearch also require memory. Configuring this limit to be too high can
+starve other system components of operating memory.
 
 [float]
 === Monitoring

From eea17d421cdf20ae0a0f5ce7007edd830775e4c7 Mon Sep 17 00:00:00 2001
From: Tim Brooks <tim@uncontended.net>
Date: Thu, 16 Jul 2020 17:42:42 -0600
Subject: [PATCH 05/11] Changes

---
 docs/reference/docs/data-replication.asciidoc | 17 ++++++++--
 .../index-modules/indexing-pressure.asciidoc  | 33 +++++++++----------
 2 files changed, 29 insertions(+), 21 deletions(-)
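
Reviewer note: the accounting lifetime described in this patch (bytes are
accounted at the start of a stage and released only at its end, so upstream
stages stay accounted until downstream stages finish) can be sketched as
below. The class and function names are hypothetical. A single tracker is
used for all three stages for simplicity, whereas in a real cluster the
stages may run on different nodes.

```python
from contextlib import contextmanager


class StageAccounting:
    """Toy tracker of outstanding indexing bytes per stage (illustrative only)."""

    def __init__(self):
        self.current = {"coordinating": 0, "primary": 0, "replica": 0}

    @contextmanager
    def stage(self, name: str, request_bytes: int):
        # Account for the request when the stage begins...
        self.current[name] += request_bytes
        try:
            yield
        finally:
            # ...and release it only when the stage ends.
            self.current[name] -= request_bytes


def run_request(tracker: StageAccounting, request_bytes: int) -> list:
    """Nest the stages and snapshot the counters during and after the replica stage."""
    snapshots = []
    with tracker.stage("coordinating", request_bytes):
        with tracker.stage("primary", request_bytes):
            with tracker.stage("replica", request_bytes):
                snapshots.append(dict(tracker.current))
            snapshots.append(dict(tracker.current))
    return snapshots
```

Calling `run_request(StageAccounting(), 100)` shows all three stages holding
100 bytes while the replica stage runs. The coordinating and primary stages
still hold their bytes after the replica stage has released its own.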

diff --git a/docs/reference/docs/data-replication.asciidoc b/docs/reference/docs/data-replication.asciidoc
index 969e3dfd54ce2..3ee0629808ea4 100644
--- a/docs/reference/docs/data-replication.asciidoc
+++ b/docs/reference/docs/data-replication.asciidoc
@@ -20,12 +20,15 @@ responsible for replicating the operation to the other copies.
 This purpose of this section is to give a high level overview of the Elasticsearch replication model and discuss the implications
 it has for various interactions between write and read operations.
 
-[float]
+[float,id="basic-write-model"]
 ==== Basic write model
 
 Every indexing operation in Elasticsearch is first resolved to a replication group using <<index-routing,routing>>,
-typically based on the document ID. Once the replication group has been determined,
-the operation is forwarded internally to the current _primary shard_ of the group. The primary shard is responsible
+typically based on the document ID. Once the replication group has been determined, the operation is forwarded
+internally to the current _primary shard_ of the group. This stage of indexing is referred to as the coordinating
+stage.
+
+The next stage of indexing is the primary stage which is performed on the primary shard. The primary shard is responsible
 for validating the operation and forwarding it to the other replicas. Since replicas can be offline, the primary
 is not required to replicate to all replicas. Instead, Elasticsearch maintains a list of shard copies that should
 receive the operation. This list is called the _in-sync copies_ and is maintained by the master node. As the name implies,
@@ -42,6 +45,14 @@ The primary shard follows this basic flow:
 . Once all replicas have successfully performed the operation and responded to the primary, the primary acknowledges the successful
    completion of the request to the client.
 
+Each in-sync replica copy performs the indexing operation locally so that it has a copy. This stage of indexing is the replication
+stage.
+
+These indexing stages (coordinating, primary, and replication) are sequential. However, each upstream stage is inclusive of the
+downstream stages. The coordinating stage is not complete until each primary stage (which might be spread out across different
+primary shards) has completed. Each primary stage will not complete until the in-sync replicas have completed replication and
+responded to the replication requests.
+
 [float]
 ===== Failure handling
 
diff --git a/docs/reference/index-modules/indexing-pressure.asciidoc b/docs/reference/index-modules/indexing-pressure.asciidoc
index 246b349200cda..3334f4114824b 100644
--- a/docs/reference/index-modules/indexing-pressure.asciidoc
+++ b/docs/reference/index-modules/indexing-pressure.asciidoc
@@ -18,19 +18,8 @@ certain limits, new indexing work will be rejected.
 [float]
 === Indexing Stages
 
-External indexing operations go through three stages. The node receiving the
-indexing request is the coordinating node. In the coordinating stage the node
-performs any configured ingest pipelines, separates the request into individual
-shard requests, and dispatches the requests to the primary shards.
-
-In the primary stage, the primary shard node ensures that it is still the primary
-(otherwise it reroutes the request to the new primary), waits on active shards,
-indexes the documents into Lucene and the <<index-modules-translog,Translog>>, and
-replicates the writes to the replica shards.
-
-Finally, in the replica stage, every available replica node indexes the documents
-into Lucene and the <<index-modules-translog,Translog>>.
-
+External indexing operations go through three stages: coordinating, primary, and
+replication. This write model is explain <<basic-write-model,here>>.
 
 [float]
 === Memory Limits
@@ -39,15 +28,23 @@ Elasticsearch exposes a node setting `indexing_pressure.memory.limit` which
 restricts the number of bytes for outstanding indexing requests. Be default,
 this setting is configured to be 10% of the heap.
 
+At the beginning of each <<indexing stage,here>>, Elasticsearch accounts for the
+bytes consumed by an indexing request. This accounting is only released at the
+end of the indexing stage. This means that upstream stages will account for the
+request overheard until all downstream stages are complete. For example, the
+coordinating request will remain accounted for until primary and replication
+stages are complete. The primary request will remain accounted for until the
+replication stage is complete.
+
 A node will start rejecting new indexing work at the coordinating or primary
-stage when the number of outstanding coordinating, primary and replica indexing
-requests is greater than the configured limit.
+stage when the number of outstanding coordinating, primary, and replica indexing
+bytes are greater than the configured limit.
 
-A node will start rejecting new indexing work at the replica stage when the
-number of outstanding replica indexing requests is greater than 1.5x the
+A node will start rejecting new indexing work at the replication stage when the
+number of outstanding replication indexing bytes are greater than 1.5x the
 configured limit. This design means that as indexing pressure builds on nodes,
 they will naturally stop accepting coordinating and primary work in favor of
-outstanding replica work.
+outstanding replication work.
 
 The default limit `indexing_pressure.memory.limit` (10%) is generously sized and
 should only be modified after careful consideration. Only indexing requests

From 73cf1a97769f17f56b296b97800b517049441ee2 Mon Sep 17 00:00:00 2001
From: Tim Brooks <tim@uncontended.net>
Date: Fri, 17 Jul 2020 13:31:08 -0600
Subject: [PATCH 06/11] Changes

---
 docs/reference/docs/data-replication.asciidoc    |  8 ++++----
 .../index-modules/indexing-pressure.asciidoc     | 16 ++++++++--------
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/docs/reference/docs/data-replication.asciidoc b/docs/reference/docs/data-replication.asciidoc
index 3ee0629808ea4..d96aa7716c988 100644
--- a/docs/reference/docs/data-replication.asciidoc
+++ b/docs/reference/docs/data-replication.asciidoc
@@ -48,10 +48,10 @@ The primary shard follows this basic flow:
 Each in-sync replica copy performs the indexing operation locally so that it has a copy. This stage of indexing is the replication
 stage.
 
-These indexing stages (coordinating, primary, and replication) are sequential. However, each upstream stage is inclusive of the
-downstream stages. The coordinating stage is not complete until each primary stage (which might be spread out across different
-primary shards) has completed. Each primary stage will not complete until the in-sync replicas have completed replication and
-responded to the replication requests.
+These indexing stages (coordinating, primary, and replica) are sequential. However, the lifetime of each stage encompasses
+the lifetimes of subsequent stages in order to enable internal retries. The coordinating stage is not complete until each primary
+stage (which might be spread out across different primary shards) has completed. Each primary stage will not complete until the
+in-sync replicas have completed indexing the docs locally and responded to the replica requests.
 
 [float]
 ===== Failure handling
diff --git a/docs/reference/index-modules/indexing-pressure.asciidoc b/docs/reference/index-modules/indexing-pressure.asciidoc
index 3334f4114824b..72f4f1a01dc6b 100644
--- a/docs/reference/index-modules/indexing-pressure.asciidoc
+++ b/docs/reference/index-modules/indexing-pressure.asciidoc
@@ -19,32 +19,32 @@ certain limits, new indexing work will be rejected.
 === Indexing Stages
 
 External indexing operations go through three stages: coordinating, primary, and
-replication. This write model is explain <<basic-write-model,here>>.
+replica. This write model is explained <<basic-write-model,here>>.
 
 [float]
 === Memory Limits
 
 Elasticsearch exposes a node setting `indexing_pressure.memory.limit` which
-restricts the number of bytes for outstanding indexing requests. Be default,
+restricts the number of bytes for outstanding indexing requests. By default,
 this setting is configured to be 10% of the heap.
 
 At the beginning of each <<indexing stage,here>>, Elasticsearch accounts for the
 bytes consumed by an indexing request. This accounting is only released at the
 end of the indexing stage. This means that upstream stages will account for the
 request overheard until all downstream stages are complete. For example, the
-coordinating request will remain accounted for until primary and replication
-stages are complete. The primary request will remain accounted for until the
-replication stage is complete.
+coordinating request will remain accounted for until primary and replica
+stages are complete. The primary request will remain accounted for until each
+in-sync replica has responded to enable replica retries if necessary.
 
 A node will start rejecting new indexing work at the coordinating or primary
 stage when the number of outstanding coordinating, primary, and replica indexing
 bytes are greater than the configured limit.
 
-A node will start rejecting new indexing work at the replication stage when the
-number of outstanding replication indexing bytes are greater than 1.5x the
+A node will start rejecting new indexing work at the replica stage when the
+number of outstanding replica indexing bytes are greater than 1.5x the
 configured limit. This design means that as indexing pressure builds on nodes,
 they will naturally stop accepting coordinating and primary work in favor of
-outstanding replication work.
+outstanding replica work.
 
 The default limit `indexing_pressure.memory.limit` (10%) is generously sized and
 should only be modified after careful consideration. Only indexing requests

From 8dfba6636906772f1e249db6bfd718994ea19833 Mon Sep 17 00:00:00 2001
From: Tim Brooks <tim@uncontended.net>
Date: Fri, 17 Jul 2020 14:04:03 -0600
Subject: [PATCH 07/11] Changes

---
 docs/reference/index-modules.asciidoc          |  6 ++++++
 .../index-modules/indexing-pressure.asciidoc   | 18 ++++++++++++++----
 2 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/docs/reference/index-modules.asciidoc b/docs/reference/index-modules.asciidoc
index a2f7c242170dd..6d25f269bfe97 100644
--- a/docs/reference/index-modules.asciidoc
+++ b/docs/reference/index-modules.asciidoc
@@ -281,6 +281,10 @@ Other index settings are available in index modules:
 
     Control over the retention of a history of operations in the index.
 
+<<index-modules-indexing-pressure,Indexing pressure>>::
+
+    Configure indexing back pressure limits.
+
 [float]
 [[x-pack-index-settings]]
 === [xpack]#{xpack} index settings#
@@ -311,3 +315,5 @@ include::index-modules/translog.asciidoc[]
 include::index-modules/history-retention.asciidoc[]
 
 include::index-modules/index-sorting.asciidoc[]
+
+include::index-modules/indexing-pressure.asciidoc[]
diff --git a/docs/reference/index-modules/indexing-pressure.asciidoc b/docs/reference/index-modules/indexing-pressure.asciidoc
index 72f4f1a01dc6b..50b96c62e9435 100644
--- a/docs/reference/index-modules/indexing-pressure.asciidoc
+++ b/docs/reference/index-modules/indexing-pressure.asciidoc
@@ -1,5 +1,5 @@
 [[index-modules-indexing-pressure]]
-== Indexing Pressure
+== Indexing pressure
 
 Indexing documents into Elasticsearch introduces system load in the form of
 memory and CPU load. Each indexing operation includes coordinating, primary, and
@@ -16,15 +16,15 @@ Elasticsearch internally monitors indexing load. When the load exceeds
 certain limits, new indexing work will be rejected.
 
 [float]
-=== Indexing Stages
+=== Indexing stages
 
 External indexing operations go through three stages: coordinating, primary, and
 replica. This write model is explained <<basic-write-model,here>>.
 
 [float]
-=== Memory Limits
+=== Memory limits
 
-Elasticsearch exposes a node setting `indexing_pressure.memory.limit` which
+Elasticsearch exposes a node setting `indexing_pressure.memory.limit` which
 restricts the number of bytes for outstanding indexing requests. By default,
 this setting is configured to be 10% of the heap.
 
@@ -58,3 +58,13 @@ too high can starve other system components of operating memory.
 
 Indexing pressure metrics are exposed by the
 <<cluster-nodes-stats-api-response-body-indexing-pressure,Node Stats API>>.
+
+[float]
+=== Indexing pressure settings
+
+`indexing_pressure.memory.limit`::
+
+  A configurable limit for the number of outstanding bytes consumed by indexing
+  requests. New coordinating and primary operations will start being rejected
+  once this limit is hit. New replica operations will start being rejected when
+  replica operations consume 1.5x this limit. Defaults to `10%` of the heap.

From 7dffca48516341c9bbf8a459f1a00088e0b15e9f Mon Sep 17 00:00:00 2001
From: Tim Brooks <tim@uncontended.net>
Date: Fri, 17 Jul 2020 14:46:53 -0600
Subject: [PATCH 08/11] Fix

---
 docs/reference/index-modules/indexing-pressure.asciidoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/reference/index-modules/indexing-pressure.asciidoc b/docs/reference/index-modules/indexing-pressure.asciidoc
index 50b96c62e9435..b13833c727190 100644
--- a/docs/reference/index-modules/indexing-pressure.asciidoc
+++ b/docs/reference/index-modules/indexing-pressure.asciidoc
@@ -28,7 +28,7 @@ Elasticsearch exposes a node setting `.memory.limitindexing_pressure` which
 restricts the number of bytes for outstanding indexing requests. By default,
 this setting is configured to be 10% of the heap.
 
-At the beginning of each <<indexing stage,here>>, Elasticsearch accounts for the
+At the beginning of each indexing stage, Elasticsearch accounts for the
 bytes consumed by an indexing request. This accounting is only released at the
 end of the indexing stage. This means that upstream stages will account for the
 request overheard until all downstream stages are complete. For example, the

From 55d2d5e95bf79262d9554766ea62a37961d549f1 Mon Sep 17 00:00:00 2001
From: Tim Brooks <tim@uncontended.net>
Date: Fri, 17 Jul 2020 16:55:56 -0600
Subject: [PATCH 09/11] WIP

---
 docs/reference/cluster/nodes-stats.asciidoc   | 45 +++++++++++++------
 docs/reference/docs/data-replication.asciidoc | 20 ++++-----
 .../index-modules/indexing-pressure.asciidoc  | 37 ++++++++-------
 3 files changed, 63 insertions(+), 39 deletions(-)

diff --git a/docs/reference/cluster/nodes-stats.asciidoc b/docs/reference/cluster/nodes-stats.asciidoc
index 132840c9844f0..28d937ec77f65 100644
--- a/docs/reference/cluster/nodes-stats.asciidoc
+++ b/docs/reference/cluster/nodes-stats.asciidoc
@@ -59,8 +59,7 @@ using metrics.
       HTTP connection information.
 
   `indexing_pressure`::
-        Indexing pressure statistics about current and total indexing load and
-        indexing rejections.
+        Statistics about the node's indexing load and related rejections.
 
   `indices`::
       Indices stats about size, document count, indexing and deletion times,
@@ -2113,30 +2112,40 @@ Contains <<index-modules-indexing-pressure,indexing pressure>> statistics for th
 ======
 `total`::
 (object)
-Contains statistics for cumulative indexing load since the node started.
+Contains statistics for the cumulative indexing load since the node started.
 +
 .Properties of `<total>`
 [%collapsible%open]
 =======
-`coordinating_and_primary_bytes`::
+`combined_coordinating_and_primary_bytes`::
 (integer)
-Bytes consumed by indexing requests in the coordinating or primary stage.
+Bytes consumed by indexing requests in the coordinating or primary stage. This
+value is not the sum of coordinating_bytes and primary_bytes as a node can reuse
+the coordinating bytes if the primary is executed locally.
 
-`replica_bytes`::
+`coordinating_bytes`::
 (integer)
-Bytes consumed by indexing requests in the replica stage.
+Bytes consumed by indexing requests in the coordinating stage.
+
+`primary_bytes`::
+(integer)
+Bytes consumed by indexing requests in the primary stage.
 
 `all_bytes`::
 (integer)
 Bytes consumed by indexing requests in the coordinating, primary, or replica stage.
 
-`coordinating_and_primary_memory_limit_rejections`::
+`coordinating_rejections`::
+(integer)
+Number of indexing requests rejected in the coordinating stage.
+
+`primary_rejections`::
 (integer)
-Rejections of indexing requests in the coordinating or primary stage.
+Number of indexing requests rejected in the primary stage.
 
-`replica_memory_limit_rejections`::
+`replica_rejections`::
 (integer)
-Rejections of indexing requests in the replica stage.
+Number of indexing requests rejected in the replica stage.
 =======
 `current`::
 (object)
@@ -2145,9 +2154,19 @@ Contains statistics for current indexing load.
 .Properties of `<current>`
 [%collapsible%open]
 =======
-`coordinating_and_primary_bytes`::
+`combined_coordinating_and_primary_bytes`::
+(integer)
+Bytes consumed by indexing requests in the coordinating or primary stage. This
+value is not the sum of coordinating_bytes and primary_bytes as a node can reuse
+the coordinating bytes if the primary is executed locally.
+
+`coordinating_bytes`::
+(integer)
+Bytes consumed by indexing requests in the coordinating stage.
+
+`primary_bytes`::
 (integer)
-Bytes consumed by indexing requests in the coordinating or primary stage.
+Bytes consumed by indexing requests in the primary stage.
 
 `replica_bytes`::
 (integer)
diff --git a/docs/reference/docs/data-replication.asciidoc b/docs/reference/docs/data-replication.asciidoc
index d96aa7716c988..b4bf8c85cad1c 100644
--- a/docs/reference/docs/data-replication.asciidoc
+++ b/docs/reference/docs/data-replication.asciidoc
@@ -20,15 +20,15 @@ responsible for replicating the operation to the other copies.
 This purpose of this section is to give a high level overview of the Elasticsearch replication model and discuss the implications
 it has for various interactions between write and read operations.
 
-[float,id="basic-write-model"]
+[discrete]
+[[basic-write-model]]
 ==== Basic write model
 
 Every indexing operation in Elasticsearch is first resolved to a replication group using <<index-routing,routing>>,
 typically based on the document ID. Once the replication group has been determined, the operation is forwarded
-internally to the current _primary shard_ of the group. This stage of indexing is referred to as the coordinating
-stage.
+internally to the current _primary shard_ of the group. This stage of indexing is referred to as the _coordinating stage_.
 
-The next stage of indexing is the primary stage which is performed on the primary shard. The primary shard is responsible
+The next stage of indexing is the _primary stage_, performed on the primary shard. The primary shard is responsible
 for validating the operation and forwarding it to the other replicas. Since replicas can be offline, the primary
 is not required to replicate to all replicas. Instead, Elasticsearch maintains a list of shard copies that should
 receive the operation. This list is called the _in-sync copies_ and is maintained by the master node. As the name implies,
@@ -45,13 +45,13 @@ The primary shard follows this basic flow:
 . Once all replicas have successfully performed the operation and responded to the primary, the primary acknowledges the successful
    completion of the request to the client.
 
-Each in-sync replica copy performs the indexing operation locally so that it has a copy. This stage of indexing is the replication
-stage.
+Each in-sync replica copy performs the indexing operation locally so that it has a copy. This stage of indexing is the
+_replica stage_.
 
-These indexing stages (coordinating, primary, and replica) are sequential. However, the lifetime of each stage encompasses
-the lifetimes of subsequent stages in order to enable internal retries. The coordinating stage is not complete until each primary
-stage (which might be spread out across different primary shards) has completed. Each primary stage will not complete until the
-in-sync replicas have completed indexing the docs locally and responded to the replica requests.
+These indexing stages (coordinating, primary, and replica) are sequential. To enable internal retries, the lifetime of each stage
+encompasses the lifetime of each subsequent stage. For example, the coordinating stage is not complete until each primary
+stage, which may be spread out across different primary shards, has completed. Each primary stage will not complete until the
+in-sync replicas have finished indexing the docs locally and responded to the replica requests.
 
 [float]
 ===== Failure handling
diff --git a/docs/reference/index-modules/indexing-pressure.asciidoc b/docs/reference/index-modules/indexing-pressure.asciidoc
index b13833c727190..105d4102a5cfb 100644
--- a/docs/reference/index-modules/indexing-pressure.asciidoc
+++ b/docs/reference/index-modules/indexing-pressure.asciidoc
@@ -15,20 +15,22 @@ replication.
 Elasticsearch internally monitors indexing load. When the load exceeds
 certain limits, new indexing work will be rejected.
 
-[float]
+[discrete]
+[[indexing-stages]]
 === Indexing stages
 
 External indexing operations go through three stages: coordinating, primary, and
-replica. This write model is explained <<basic-write-model,here>>.
+replica. See <<basic-write-model,here>>.
 
-[float]
+[discrete]
+[[memory-limits]]
 === Memory limits
 
-Elasticsearch exposes a node setting `.memory.limitindexing_pressure` which
-restricts the number of bytes for outstanding indexing requests. By default,
-this setting is configured to be 10% of the heap.
+The `indexing_pressure.memory.limit` node setting restricts the number of bytes
+available for outstanding indexing requests. This setting defaults to 10% of
+the heap.
 
-At the beginning of each indexing stage, Elasticsearch accounts for the
+At the beginning of each indexing stage, {es} accounts for the
 bytes consumed by an indexing request. This accounting is only released at the
 end of the indexing stage. This means that upstream stages will account for the
 request overheard until all downstream stages are complete. For example, the
@@ -38,13 +40,13 @@ in-sync replica has responded to enable replica retries if necessary.
 
 A node will start rejecting new indexing work at the coordinating or primary
 stage when the number of outstanding coordinating, primary, and replica indexing
-bytes are greater than the configured limit.
+bytes exceeds the configured limit.
 
 A node will start rejecting new indexing work at the replica stage when the
-number of outstanding replica indexing bytes are greater than 1.5x the
-configured limit. This design means that as indexing pressure builds on nodes,
-they will naturally stop accepting coordinating and primary work in favor of
-outstanding replica work.
+number of outstanding replica indexing bytes exceeds 1.5x the configured limit.
+This design means that as indexing pressure builds on nodes, they will naturally
+stop accepting coordinating and primary work in favor of outstanding replica
+work.
 
 The default limit `indexing_pressure.memory.limit` (10%) is generously sized and
 should only be modified after careful consideration. Only indexing requests
@@ -53,13 +55,16 @@ contribute to this limit meaning that there is additional indexing overhead
 components of Elasticsearch also require memory. Configuring this limit to be
 too high can starve other system components of operating memory.
 
-[float]
+[discrete]
+[[indexing-pressure-monitoring]]
 === Monitoring
 
-Indexing pressure metrics are exposed by the
-<<cluster-nodes-stats-api-response-body-indexing-pressure,Node Stats API>>.
+You can use the
+<<cluster-nodes-stats-api-response-body-indexing-pressure,node stats API>> to
+retrieve indexing pressure metrics.
 
-[float]
+[discrete]
+[[indexing-pressure-settings]]
 === Indexing pressure settings
 
 `indexing_pressure`::

From 0daf58f57ba594a815771e362d7001b50e0152dd Mon Sep 17 00:00:00 2001
From: Tim Brooks <tim@uncontended.net>
Date: Mon, 20 Jul 2020 14:45:01 -0600
Subject: [PATCH 10/11] Changes

---
 docs/reference/cluster/nodes-stats.asciidoc   |  2 +-
 .../index-modules/indexing-pressure.asciidoc  | 23 +++++++++----------
 2 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/docs/reference/cluster/nodes-stats.asciidoc b/docs/reference/cluster/nodes-stats.asciidoc
index 28d937ec77f65..3d0c818f9d80c 100644
--- a/docs/reference/cluster/nodes-stats.asciidoc
+++ b/docs/reference/cluster/nodes-stats.asciidoc
@@ -2158,7 +2158,7 @@ Contains statistics for current indexing load.
 (integer)
 Bytes consumed by indexing requests in the coordinating or primary stage. This
 value is not the sum of coordinating_bytes and primary_bytes as a node can reuse
-the coordinating bytes if the primary is executed locally.
+the coordinating bytes if the primary stage is executed locally.
 
 `coordinating_bytes`::
 (integer)
diff --git a/docs/reference/index-modules/indexing-pressure.asciidoc b/docs/reference/index-modules/indexing-pressure.asciidoc
index 105d4102a5cfb..49b35fa394b04 100644
--- a/docs/reference/index-modules/indexing-pressure.asciidoc
+++ b/docs/reference/index-modules/indexing-pressure.asciidoc
@@ -20,7 +20,7 @@ certain limits, new indexing work will be rejected.
 === Indexing stages
 
 External indexing operations go through three stages: coordinating, primary, and
-replica. See <<basic-write-model,here>>.
+replica. See <<basic-write-model>>.
 
 [discrete]
 [[memory-limits]]
@@ -48,12 +48,12 @@ This design means that as indexing pressure builds on nodes, they will naturally
 stop accepting coordinating and primary work in favor of outstanding replica
 work.
 
-The default limit `indexing_pressure.memory.limit` (10%) is generously sized and
-should only be modified after careful consideration. Only indexing requests
-contribute to this limit meaning that there is additional indexing overhead
-(buffers, listeners, etc) which also require heap space. Finally, other
-components of Elasticsearch also require memory. Configuring this limit to be
-too high can starve other system components of operating memory.
+The `indexing_pressure.memory.limit` setting's 10% default limit is generously
+sized. You should only change it after careful consideration. Only indexing
+requests contribute to this limit. This means there is additional indexing
+overhead (buffers, listeners, etc.) which also requires heap space. Other
+components of {es} also require memory. Setting this limit too high can deny
+operating memory to other operations and components.
 
 [discrete]
 [[indexing-pressure-monitoring]]
@@ -68,8 +68,7 @@ retrieve indexing pressure metrics.
 === Indexing pressure settings
 
 `indexing_pressure`::
-
-  A configurable limit for the number of outstanding bytes consumed by indexing
-  requests. New coordinating and primary operations will start being rejected
-  once this limit is hit. New replica operations will start being rejected when
-  replica operations consumed 1.5X this limit. Defaults to `10%` of the heap.
+  Number of outstanding bytes that may be consumed by indexing requests. When
+  this limit is reached or exceeded, the node will reject new coordinating and
+  primary operations. When replica operations consume 1.5x this limit, the node
+  will reject new replica operations. Defaults to 10% of the heap.

From 692053a7e6a1edbcd7a5fb4c7add39b7f965ad51 Mon Sep 17 00:00:00 2001
From: Tim Brooks <tim@uncontended.net>
Date: Mon, 20 Jul 2020 16:48:00 -0600
Subject: [PATCH 11/11] Changes

---
 docs/reference/cluster/nodes-stats.asciidoc   |  2 +-
 .../index-modules/indexing-pressure.asciidoc  | 21 +++++++++----------
 2 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/docs/reference/cluster/nodes-stats.asciidoc b/docs/reference/cluster/nodes-stats.asciidoc
index 3d0c818f9d80c..0574ea91ff7e6 100644
--- a/docs/reference/cluster/nodes-stats.asciidoc
+++ b/docs/reference/cluster/nodes-stats.asciidoc
@@ -2121,7 +2121,7 @@ Contains statistics for the cumulative indexing load since the node started.
 (integer)
 Bytes consumed by indexing requests in the coordinating or primary stage. This
 value is not the sum of coordinating_bytes and primary_bytes as a node can reuse
-the coordinating bytes if the primary is executed locally.
+the coordinating bytes if the primary stage is executed locally.
 
 `coordinating_bytes`::
 (integer)
diff --git a/docs/reference/index-modules/indexing-pressure.asciidoc b/docs/reference/index-modules/indexing-pressure.asciidoc
index 49b35fa394b04..2e7124d057a37 100644
--- a/docs/reference/index-modules/indexing-pressure.asciidoc
+++ b/docs/reference/index-modules/indexing-pressure.asciidoc
@@ -1,19 +1,18 @@
 [[index-modules-indexing-pressure]]
 == Indexing pressure
 
-Indexing documents into Elasticsearch introduces system load in the form of
-memory and CPU load. Each indexing operation includes coordinating, primary, and
-replica stages. These stages can be performed across multiple nodes in the
-cluster. If too much indexing work is introduced into the system, the cluster
-can become saturated. This can adversely impact other components of the system
-such as searches, cluster coordination, background processing, etc.
+Indexing documents into {es} introduces system load in the form of memory and
+CPU load. Each indexing operation includes coordinating, primary, and replica
+stages. These stages can be performed across multiple nodes in a cluster.
 
-Indexing pressure is primarily generated by external operations such as indexing
-requests or internally by mechanisms such recoveries and cross-cluster
-replication.
+Indexing pressure can build up through external operations, such as indexing
+requests, or internal mechanisms, such as recoveries and {ccr}. If too much
+indexing work is introduced into the system, the cluster can become saturated.
+This can adversely impact other operations, such as search, cluster
+coordination, and background processing.
 
-Elasticsearch internally monitors indexing load. When the load exceeds
-certain limits, new indexing work will be rejected.
+To prevent these issues, {es} internally monitors indexing load. When the load
+exceeds certain limits, new indexing work is rejected.
 
 [discrete]
 [[indexing-stages]]