From f9dd0933026a17080497f30a6b6b567eaa7fa60d Mon Sep 17 00:00:00 2001
From: Armin Braun
Date: Mon, 9 May 2022 11:29:22 +0200
Subject: [PATCH 01/15] Update shards per resource guidance

This guidance does not apply any longer. The overhead per shard has been
significantly reduced in recent versions, and the removed rule of thumb
would be too pessimistic in many if not most cases and might be too
optimistic in other specific ones.

=> Replace the guidance with a rule of thumb based on field count for
data nodes and a rule of thumb based on index count (which is far more
relevant nowadays than shard count) for master nodes.

relates #77466
---
 .../how-to/size-your-shards.asciidoc | 50 ++++++++----------
 1 file changed, 20 insertions(+), 30 deletions(-)

diff --git a/docs/reference/how-to/size-your-shards.asciidoc b/docs/reference/how-to/size-your-shards.asciidoc
index f3705530c2a6b..387e7697f5465 100644
--- a/docs/reference/how-to/size-your-shards.asciidoc
+++ b/docs/reference/how-to/size-your-shards.asciidoc
@@ -68,7 +68,17 @@ This decreases the number of segments, which means less metadata is kept in
 heap memory.
 
 Every mapped field also carries some overhead in terms of memory usage and disk
-space. By default {es} will automatically create a mapping for every field in
+space. While the exact amount resource usage of each additional field depends
+on the type of field, an additional 512 bytes in memory use for each additional
+mapped field on data nodes is a good approximation in most cases.
+As a result, the more fields the mappings in your indices contain, the fewer
+indices and shards will each individual data node be able to hold,
+all other things being equal. In addition to the per field overhead you should
+at least have an additional 512Mb of heap available per data node.
+For example, holding 1000 indices that each contain 4000 fields will require
+1000 x 4000 x 512b = 2Gb for the fields and another 512Mb for a total of
+at least 2.5Gb on the data node.
+By default {es} will automatically create a mapping for every field in
 every document it indexes, but you can switch off this behaviour to
 <>.
 
@@ -175,35 +185,15 @@ index prirep shard store
 
 [discrete]
 [[shard-count-recommendation]]
-==== Aim for 20 shards or fewer per GB of heap memory
-
-The number of shards a data node can hold is proportional to the node's heap
-memory. For example, a node with 30GB of heap memory should have at most 600
-shards. The further below this limit you can keep your nodes, the better. If
-you find your nodes exceeding more than 20 shards per GB, consider adding
-another node.
-
-Some system indices for {enterprise-search-ref}/index.html[Enterprise Search]
-are nearly empty and rarely used. Due to their low overhead, you shouldn't
-count shards for these indices toward a node's shard limit.
-
-To check the current size of each node's heap, use the <>.
-
-[source,console]
-----
-GET _cat/nodes?v=true&h=heap.current
-----
-// TEST[setup:my_index]
-
-You can use the <> to check the number of shards per
-node.
-
-[source,console]
-----
-GET _cat/shards?v=true
-----
-// TEST[setup:my_index]
+==== Aim for 3000 indices or fewer per GB of heap memory on master nodes
+
+The number of indices a master node can manage is proportional to the node's
+heap memory. The exact amount of heap memory each additional index requires
+depends on various factors. The include but are not limited to the size of its mapping,
+the number of shards per index or whether its mapping is shared with other indices.
+A good rule of thumb is to aim for 3000 indices or fewer per GB of heap on master nodes.
+For example, if your cluster contains 12,000 indices and each master node should have
+at least 4GB of heap available.
 
 [discrete]
 [[avoid-node-hotspots]]

From a15ef7d787f2ee723fcac65cf17cd452a150d8bd Mon Sep 17 00:00:00 2001
From: Armin Braun
Date: Mon, 9 May 2022 12:53:22 +0200
Subject: [PATCH 02/15] CR comments

---
 .idea/inspectionProfiles/Project_Default.xml |  1 +
 .../how-to/size-your-shards.asciidoc         | 40 ++++++++++++++-----
 2 files changed, 31 insertions(+), 10 deletions(-)

diff --git a/.idea/inspectionProfiles/Project_Default.xml b/.idea/inspectionProfiles/Project_Default.xml
index 3f3eb5218afed..ce63d84e7cf3d 100644
--- a/.idea/inspectionProfiles/Project_Default.xml
+++ b/.idea/inspectionProfiles/Project_Default.xml
@@ -5,6 +5,7 @@
+
 
\ No newline at end of file
diff --git a/docs/reference/how-to/size-your-shards.asciidoc b/docs/reference/how-to/size-your-shards.asciidoc
index 387e7697f5465..4f8dbd7bd7620 100644
--- a/docs/reference/how-to/size-your-shards.asciidoc
+++ b/docs/reference/how-to/size-your-shards.asciidoc
@@ -69,15 +69,15 @@ heap memory.
 
 Every mapped field also carries some overhead in terms of memory usage and disk
 space. While the exact amount resource usage of each additional field depends
-on the type of field, an additional 512 bytes in memory use for each additional
+on the type of field, an additional 1024B in memory use for each additional
 mapped field on data nodes is a good approximation in most cases.
-As a result, the more fields the mappings in your indices contain, the fewer
+As a result, the more mapped fields you have per index, the fewer
 indices and shards will each individual data node be able to hold,
 all other things being equal. In addition to the per field overhead you should
-at least have an additional 512Mb of heap available per data node.
+at least have an additional 512MB of heap available per data node.
 For example, holding 1000 indices that each contain 4000 fields will require
-1000 x 4000 x 512b = 2Gb for the fields and another 512Mb for a total of
-at least 2.5Gb on the data node.
+1000 x 4000 x 1024b = 4Gb for the fields and another 512Mb for a total of
+at least 4.5Gb on the data node.
 By default {es} will automatically create a mapping for every field in
 every document it indexes, but you can switch off this behaviour to
 <>.
@@ -189,11 +189,31 @@
 The number of indices a master node can manage is proportional to the node's
 heap memory. The exact amount of heap memory each additional index requires
-depends on various factors. The include but are not limited to the size of its mapping,
-the number of shards per index or whether its mapping is shared with other indices.
-A good rule of thumb is to aim for 3000 indices or fewer per GB of heap on master nodes.
-For example, if your cluster contains 12,000 indices and each master node should have
-at least 4GB of heap available.
+depends on various factors. Theses include but are not limited to the size of its mapping,
+the number of shards per index or whether indices are created identically from the same
+template with no dynamic mapping involved.
+Under the assumption that indices are mostly created identically from a small number of templates
+and no dynamic mapping updates are involved, a good rule of thumb is to aim for 3000 indices or
+fewer per GB of heap on master nodes. For example, if your cluster contains 12,000 indices then
+each dedicated master node should have at least 4GB of heap available.
+
+To check the current size of each node's heap, use the <>.
+
+[source,console]
+----
+GET _cat/nodes?v=true&h=heap.current
+----
+// TEST[setup:my_index]
+
+You can use the <> to check the number of shards per
+node.
+
+[source,console]
+----
+GET _cat/shards?v=true
+----
+// TEST[setup:my_index]
 
 [discrete]
 [[avoid-node-hotspots]]

From 70c1491356e2a081e42ec79ad1f544e3d260dcbf Mon Sep 17 00:00:00 2001
From: Armin Braun
Date: Mon, 9 May 2022 13:06:14 +0200
Subject: [PATCH 03/15] non dedicated master comment

---
 docs/reference/how-to/size-your-shards.asciidoc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/reference/how-to/size-your-shards.asciidoc b/docs/reference/how-to/size-your-shards.asciidoc
index 4f8dbd7bd7620..a4f42749aa25a 100644
--- a/docs/reference/how-to/size-your-shards.asciidoc
+++ b/docs/reference/how-to/size-your-shards.asciidoc
@@ -195,7 +195,8 @@ template with no dynamic mapping involved.
 Under the assumption that indices are mostly created identically from a small number of templates
 and no dynamic mapping updates are involved, a good rule of thumb is to aim for 3000 indices or
 fewer per GB of heap on master nodes. For example, if your cluster contains 12,000 indices then
-each dedicated master node should have at least 4GB of heap available.
+each dedicated master node should have at least 4GB of heap available. For non-dedicated master
+nodes, the same rule holds and should be added to the heap requirements of other node roles.
 
 To check the current size of each node's heap, use the <>.
From 869ffc91febbd85b6c47ac33106153b638a12a32 Mon Sep 17 00:00:00 2001
From: Armin Braun
Date: Mon, 9 May 2022 19:00:10 +0200
Subject: [PATCH 04/15] CR: david

---
 .idea/inspectionProfiles/Project_Default.xml    |  3 +--
 docs/reference/how-to/size-your-shards.asciidoc | 14 ++++++++------
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/.idea/inspectionProfiles/Project_Default.xml b/.idea/inspectionProfiles/Project_Default.xml
index ce63d84e7cf3d..67dec301388da 100644
--- a/.idea/inspectionProfiles/Project_Default.xml
+++ b/.idea/inspectionProfiles/Project_Default.xml
@@ -5,7 +5,6 @@
-
-
\ No newline at end of file
+
diff --git a/docs/reference/how-to/size-your-shards.asciidoc b/docs/reference/how-to/size-your-shards.asciidoc
index a4f42749aa25a..2c9b1ec3c04d6 100644
--- a/docs/reference/how-to/size-your-shards.asciidoc
+++ b/docs/reference/how-to/size-your-shards.asciidoc
@@ -72,12 +72,12 @@ space. While the exact amount resource usage of each additional field depends
 on the type of field, an additional 1024B in memory use for each additional
 mapped field on data nodes is a good approximation in most cases.
 As a result, the more mapped fields you have per index, the fewer
-indices and shards will each individual data node be able to hold,
+indices and shards each individual data node will be able to hold,
 all other things being equal. In addition to the per field overhead you should
 at least have an additional 512MB of heap available per data node.
 For example, holding 1000 indices that each contain 4000 fields will require
-1000 x 4000 x 1024b = 4Gb for the fields and another 512Mb for a total of
-at least 4.5Gb on the data node.
+1000 × 4000 × 1024B = 4Gb for the fields and another 512MB for a total of
+at least 4.5GB on the data node.
 By default {es} will automatically create a mapping for every field in
 every document it indexes, but you can switch off this behaviour to
 <>.
@@ -189,9 +189,11 @@
 The number of indices a master node can manage is proportional to the node's
 heap memory. The exact amount of heap memory each additional index requires
-depends on various factors. Theses include but are not limited to the size of its mapping,
-the number of shards per index or whether indices are created identically from the same
-template with no dynamic mapping involved.
+depends on various factors such as the size of the mapping and the number of
+shards per index. Where possible, the master node will deduplicate the mapping
+metadata across indices with identical mappings. Mapping metadata can be
+deduplicated across indices that are created from the same index template as
+long as they do not use dynamic mapping updates.
 Under the assumption that indices are mostly created identically from a small number of templates
 and no dynamic mapping updates are involved, a good rule of thumb is to aim for 3000 indices or
 fewer per GB of heap on master nodes.
 For example, if your cluster contains 12,000 indices then

From dcb50675b6c3b16460757e2e78be29e2c1a6e38e Mon Sep 17 00:00:00 2001
From: Armin Braun
Date: Mon, 9 May 2022 19:01:10 +0200
Subject: [PATCH 05/15] meh wl

---
 .idea/inspectionProfiles/Project_Default.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.idea/inspectionProfiles/Project_Default.xml b/.idea/inspectionProfiles/Project_Default.xml
index 67dec301388da..3f3eb5218afed 100644
--- a/.idea/inspectionProfiles/Project_Default.xml
+++ b/.idea/inspectionProfiles/Project_Default.xml
@@ -7,4 +7,4 @@
-
+
\ No newline at end of file

From 9e460a86a9852bca63d432022d84b7a1217381d0 Mon Sep 17 00:00:00 2001
From: Armin Braun
Date: Mon, 9 May 2022 19:02:39 +0200
Subject: [PATCH 06/15] small b'

---
 docs/reference/how-to/size-your-shards.asciidoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/reference/how-to/size-your-shards.asciidoc b/docs/reference/how-to/size-your-shards.asciidoc
index 2c9b1ec3c04d6..16eda9c0399bb 100644
--- a/docs/reference/how-to/size-your-shards.asciidoc
+++ b/docs/reference/how-to/size-your-shards.asciidoc
@@ -76,7 +76,7 @@ indices and shards each individual data node will be able to hold,
 all other things being equal. In addition to the per field overhead you should
 at least have an additional 512MB of heap available per data node.
 For example, holding 1000 indices that each contain 4000 fields will require
-1000 × 4000 × 1024B = 4Gb for the fields and another 512MB for a total of
+1000 × 4000 × 1024B = 4GB for the fields and another 512MB for a total of
 at least 4.5GB on the data node.
 By default {es} will automatically create a mapping for every field in
 every document it indexes, but you can switch off this behaviour to

From 936d9da75177a89aac5ff984e6767be13dde2e7c Mon Sep 17 00:00:00 2001
From: Armin Braun
Date: Tue, 10 May 2022 18:35:54 +0200
Subject: [PATCH 07/15] CR: adjustments

---
 docs/reference/how-to/size-your-shards.asciidoc | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/docs/reference/how-to/size-your-shards.asciidoc b/docs/reference/how-to/size-your-shards.asciidoc
index 16eda9c0399bb..ea04ea02777d3 100644
--- a/docs/reference/how-to/size-your-shards.asciidoc
+++ b/docs/reference/how-to/size-your-shards.asciidoc
@@ -69,15 +69,18 @@ heap memory.
 
 Every mapped field also carries some overhead in terms of memory usage and disk
 space. While the exact amount resource usage of each additional field depends
-on the type of field, an additional 1024B in memory use for each additional
+on the type of field, an additional 1KB in memory use for each additional
 mapped field on data nodes is a good approximation in most cases.
 As a result, the more mapped fields you have per index, the fewer
 indices and shards each individual data node will be able to hold,
 all other things being equal. In addition to the per field overhead you should
-at least have an additional 512MB of heap available per data node.
+have heap available for the base Elasticsearch heap usage as well as the heap
+usage of the workload. For a small workload, 512MB of heap in addition to the
+per field overhead suffice in many cases. For tight workloads, it is possible to
+go even lower.
 For example, holding 1000 indices that each contain 4000 fields will require
-1000 × 4000 × 1024B = 4GB for the fields and another 512MB for a total of
-at least 4.5GB on the data node.
+1000 × 4000 × 1024B = 4GB for the fields and another 512MB for a total of
+at least 4.5GB heap on the data node.
 By default {es} will automatically create a mapping for every field in
 every document it indexes, but you can switch off this behaviour to
From af59baabc5cc436582b3f5378187d9866908c493 Mon Sep 17 00:00:00 2001
From: Armin Braun
Date: Wed, 11 May 2022 10:16:36 +0200
Subject: [PATCH 08/15] fix double space

---
 docs/reference/how-to/size-your-shards.asciidoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/reference/how-to/size-your-shards.asciidoc b/docs/reference/how-to/size-your-shards.asciidoc
index ea04ea02777d3..3a8dd6a7d6fb7 100644
--- a/docs/reference/how-to/size-your-shards.asciidoc
+++ b/docs/reference/how-to/size-your-shards.asciidoc
@@ -69,7 +69,7 @@ heap memory.
 
 Every mapped field also carries some overhead in terms of memory usage and disk
 space. While the exact amount resource usage of each additional field depends
-on the type of field, an additional 1KB in memory use for  each additional
+on the type of field, an additional 1KB in memory use for each additional
 mapped field on data nodes is a good approximation in most cases.
 As a result, the more mapped fields you have per index, the fewer

From f3428c6e1674f947d062fafee00f7351ecf4a050 Mon Sep 17 00:00:00 2001
From: Armin Braun
Date: Wed, 11 May 2022 11:39:38 +0200
Subject: [PATCH 09/15] Update docs/reference/how-to/size-your-shards.asciidoc

Co-authored-by: David Turner
---
 docs/reference/how-to/size-your-shards.asciidoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/reference/how-to/size-your-shards.asciidoc b/docs/reference/how-to/size-your-shards.asciidoc
index 3a8dd6a7d6fb7..08602cb543fec 100644
--- a/docs/reference/how-to/size-your-shards.asciidoc
+++ b/docs/reference/how-to/size-your-shards.asciidoc
@@ -69,7 +69,7 @@ heap memory.
 
 Every mapped field also carries some overhead in terms of memory usage and disk
 space. While the exact amount resource usage of each additional field depends
-on the type of field, an additional 1KB in memory use for each additional
+on the type of field, an additional 1kB in memory use for each additional
 mapped field on data nodes is a good approximation in most cases.

From cca3ac9c1aa0299c8f07f19b9fb21e975eb3f348 Mon Sep 17 00:00:00 2001
From: Armin Braun
Date: Wed, 11 May 2022 11:39:43 +0200
Subject: [PATCH 10/15] Update docs/reference/how-to/size-your-shards.asciidoc

Co-authored-by: David Turner
---
 docs/reference/how-to/size-your-shards.asciidoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/reference/how-to/size-your-shards.asciidoc b/docs/reference/how-to/size-your-shards.asciidoc
index 08602cb543fec..4366049d27f38 100644
--- a/docs/reference/how-to/size-your-shards.asciidoc
+++ b/docs/reference/how-to/size-your-shards.asciidoc
@@ -76,7 +76,7 @@ indices and shards each individual data node will be able to hold,
 all other things being equal. In addition to the per field overhead you should
 have heap available for the base Elasticsearch heap usage as well as the heap
 usage of the workload. For a small workload, 512MB of heap in addition to the
-per field overhead suffice in many cases. For tight workloads, it is possible to
+per field overhead suffice in many cases. For light workloads, it may be possible to
 go even lower.
 For example, holding 1000 indices that each contain 4000 fields will require
 1000 × 4000 × 1024B = 4GB for the fields and another 512MB for a total of
 at least 4.5GB heap on the data node.

From b4f0e0f3ebfa561db954f6f345ae3d86ae3e0917 Mon Sep 17 00:00:00 2001
From: David Turner
Date: Wed, 25 May 2022 09:28:51 +0100
Subject: [PATCH 11/15] Wordsmithery

---
 .../how-to/size-your-shards.asciidoc | 60 ++++++++++---------
 1 file changed, 33 insertions(+), 27 deletions(-)

diff --git a/docs/reference/how-to/size-your-shards.asciidoc b/docs/reference/how-to/size-your-shards.asciidoc
index 4366049d27f38..80495d395ee13 100644
--- a/docs/reference/how-to/size-your-shards.asciidoc
+++ b/docs/reference/how-to/size-your-shards.asciidoc
@@ -55,7 +55,7 @@ thread pool>>. This can result in low throughput and slow search speeds.
 
 [discrete]
 [[each-shard-has-overhead]]
-==== Each index and shard has overhead
+==== Each index, shard and field has overhead
 
 Every index and every shard requires some memory and CPU resources. In most
 cases, a small set of large shards uses fewer resources than many small shards.
@@ -68,20 +68,7 @@ This decreases the number of segments, which means less metadata is kept in
 heap memory.
 
 Every mapped field also carries some overhead in terms of memory usage and disk
-space. While the exact amount resource usage of each additional field depends
-on the type of field, an additional 1kB in memory use for each additional
-mapped field on data nodes is a good approximation in most cases.
-As a result, the more mapped fields you have per index, the fewer
-indices and shards each individual data node will be able to hold,
-all other things being equal. In addition to the per field overhead you should
-have heap available for the base Elasticsearch heap usage as well as the heap
-usage of the workload. For a small workload, 512MB of heap in addition to the
-per field overhead suffice in many cases. For light workloads, it may be possible to
-go even lower.
-For example, holding 1000 indices that each contain 4000 fields will require
-1000 × 4000 × 1024B = 4GB for the fields and another 512MB for a total of
-at least 4.5GB heap on the data node.
-By default {es} will automatically create a mapping for every field in
+space. By default {es} will automatically create a mapping for every field in
 every document it indexes, but you can switch off this behaviour to
 <>.
 
@@ -190,18 +177,21 @@
 [[shard-count-recommendation]]
 ==== Aim for 3000 indices or fewer per GB of heap memory on master nodes
 
-The number of indices a master node can manage is proportional to the node's
-heap memory. The exact amount of heap memory each additional index requires
-depends on various factors such as the size of the mapping and the number of
-shards per index. Where possible, the master node will deduplicate the mapping
-metadata across indices with identical mappings. Mapping metadata can be
-deduplicated across indices that are created from the same index template as
-long as they do not use dynamic mapping updates.
-Under the assumption that indices are mostly created identically from a small number of templates
-and no dynamic mapping updates are involved, a good rule of thumb is to aim for 3000 indices or
-fewer per GB of heap on master nodes. For example, if your cluster contains 12,000 indices then
-each dedicated master node should have at least 4GB of heap available. For non-dedicated master
-nodes, the same rule holds and should be added to the heap requirements of other node roles.
+The number of indices a master node can manage is proportional to its heap
+size. The exact amount of heap memory needed for each index depends on various
+factors such as the size of the mapping and the number of shards per index.
+
+Where possible, the master node will deduplicate the mapping metadata across
+indices with identical mappings. Mapping metadata will be deduplicated across
+indices which are created from the same index template and
+<>.
+
+If your indices are are mostly created from a small number of templates and do
+not use dynamic mapping updates then aim for 3000 indices or fewer per GB of
+heap on master nodes. For example, if your cluster contains 12000 indices then
+each dedicated master node should have at least 4GB of heap. For non-dedicated
+master nodes, the same rule holds and should be added to the heap requirements
+of the other roles of each node.
 
 To check the current size of each node's heap, use the <>.
 
@@ -221,6 +211,22 @@ GET _cat/shards?v=true
 ----
 // TEST[setup:my_index]
 
+[discrete]
+[[field-count-recommendation]]
+==== Allow 1kB of heap per field per index on data nodes, plus overheads
+
+The exact resource usage of each mapped field depends on its type, but in many
+cases you should allow approximately 1kB of heap overhead per mapped field per
+index held by each data node. You must also allow enough heap for {es}'s
+baseline usage as well as your workload such as indexing, searches and
+aggregations. 0.5GB of extra heap will suffice for many reasonable workloads,
+and you may need even less if your workload is very light.
+
+For example, if a data node holds shards from 1000 indices, each containing
+4000 mapped fields, then you should allow approximately 1000 × 4000 × 1kB = 4GB
+of heap for the fields and another 0.5GB of heap for its workload and other
+overheads, and therefore this node will need a heap size of at least 4.5GB.
+
 [discrete]
 [[avoid-node-hotspots]]
 ==== Avoid node hotspots

From d8b6484ac2a7526414df61b14ed968741feb280a Mon Sep 17 00:00:00 2001
From: Armin Braun
Date: Wed, 1 Jun 2022 15:13:20 +0200
Subject: [PATCH 12/15] Update docs/reference/how-to/size-your-shards.asciidoc

Co-authored-by: Henning Andersen <33268011+henningandersen@users.noreply.github.com>
---
 docs/reference/how-to/size-your-shards.asciidoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/reference/how-to/size-your-shards.asciidoc b/docs/reference/how-to/size-your-shards.asciidoc
index 80495d395ee13..5a47312f1024a 100644
--- a/docs/reference/how-to/size-your-shards.asciidoc
+++ b/docs/reference/how-to/size-your-shards.asciidoc
@@ -175,7 +175,7 @@
 [discrete]
 [[shard-count-recommendation]]
-==== Aim for 3000 indices or fewer per GB of heap memory on master nodes
+==== Aim for 3000 indices or fewer per GB of heap memory on each master node
 
 The number of indices a master node can manage is proportional to its heap
 size. The exact amount of heap memory needed for each index depends on various
 factors such as the size of the mapping and the number of shards per index.

From ad6af0fc3e2471710679373860a905af5eb24052 Mon Sep 17 00:00:00 2001
From: Armin Braun
Date: Wed, 1 Jun 2022 15:26:01 +0200
Subject: [PATCH 13/15] CR: comments

---
 docs/reference/how-to/size-your-shards.asciidoc | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/reference/how-to/size-your-shards.asciidoc b/docs/reference/how-to/size-your-shards.asciidoc
index 90c4abb3f6065..3584e681faa9b 100644
--- a/docs/reference/how-to/size-your-shards.asciidoc
+++ b/docs/reference/how-to/size-your-shards.asciidoc
@@ -186,7 +186,7 @@ indices with identical mappings. Mapping metadata will be deduplicated across
 indices which are created from the same index template and
 <>.
 
-If your indices are are mostly created from a small number of templates and do
+If your indices are mostly created from a small number of templates and do
 not use dynamic mapping updates then aim for 3000 indices or fewer per GB of
 heap on master nodes. For example, if your cluster contains 12000 indices then
 each dedicated master node should have at least 4GB of heap. For non-dedicated
@@ -215,9 +215,9 @@
 [[field-count-recommendation]]
 ==== Allow 1kB of heap per field per index on data nodes, plus overheads
 
-The exact resource usage of each mapped field depends on its type, but in many
-cases you should allow approximately 1kB of heap overhead per mapped field per
-index held by each data node. You must also allow enough heap for {es}'s
+The exact resource usage of each mapped field depends on its type, but a rule
+of thumb is to allow for approximately 1kB of heap overhead per mapped field
+per index held by each data node. You must also allow enough heap for {es}'s
 baseline usage as well as your workload such as indexing, searches and
 aggregations. 0.5GB of extra heap will suffice for many reasonable workloads,
 and you may need even less if your workload is very light.

From 43d0bc611872a00ab8ff80fe6e44b25fab8a031a Mon Sep 17 00:00:00 2001
From: Armin Braun
Date: Thu, 2 Jun 2022 14:54:19 +0200
Subject: [PATCH 14/15] add require more phrase

---
 docs/reference/how-to/size-your-shards.asciidoc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/reference/how-to/size-your-shards.asciidoc b/docs/reference/how-to/size-your-shards.asciidoc
index 3584e681faa9b..5b03ee2b67be1 100644
--- a/docs/reference/how-to/size-your-shards.asciidoc
+++ b/docs/reference/how-to/size-your-shards.asciidoc
@@ -220,7 +220,8 @@ of thumb is to allow for approximately 1kB of heap overhead per mapped field
 per index held by each data node. You must also allow enough heap for {es}'s
 baseline usage as well as your workload such as indexing, searches and
 aggregations. 0.5GB of extra heap will suffice for many reasonable workloads,
-and you may need even less if your workload is very light.
+and you may need even less if your workload is very light while heavy workloads
+may require more.
 
 For example, if a data node holds shards from 1000 indices, each containing
 4000 mapped fields, then you should allow approximately 1000 × 4000 × 1kB = 4GB
 of heap for the fields and another 0.5GB of heap for its workload and other

From 588bc870eabf16c98afbed46d02f34e943a4e410 Mon Sep 17 00:00:00 2001
From: Armin Braun
Date: Thu, 9 Jun 2022 14:57:38 +0200
Subject: [PATCH 15/15] remove duplicate mapping details

---
 docs/reference/how-to/size-your-shards.asciidoc | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/docs/reference/how-to/size-your-shards.asciidoc b/docs/reference/how-to/size-your-shards.asciidoc
index 5b03ee2b67be1..c06986d405f9b 100644
--- a/docs/reference/how-to/size-your-shards.asciidoc
+++ b/docs/reference/how-to/size-your-shards.asciidoc
@@ -181,13 +181,7 @@ The number of indices a master node can manage is proportional to its heap
 size. The exact amount of heap memory needed for each index depends on various
 factors such as the size of the mapping and the number of shards per index.
 
-Where possible, the master node will deduplicate the mapping metadata across
-indices with identical mappings. Mapping metadata will be deduplicated across
-indices which are created from the same index template and
-<>.
-
-If your indices are mostly created from a small number of templates and do
-not use dynamic mapping updates then aim for 3000 indices or fewer per GB of
+As a general rule of thumb, you should aim for 3000 indices or fewer per GB of
 heap on master nodes. For example, if your cluster contains 12000 indices then
 each dedicated master node should have at least 4GB of heap. For non-dedicated
 master nodes, the same rule holds and should be added to the heap requirements