
Conversation

@cwurm (Contributor) commented Jul 5, 2017

Adding sections on:

  • Shard size (20-30 GB)
  • Disabling _source when possible
  • Force Merge
  • Shrink

More suggestions welcome.

@cwurm added the >docs (General docs changes) label on Jul 5, 2017
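As a sketch of the `_source` suggestion above: `_source` can be disabled in an index's mappings. This is a minimal, hypothetical example (the `metrics` index and `doc` type names are placeholders); note that with `_source` disabled, features that rely on it, such as reindexing and update requests, are no longer possible:

```console
PUT metrics
{
  "mappings": {
    "doc": {
      "_source": {
        "enabled": false
      }
    }
  }
}
```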
@jasontedor (Member) left a comment:

I left a comment.


[float]
=== Watch your shard size

Member:

I don't think we should specify a size here as it depends on too many factors. With fast replica recovery coming that is another mitigating factor (#22484) to one of the drawbacks that you mention.

Contributor Author:

@jasontedor Makes sense. Do you think we should mention an upper range, e.g. 50 GB?

Member:

It's a good question but I don't think so. We are still fighting the "30 GB" heap recommendation, too many people see that number and think it's the magical number where they should set their heap without enough consideration for all the factors involved. Instead, I think that the verbiage is good but we should avoid enshrining specific numbers.

@cwurm (Contributor Author) commented Jul 7, 2017

@jasontedor I've updated the shard size recommendation.

@jasontedor (Member) left a comment:

I left a few suggestions.


[float]
=== Force Merge

Member:

Perhaps turn this around a bit: Elasticsearch stores data in shards. Shards are Lucene indices and are composed of segments. Segments are the actual files on disk, etc.

Contributor Author:

Makes sense, how about: "Indices in Elasticsearch are stored in one or more shards. Each shard is a Lucene index and made up of one or more segments - the actual files on disk. Larger segments are more efficient for storing data. The <<indices-forcemerge,_forcemerge API>> can be [...]"

Member:

That sounds good to me.
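To make the force-merge discussion concrete, a minimal `_forcemerge` call on an index that is no longer being written to might look like this (the index name is a placeholder; `max_num_segments=1` merges each shard down to a single segment):

```console
POST my-index/_forcemerge?max_num_segments=1
```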

=== Watch your shard size

Larger shards are more efficient at storing data. To increase the size of your shards, you can <<indices-create-index,create indices>> with fewer primary shards, create fewer indices (e.g. by leveraging the <<indices-rollover-index,Rollover API>>), or modify an existing index using the <<indices-shrink-index,Shrink API>>.
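As a hypothetical sketch of the two APIs mentioned (all index and alias names are placeholders): the Shrink API copies an existing index into a new one with fewer primary shards (the source index must be read-only, with a copy of every shard on a single node), while the Rollover API switches an alias to a fresh index once a condition is met:

```console
POST my-index/_shrink/my-shrunk-index
{
  "settings": {
    "index.number_of_shards": 1
  }
}

POST my-alias/_rollover
{
  "conditions": {
    "max_docs": 100000000
  }
}
```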

Member:

I do wonder if a comment is in order here about how this applies to full recoveries.

Contributor Author:

Maybe, I'm not sure under which circumstances we'd do a full recovery. Can you suggest a wording?


Member:

Sorry, I've been on vacation. I will resume reviewing when I'm fully back tomorrow.

Member:

I don't think we need anything elaborate here, something like: "Keep in mind that large shard sizes come with drawbacks such as long full recovery times."

@cwurm (Contributor Author) commented Aug 7, 2017

@jasontedor Incorporated your suggestions, thanks a lot. How does it look?

@jasontedor (Member) left a comment:

LGTM.

@cwurm cwurm merged commit 0120448 into master Aug 21, 2017
@cwurm cwurm deleted the cwurm-docs-disk-usage branch August 21, 2017 19:08
cwurm added a commit that referenced this pull request Aug 23, 2017
cwurm added a commit that referenced this pull request Aug 24, 2017