Balancing thresholds DOCS-326 #61

Merged
merged 2 commits into from
Jul 12, 2012
10 changes: 5 additions & 5 deletions source/administration/sharding.txt
@@ -974,11 +974,11 @@ sense. Sharding works by migrating chunks between the shards until
each shard has roughly the same number of chunks.

The default chunk size is 64 megabytes. MongoDB will not begin
-migrations until the shard with the most chunks has 8 more chunks than
-the shard with the fewest chunks. While the default chunk size is
-configurable with the :setting:`chunkSize` setting, these behaviors
-help prevent unnecessary chunk migrations, which can degrade the
-performance of your cluster as a whole.
+migrations until the imbalance of chunks in the cluster exceeds the
+:ref:`migration threshold <sharding-migration-thresholds>`. While the
+default chunk size is configurable with the :setting:`chunkSize`
+setting, these behaviors help prevent unnecessary chunk migrations,
+which can degrade the performance of your cluster as a whole.

If you have just deployed a shard cluster, make sure that you have
enough data to make sharding effective. If you do not have sufficient
43 changes: 36 additions & 7 deletions source/core/sharding-internals.txt
@@ -263,12 +263,12 @@ Shard Key Indexes

All sharded collections **must** have an index that starts with the
:term:`shard key`. If you shard a collection that does not yet contain
-documents and *without* such an index, the :dbcommand:`shardCollection`
+documents and *without* such an index, the :dbcommand:`shardCollection`
will create an index on the shard key. If the collection already
contains documents, you must create an appropriate index before using
:dbcommand:`shardCollection`.

-.. TODO replace the link
+.. TODO replace the link

.. versionchanged:: 2.2
The index on the shard key no longer needs to be identical to the
@@ -304,7 +304,7 @@ and you want to replace this with an index on the field ``{ zipcode:

If you drop the last appropriate index for the shard key, recover
by recreating an index on just the shard key.

.. index:: balancing; internals
.. _sharding-balancing-internals:

@@ -333,15 +333,14 @@ chunks in a collection is unevenly distributed among the shards, the
balancer begins migrating :term:`chunks <chunk>` from shards with more
chunks to shards with fewer chunks. The balancer will
continue migrating chunks, one at a time, until the data is evenly
-distributed among the shards (i.e. the difference between any two
-shards is less than 2 chunks.)
+distributed among the shards.

While these automatic chunk migrations are crucial for distributing
data, they carry some overhead in terms of bandwidth and workload,
both of which can impact database performance. As a result, MongoDB
attempts to minimize the effect of balancing by only migrating chunks
-when the difference between the number of chunks on shards are greater
-than 8.
+when the distribution of chunks passes the :ref:`migration thresholds
+<sharding-migration-thresholds>`.

.. index:: balancing; migration

@@ -353,6 +352,36 @@ sends all new writes, to the "receiving" server. Finally,
:program:`mongos` updates the chunk record in the :term:`config
database` to reflect the new location of the chunk.
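
The balancing behavior described in this section (chunks move one at a time from the shard with the most chunks to the shard with the fewest) can be illustrated with a small simulation. This is only a sketch, not MongoDB's balancer code: the function name and shard names are hypothetical, and the stopping rule (a difference of fewer than two chunks between any two shards) is taken from the wording in this change set.

```python
from typing import Dict, List, Tuple


def run_balancing_round(
    chunks_per_shard: Dict[str, int],
) -> Tuple[Dict[str, int], List[Tuple[str, str]]]:
    """Simulate one balancing round (illustrative only).

    Moves a single chunk per step from the fullest shard to the emptiest
    shard and stops once the two differ by fewer than two chunks.
    """
    counts = dict(chunks_per_shard)  # work on a copy, leave the input untouched
    migrations: List[Tuple[str, str]] = []

    if len(counts) < 2:
        return counts, migrations  # nothing to balance with fewer than two shards

    while True:
        donor = max(counts, key=counts.get)
        recipient = min(counts, key=counts.get)
        if counts[donor] - counts[recipient] < 2:
            break  # distribution is considered even
        counts[donor] -= 1
        counts[recipient] += 1
        migrations.append((donor, recipient))  # one chunk per migration

    return counts, migrations


if __name__ == "__main__":
    final, moves = run_balancing_round({"shard0": 10, "shard1": 2, "shard2": 3})
    print(final, len(moves))
```

The sketch ignores everything that makes real migrations expensive (copying documents, updating the config database), which is the overhead the thresholds are meant to limit.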

+.. _sharding-migration-thresholds:
+
+Migration Thresholds
+~~~~~~~~~~~~~~~~~~~~
+
+.. versionchanged:: 2.2
+   The following thresholds appear first in 2.2; prior to this
+   release, balancing would only commence if the shard with the most
+   chunks had 8 more chunks than the shard with the least number of
+   chunks.
+
+In order to minimize the impact of balancing on the cluster, the
+:term:`balancer` will not begin balancing until the distribution of
+chunks has reached certain thresholds. These thresholds apply to the
+difference in number of :term:`chunks <chunk>` between the shard with
+the greatest number of chunks and the shard with the least number of
+chunks. The balancer has the following thresholds:
+
+================  ===================
+Number of Chunks  Migration Threshold
+----------------  -------------------
+Greater than 80   8
+80-21             4
+Less than 20      2
+================  ===================
+
+Once a balancing round starts, the balancer will not stop until the
+difference between the number of chunks on any two shards is *less
+than two.*
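
The threshold table can be read as a simple lookup followed by a comparison. The sketch below is illustrative rather than MongoDB's implementation, and it rests on three assumptions that the text does not settle: "Number of Chunks" is taken to mean the total chunk count across shards, a count of exactly 20 (which the table leaves unassigned) is placed in the lowest band, and balancing is assumed to start when the difference is at or above the threshold. The function names are hypothetical.

```python
from typing import Mapping


def migration_threshold(total_chunks: int) -> int:
    """Return the migration threshold for a given chunk count.

    Bands follow the table: more than 80 -> 8, 80 down to 21 -> 4,
    fewer than 20 -> 2. Exactly 20 is not covered by the table and is
    ASSUMED here to fall in the lowest band.
    """
    if total_chunks > 80:
        return 8
    if total_chunks >= 21:
        return 4
    return 2


def should_start_balancing(chunks_per_shard: Mapping[str, int]) -> bool:
    """Decide whether a balancing round would begin (illustrative only).

    Compares the gap between the fullest and emptiest shard with the
    threshold. ASSUMPTION: a gap equal to the threshold is enough.
    """
    counts = list(chunks_per_shard.values())
    if len(counts) < 2:
        return False  # a single shard has nothing to balance against
    spread = max(counts) - min(counts)
    return spread >= migration_threshold(sum(counts))


if __name__ == "__main__":
    print(should_start_balancing({"shard0": 10, "shard1": 8}))   # small collection
    print(should_start_balancing({"shard0": 30, "shard1": 27}))  # medium collection
```

Keeping the lookup separate from the comparison makes it easy to adjust either assumption (the band boundaries or the "at or above" rule) without touching the other.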

.. index:: sharding; chunk size
.. _sharding-chunk-size:

30 changes: 15 additions & 15 deletions source/core/sharding.txt
@@ -191,18 +191,17 @@ Data
Your cluster must manage a significant quantity of data for sharding
to have an effect on your collection. The default :term:`chunk` size
is 64 megabytes, [#chunk-size]_ and the :ref:`balancer
-<sharding-balancing>` will not begin moving data until the shard with
-the greatest number of chunks has *8 more* chunks than the shard with
-least number of chunks.
-
-Practically, this means that unless there is 512 megabytes of data,
-all of the data will remain on the same shard. You can set a smaller
-chunk size, or :ref:`manually create splits in your collection
-<sharding-procedure-create-split>` using the :func:`sh.splitFind()` and
-:func:`sh.splitAt()` operations in the :program:`mongo` shell.
-Remember that the default chunk size and migration threshold
-are explicitly configured to prevent unnecessary splitting or
-migrations.
+<sharding-balancing>` will not begin moving data until the imbalance
+of chunks in the cluster exceeds the :ref:`migration threshold
+<sharding-migration-thresholds>`.
+
+Practically, this means that unless your cluster has enough data,
+chunks will remain on the same shard. You can set a smaller chunk
+size, or :ref:`manually create splits in your collection
+<sharding-procedure-create-split>` using the :func:`sh.splitFind()`
+and :func:`sh.splitAt()` operations in the :program:`mongo` shell.
+Remember that the default chunk size and migration threshold are
+explicitly configured to prevent unnecessary splitting or migrations.
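
How much data counts as "enough" can be estimated with simple arithmetic. The following is a rough sketch under stated assumptions: every chunk is filled to the configured chunk size, all chunks start on one shard, and the function name is hypothetical. With the default 64 megabyte chunk size and a threshold of 8 chunks it reproduces the 512 megabyte figure quoted in the earlier wording of this paragraph.

```python
DEFAULT_CHUNK_SIZE_MB = 64  # default chunk size stated in the documentation


def approx_data_before_migration_mb(
    threshold_chunks: int, chunk_size_mb: int = DEFAULT_CHUNK_SIZE_MB
) -> int:
    """Rough amount of data (in MB) needed before the balancer would act.

    ASSUMPTION: chunks are completely full and all live on a single shard,
    so the chunk-count gap equals the number of chunks. Real chunks are
    usually smaller, so treat the result as an upper-end estimate.
    """
    return threshold_chunks * chunk_size_mb


if __name__ == "__main__":
    print(approx_data_before_migration_mb(8))  # 8 chunks * 64 MB = 512
    print(approx_data_before_migration_mb(2))  # 2 chunks * 64 MB = 128
```

A smaller chunk size lowers the estimate proportionally, which is why reducing it or pre-splitting lets a small collection spread across shards sooner.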

While there are some exceptional situations where you may need to
shard a small collection of data, most of the time the additional
@@ -453,9 +452,10 @@ have on the cluster, by:

- Only moving one chunk at a time.

-- Only initiating a balancing round if there is a difference of *more
-  than* 8 chunks between the shard with the greatest and the shard with
-  the least number of chunks.
+- Only initiating a balancing round when the difference in number of
+  chunks between the shard with the greatest and the shard with the
+  least number of chunks exceeds the :ref:`migration threshold
+  <sharding-migration-thresholds>`.

Additionally, it's possible to disable the balancer on a temporary
basis for maintenance and limit the window during which it runs to