Commit 870199b

Author: Sam Kleinman
Commit message: DOCS-831: edits and reorg
1 parent c0f41a7 commit 870199b

1 file changed

+46
-35
lines changed

source/core/sharding-internals.txt

Lines changed: 46 additions & 35 deletions
@@ -531,41 +531,6 @@ a document with a ``msg`` field that holds the string
531531
If the application is instead connected to a :program:`mongod`, the
532532
returned document does not include the ``isdbgrid`` string.
533533

534-
Shard GridFS Documents
535-
----------------------
536-
537-
A common way to shard :term:`GridFS` is to configure the shard as follows:
538-
539-
- Do not shard the ``files`` collection, as the keys in this collection do
540-
not easily lend themselves to even distributions.
541-
542-
Leaving ``files`` unsharded means that all the file metadata documents
543-
live on one shard. It is recommended that the shard is a replica set
544-
with at least three members, for high availability.
545-
546-
- Shard the ``chunks`` collection using a new ``files_id : 1 , n : 1``
547-
index. You must create this index. Do not use the existing
548-
``files_id : 1 , n : 1`` index already created by the drivers.
549-
550-
The new ``files_id : 1 , n : 1`` index ensures that all chunks of a
551-
given file live on the same shard, which is safer and allows FileMD5
552-
hashing.
553-
554-
To shard the ``chunks`` collection by ``files_id : 1 , n : 1``, issue
555-
commands similar to the following:
556-
557-
.. code-block:: javascript
558-
559-
db.fs.chunks.ensureIndex( { files_id : 1 , n : 1 } )
560-
561-
db.runCommand( { shardcollection : "test.fs.chunks" , key : { files_id : 1 , n : 1 } } )
562-
563-
The default ``files_id`` is an :term:`ObjectId`. The ``files_id`` is
564-
ascending, and all GridFS chunks are sent to a single sharding chunk.
565-
If your write load is too high for a single server to handle, you may
566-
want to shard on a different key or use a different value for ``_id``
567-
in the ``files`` collection.
568-
569534
.. index:: config database
570535
.. index:: database, config
571536
.. _sharding-internals-config-database:
@@ -603,3 +568,49 @@ collections:
603568

604569
See :doc:`/reference/config-database` for full documentation of these
605570
collections and their role in sharded clusters.
571+
572+
Sharding GridFS Stores
573+
----------------------
574+
575+
When sharding a :term:`GridFS` store, consider the following:
576+
577+
- Most deployments will not need to shard the ``files``
578+
collection. The ``files`` collection is typically small, and only
579+
contains metadata. None of the required keys for GridFS lend
580+
themselves to an even distribution in a sharded cluster. If you
581+
*must* shard the ``files`` collection, use the ``_id`` field,
582+
possibly in combination with an application field.
583+
584+
Leaving ``files`` unsharded means that all the file metadata
585+
documents live on one shard. For production GridFS stores you *must*
586+
store the ``files`` collection on a replica set.
587+
588+
- To shard the ``chunks`` collection by ``{ files_id : 1 , n : 1 }``,
589+
issue commands similar to the following:
590+
591+
.. code-block:: javascript
592+
593+
db.fs.chunks.ensureIndex( { files_id : 1 , n : 1 } )
594+
595+
db.runCommand( { shardcollection : "test.fs.chunks" , key : { files_id : 1 , n : 1 } } )
596+
597+
You may also want to shard using just the ``files_id`` field, as in the
598+
following operation:
599+
600+
.. code-block:: javascript
601+
602+
db.runCommand( { shardcollection : "test.fs.chunks" , key : { files_id : 1 } } )
603+
604+
.. note::
605+
606+
.. versionchanged:: 2.2
607+
608+
Before 2.2, you had to create an additional index on ``files_id``
609+
to shard using *only* this field.
610+
611+
The default ``files_id`` value is an :term:`ObjectId`. As a result,
612+
the values of ``files_id`` are always ascending, and applications
613+
will insert all new GridFS data to a single chunk and shard. If
614+
your write load is too high for a single server to handle, consider
615+
a different shard key or use a different value for ``_id`` in the
616+
``files`` collection.
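
The write hotspot described in the final paragraph can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: shard key values are modeled as plain integers, the split points are hypothetical, and the names ``chunkFor`` and ``distribute`` are invented for this example. It does not model how MongoDB actually splits or migrates chunks.

```javascript
// Sketch: why ascending shard key values concentrate inserts on one chunk.
// Chunk ranges are modeled as sorted split points over an integer key space.
// Real clusters split and migrate chunks dynamically; this model ignores that.

// Return the index of the chunk whose range contains `key`.
// Chunk i covers [splits[i-1], splits[i]); the last chunk is open-ended.
function chunkFor(key, splits) {
  let i = 0;
  while (i < splits.length && key >= splits[i]) i++;
  return i;
}

// Count how many inserts land in each chunk.
function distribute(keys, splits) {
  const counts = new Array(splits.length + 1).fill(0);
  for (const k of keys) counts[chunkFor(k, splits)]++;
  return counts;
}

// Hypothetical split points, giving four chunks.
const splits = [100, 200, 300];

// Ascending keys, standing in for ObjectId-based files_id values
// generated after the last split point was chosen.
const ascending = Array.from({ length: 1000 }, (_, i) => 300 + i);

// Keys scattered over the key space, standing in for a non-monotonic
// _id scheme chosen by the application (an assumption for illustration).
const spread = Array.from({ length: 1000 }, (_, i) => (i * 37) % 400);

console.log("ascending:", distribute(ascending, splits));
console.log("spread:   ", distribute(spread, splits));
```

In this model every ascending key falls into the last, open-ended chunk, while the scattered keys touch all four chunks. That is the trade-off the paragraph points at when it suggests a different shard key or a different ``_id`` value for high write loads.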
