Commit 2c1f9ff

DOCS-149 final FAQ review.
1 parent da4f654 commit 2c1f9ff


source/faq/sharding.rst

Lines changed: 78 additions & 81 deletions
@@ -58,8 +58,6 @@ MongoDB will assign various ranges of collection data to the different
 shards in the cluster. The cluster will correct imbalances between shards
 by migrating ranges of data from one shard to another.
 
-TODO: continue from here.
-
 What happens if a client updates a document in a chunk during a migration?
 --------------------------------------------------------------------------

@@ -73,118 +71,117 @@ What happens to queries if a shard is inaccessible or slow?
 -----------------------------------------------------------
 
 If a :term:`shard` is inaccessible or unavailable, queries will return
-with an error, query will return an error unless the client sets the
-"Partial" query option. Conversely, if a shard is responding slowly,
-:program:`mongos` will wait for the shard to return results.
+with an error.
+
+However, a client may set the ``partial`` query bit, which will then
+return results from all available shards, regardless of whether a
+given shard is unavailable.
+
+If a shard is responding slowly,
+:program:`mongos` will merely wait for the shard to return results.
 
-:program:`mongos` does not return partial results unless specifically
-configured.
 
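
The behavior above can be modeled in a few lines. The following is a hypothetical Python sketch (not driver or :program:`mongos` code; the shards and the partial flag are simulated in-process) of how a router errors on an unreachable shard unless partial results are allowed:

```python
# Toy model of the ``partial`` option described above. Shards are plain
# dicts; nothing here talks to a real MongoDB deployment.

class ShardDown(Exception):
    """Raised when a simulated shard is unreachable."""

def query_shard(shard):
    if shard.get("down"):
        raise ShardDown(shard["name"])
    return shard["docs"]

def route_query(shards, partial=False):
    """Collect results from every shard, tolerating outages only if ``partial``."""
    results = []
    for shard in shards:
        try:
            results.extend(query_shard(shard))
        except ShardDown:
            if not partial:
                raise  # without the partial bit, the whole query errors
            # with the partial bit, silently skip the unavailable shard
    return results

shards = [
    {"name": "shard0", "docs": [{"user_id": 1}]},
    {"name": "shard1", "docs": [{"user_id": 2}], "down": True},
]

print(route_query(shards, partial=True))  # [{'user_id': 1}]
```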
 How does MongoDB distribute queries among shards?
 -------------------------------------------------
 
 The exact method for distributing queries among a :term:`shard
 cluster` depends on the nature of the query and the configuration of
 the shard cluster. Consider a sharded collection, using the
-:term:`shard key` "``X``", that has "``Y``" and "``Z``" attributes:
+:term:`shard key` "``user_id``", that has "``last_login``" and "``email``" attributes:
 
-- For a query that selects "``X``" and also sorts by "``X``":
+- For a query that selects "``user_id``" and also sorts by "``user_id``":
 
   :program:`mongos` can make a straightforward translation of this
   operation into a series of queries against successive shards,
-  ordered by "``X``". This is faster than querying all shards in
+  ordered by "``user_id``". This is faster than querying all shards in
   parallel because :program:`mongos` can determine which shards
   contain the relevant chunks without waiting for all shards to return
   results.
 
-- For queries that select on "``X``" and sorts by "``Y``":
+- For queries that select on "``user_id``" and sort by "``last_login``":
 
   :program:`mongos` executes queries in parallel on
-  the appropriate shards, and performs a merge-sort on the "``Y``" key
+  the appropriate shards, and performs a merge-sort on the "``last_login``" key
   of all documents returned from the shards.
 
-- For queries that select on "``Y``:
+- For queries that select on "``last_login``":
 
   These queries must run on all shards:
 
-  - When query sorts by "``X``, :program:`mongos` serializes the query
-    over the shards in ordered by "``X``".
+  - When the query sorts by "``last_login``", :program:`mongos` serializes the query
+    over the shards, ordered by "``last_login``".
 
-  - If the query sorts by "``Z``", :program:`mongos` must parallelize
-    the query over the shards and perform a merge-sort on the "``Z``"
+  - If the query sorts by "``email``", :program:`mongos` must parallelize
+    the query over the shards and perform a merge-sort on the "``email``" key
    of the documents found.
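
As an illustration of the routing rules above, here is a small Python sketch (hypothetical names and chunk layout; real :program:`mongos` metadata is far richer) that targets a single shard for exact shard-key matches and falls back to scatter-gather otherwise:

```python
# Simplified chunk table: each chunk covers [lower, upper) of the
# ``user_id`` shard key and lives on exactly one shard.
chunks = [
    (0, 100, "shard0"),
    (100, 200, "shard1"),
    (200, 300, "shard0"),
]

def shard_for_key(user_id):
    """Find the single shard whose chunk covers this shard-key value."""
    for lower, upper, shard in chunks:
        if lower <= user_id < upper:
            return shard
    raise KeyError("no chunk covers user_id=%r" % user_id)

def shards_for_query(query):
    """Queries lacking the shard key must be sent to every shard."""
    if "user_id" in query:
        return {shard_for_key(query["user_id"])}
    return {shard for _, _, shard in chunks}  # scatter-gather

print(shards_for_query({"user_id": 150}))   # {'shard1'}
print(shards_for_query({"email": "a@b"}))   # both shards
```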

 How does MongoDB sort queries in sharded environments?
 ------------------------------------------------------
 
-If you specify call the :func:`sort()` method on a query in a sharded
+If you call the :func:`sort()` method on a query in a sharded
 environment, the :program:`mongod` for each shard will sort its
-results, and the :program:`mongos` merges the sort before returning
-the result to the client.
-
-What methods are available for administering sharded collections?
------------------------------------------------------------------
+results, and the :program:`mongos` merges each shard's results before returning
+them to the client.
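
The merge step described above can be sketched with Python's ``heapq.merge``, which, like :program:`mongos`, assumes each input stream is already sorted (the sample documents are invented):

```python
import heapq
from operator import itemgetter

# Each shard's mongod returns its own results already sorted by last_login.
shard0 = [{"user_id": 1, "last_login": 5}, {"user_id": 4, "last_login": 9}]
shard1 = [{"user_id": 2, "last_login": 7}, {"user_id": 3, "last_login": 8}]

# The router only needs a k-way merge of pre-sorted streams,
# never a full re-sort of all documents.
merged = list(heapq.merge(shard0, shard1, key=itemgetter("last_login")))

print([doc["last_login"] for doc in merged])  # [5, 7, 8, 9]
```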
 
-All operations available for administration of un-sharded systems are
-available for :term:`sharded <sharding>` collections.
-
-How does MongoDB ensure a unique shard key when using a shard key *other* than ``_id``?
+How does MongoDB ensure a unique ``_id`` when using a shard key *other* than ``_id``?
 ----------------------------------------------------------------------------------------
 
 If you do not use ``_id`` as the shard key, then your
 application/client layer must be responsible for keeping the ``_id``
-field unique. It is extremely problematic if collections have
+field unique. It is problematic for collections to have
 duplicate ``_id`` values.
 
-The current best practice for collects that are not sharded by the
-"``_id``" field is to use an identifier that will always be unique,
-such as a :wiki:`BSON ObjectID <Object+IDs>` for the ``_id`` field.
+If you're not sharding your collection by the
+"``_id``" field, then you should be sure to store a globally unique
+identifier in that field. The default :wiki:`BSON ObjectID <Object+IDs>`
+works well in this case.
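
For illustration, a generator in the spirit of a BSON ObjectID (a timestamp prefix, machine/process bytes, and an incrementing counter) might look like the following Python sketch; in a real application, use the driver's ObjectId type rather than a hand-rolled stand-in like this:

```python
import itertools
import os
import time

# 5 random bytes stand in for the machine/process identifier; a real
# driver derives these once per process.
_machine = os.urandom(5)
_counter = itertools.count(int.from_bytes(os.urandom(3), "big"))

def object_id_like():
    """Return a 24-char hex string: timestamp + machine bytes + counter."""
    ts = int(time.time()).to_bytes(4, "big")
    count = (next(_counter) % (1 << 24)).to_bytes(3, "big")
    return (ts + _machine + count).hex()

ids = [object_id_like() for _ in range(1000)]
print(len(set(ids)), len(ids[0]))  # 1000 24
```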
+
+I've enabled sharding and added a second shard, but all the data is still on one server. Why?
+---------------------------------------------------------------------------------------------
 
-After sharding, why is all the data still on one server?
---------------------------------------------------------
+First, ensure that you've declared a :term:`shard key` for your
+collection. Until you have configured the shard key, MongoDB will not
+create :term:`chunks <chunk>`, and :term:`sharding` will not occur.
 
-Ensure that you have declared a :term:`shard key` for your
-collections. Until you have configured the shard key, MongoDB will not
-create :term:`chunks <chunk>` and :term:`sharding` will not occur.
+Next, keep in mind that the default chunk size is 64 MB,
+which means the collection must have at least 64 MB before a
+migration will occur.
 
-In the current implementation, the default chunk size is 64 megabytes,
-which means the collection must have at least 64 megabytes before a
-migration will occur. Additionally, the system which balances chunks
+Additionally, the system which balances chunks
 among the servers attempts to avoid superfluous migrations. Depending
-on the number of shards, your shard key, and the amount of data, your
-system may require at least 10 chunks or even 2 gigabytes of data to
-trigger migrations.
+on the number of shards, your shard key, and the amount of data, systems
+often require at least 10 chunks of data to trigger migrations.
 
-:func:`db.printShardingStatus()` reports the number of chunks present
+You can run :func:`db.printShardingStatus()` to see all the chunks present
 in your cluster.
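
The thresholds above can be made concrete with a toy calculation (a deliberate simplification; the real balancer weighs chunk counts per shard, not raw collection sizes):

```python
import math

CHUNK_SIZE_MB = 64          # default chunk size noted above
MIN_CHUNKS_TO_BALANCE = 10  # rough rule of thumb from the text

def chunk_count(collection_size_mb):
    """How many 64 MB chunks a collection of this size splits into."""
    return math.ceil(collection_size_mb / CHUNK_SIZE_MB)

def balancer_may_migrate(collection_size_mb):
    return chunk_count(collection_size_mb) >= MIN_CHUNKS_TO_BALANCE

print(chunk_count(50), balancer_may_migrate(50))    # 1 False
print(chunk_count(700), balancer_may_migrate(700))  # 11 True
```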
 
 Is it safe to remove old files in the :dbcommand:`moveChunk` directory?
 -----------------------------------------------------------------------
 
-Yes, :program:`mongod` creates these files as backups during normal
+Yes. :program:`mongod` creates these files as backups during normal
 :term:`shard` balancing operations.
 
-Once these migrations are complete, you may feel free to delete these
-files. The cleanup process is currently manual so please do take care
-of this to free up space.
+Once these migrations are complete, you may delete these
+files.
 
 How many connections does each :program:`mongos` need?
 ------------------------------------------------------
 
 Typically, :program:`mongos` uses one connection from each client, as
 well as one outgoing connection to each shard, or each member of the
-replica set that backs each shard.
+replica set that backs each shard. If you've enabled the ``slaveOk``
+bit, then the mongos may create two or more connections per replica set.
 
 Why does :term:`mongos` hold connections?
 -----------------------------------------
 
-:program:`mongos` uses a set of connection pools to communicate to
-each :term:`shard` or :term:`replica set` backed shard. These pools
-of connections do not shrink when the number of clients
-decreases.
+:program:`mongos` uses a set of connection pools to communicate with
+each :term:`shard`. These pools do not shrink when the number of
+clients decreases.
 
 This can lead to an unused :program:`mongos` with a large number of
-open connections because of past use.
+open connections. If the :program:`mongos` is no longer in use, you're
+safe restarting the process to close existing connections.
 
 Where does MongoDB report on connections used by :program:`mongos`?
 -------------------------------------------------------------------
@@ -196,66 +193,66 @@ run the following command:
 
 	db._adminCommand("connPoolStats");
 
-What is ``writebacklisten`` in the log and :func:`currentOp()`?
+I'm seeing ``writebacklisten`` in the log. What does this mean?
 ---------------------------------------------------------------
 
-"Write back listeners" are a component of the communications between
-:term:`shards <shard>` and the :term:`config database`. If you see
-these operations in the output of :func:`currentOp` or in the "slow"
-operations, this is part of the normal operation. The writeback
-listener performs long operations by design, so it can appear in the
-slow logs even in normal operation.
+The writeback listener is a process that opens a long poll to detect
+non-safe writes sent to a server and to send them back to the correct
+server if necessary.
+
+These messages are a key part of the sharding infrastructure and should
+not cause concern.
 
 How should administrators deal with failed migrations?
 ------------------------------------------------------
 
-Failed migrations require administrative intervention. Chunk moves are
+Failed migrations require no administrative intervention. Chunk moves are
 consistent and deterministic.
 
-If the migration fails to complete for some reason, the :term:`shard
-cluster` will retry. When the migration completes successfully the
+If a migration fails to complete for some reason, the :term:`shard
+cluster` will retry. When the migration completes successfully, the
 data will reside only on the new shard.
 
 What is the process for moving, renaming, or changing the number of config servers?
 -----------------------------------------------------------------------------------
 
-.. seealso:: The wiki page that describes this process: ":wiki:`Changing Configuration Servers <Changing+Config+Servers>`."
+.. see:: The wiki page that describes this process: ":wiki:`Changing Configuration Servers <Changing+Config+Servers>`."
 
-When do the :program:`mongos` servers pickup config server changes?
+When do the :program:`mongos` servers detect config server changes?
 -------------------------------------------------------------------
 
 :program:`mongos` instances maintain a cache of the :term:`config database`
-that holds the metadata for the :term:`shard cluster`. This meta data
-includes :term:`chunk` placement on :term:`shards <shard>`.
+that holds the metadata for the :term:`shard cluster`. This metadata
+includes the mapping of :term:`chunks <chunk>` to :term:`shards <shard>`.
 
-Periodically, and during specific events, :program:`mongos` updates
-this cache. There is not way to control this behavior from the client,
-but you can use the :dbcommand:`flushRouterConfig` when logged into a
-specific :program:`mongos` to force this instance to reload its
-configuration.
+:program:`mongos` updates its cache lazily, by issuing a request to a
+shard and discovering that its metadata is out of date.
+There is no way to control this behavior from the client,
+but you can run the :dbcommand:`flushRouterConfig` command against any
+:program:`mongos` to force it to refresh its cache.
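
The lazy refresh described above can be sketched as a version handshake (the class names are invented; the real protocol tracks shard versions per collection):

```python
class Shard:
    """Holds the authoritative metadata version."""
    def __init__(self, version):
        self.version = version

    def handle(self, request_version):
        # A shard with newer metadata rejects stale routers.
        return "ok" if request_version >= self.version else "stale config"

class Router:
    """Caches a metadata version and refreshes it only on rejection."""
    def __init__(self, shard, cached_version):
        self.shard = shard
        self.cached_version = cached_version
        self.refreshes = 0

    def refresh(self):
        # This is the step that flushRouterConfig forces eagerly.
        self.cached_version = self.shard.version
        self.refreshes += 1

    def query(self):
        reply = self.shard.handle(self.cached_version)
        if reply == "stale config":
            self.refresh()  # lazy: only refresh after a rejection
            reply = self.shard.handle(self.cached_version)
        return reply

router = Router(Shard(version=2), cached_version=1)
print(router.query(), router.refreshes)  # ok 1
print(router.query(), router.refreshes)  # ok 1  (cache now current)
```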
 
 Is it possible to quickly update :program:`mongos` servers after updating a replica set configuration?
 ------------------------------------------------------------------------------------------------------
 
 The :program:`mongos` instances will detect these changes without
 intervention over time. However, if you want to force the
-:program:`mongos` to reload its configuration, use the
-:dbcommand:`flushRouterConfig` to each :program:`mongos` directly.
+:program:`mongos` to reload its configuration, run the
+:dbcommand:`flushRouterConfig` command against each :program:`mongos` directly.
 
-What does setting ``maxConns`` do on :program:`mongos`?
+What does the ``maxConns`` setting on :program:`mongos` do?
 -----------------------------------------------------------
 
 The :setting:`maxConns` option limits the number of connections
 accepted by :program:`mongos`.
 
-If your client driver or application create a large number of
-connections but allows them to timeout rather than closing them
+If your client driver or application creates a large number of
+connections but allows them to time out rather than closing them
 explicitly, then it might make sense to limit the number of
 connections at the :program:`mongos` layer.
 
-Set :setting:`maxConns` to a value that is slightly higher than the
+Set :setting:`maxConns` to a value slightly higher than the
 maximum number of connections that the client creates, or the maximum
 size of the connection pool. This setting prevents the
-:program:`mongos` from sending connection spikes from to the
-:term:`shards <shard>`, which can disrupt the operation and memory
-allocation of the :term:`shard cluster`.
+:program:`mongos` from causing connection spikes on the individual
+:term:`shards <shard>`. Spikes like these may disrupt the operation
+and memory allocation of the :term:`shard cluster`.
