From db2b3885f263ba29e24feb4ba2146c2f2a1d81f2 Mon Sep 17 00:00:00 2001 From: Bob Grabar Date: Mon, 27 Aug 2012 14:15:02 -0400 Subject: [PATCH 1/4] DOCS 403 early edits, more to come --- source/administration/replica-sets.txt | 48 +++++++++++++++++++++++--- 1 file changed, 43 insertions(+), 5 deletions(-) diff --git a/source/administration/replica-sets.txt b/source/administration/replica-sets.txt index 1e378edcbe2..96d12a67694 100644 --- a/source/administration/replica-sets.txt +++ b/source/administration/replica-sets.txt @@ -524,25 +524,29 @@ provide good places to start a troubleshooting investigation with .. _replica-set-replication-lag: + + + + Replication Lag ~~~~~~~~~~~~~~~ Replication lag is a delay between an operation on the :term:`primary` -and the application of that operation from :term:`oplog` to the +and the application of that operation from the :term:`oplog` to the :term:`secondary`. Such lag can be a significant issue and can seriously affect MongoDB :term:`replica set` deployments. Excessive replication lag makes "lagged" members ineligible to quickly become primary and increases the possibility that distributed read operations will be inconsistent. -Identify replication lag by checking the values of +Identify replication lag by checking the value of :data:`members[n].optimeDate` for each member of the replica set using the :method:`rs.status()` function in the :program:`mongo` shell. Possible causes of replication lag include: -- **Network Latency.** +- **Network Latency** Check the network routes between the members of your set to ensure that there is no packet loss or network routing issue. Use tools including ``ping`` to test latency between set @@ -551,7 +555,7 @@ Possible causes of replication lag include: members and ``traceroute`` to expose the routing of packets between network endpoints.
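The ``rs.status()`` check described above is easy to script. The sketch below is illustrative only: ``statusDoc`` is hypothetical sample data shaped like ``rs.status()`` output, not a capture from a real server, and ``replicationLagSeconds`` is an invented helper, not a shell built-in.

```javascript
// Compute per-secondary replication lag from a rs.status()-shaped document
// by comparing each secondary's optimeDate to the primary's.
function replicationLagSeconds(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  return status.members
    .filter((m) => m.stateStr === "SECONDARY")
    .map((m) => ({
      name: m.name,
      // The primary's last applied op time minus the secondary's.
      lagSeconds: (primary.optimeDate - m.optimeDate) / 1000,
    }));
}

// Hypothetical sample data in the shape rs.status() returns.
const statusDoc = {
  members: [
    { name: "mongo0:27017", stateStr: "PRIMARY", optimeDate: new Date("2012-08-27T14:15:10Z") },
    { name: "mongo1:27017", stateStr: "SECONDARY", optimeDate: new Date("2012-08-27T14:15:08Z") },
    { name: "mongo2:27017", stateStr: "SECONDARY", optimeDate: new Date("2012-08-27T14:14:40Z") },
  ],
};

console.log(replicationLagSeconds(statusDoc));
// mongo1 is 2 seconds behind; mongo2 is 30 seconds behind.
```

A secondary lagging by tens of seconds, like the second one here, is the member to investigate first.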
-- **Disk Throughput.** +- **Disk Throughput** If the file system and disk device on the secondary is unable to flush data to disk as quickly as the primary, then @@ -564,7 +568,7 @@ Possible causes of replication lag include: Use system-level tools to assess disk status, including ``iostat`` or ``vmstat``. -- **Concurrency.** +- **Concurrency** In some cases, long-running operations on the primary can block replication on secondaries. You can use @@ -574,6 +578,40 @@ Possible causes of replication lag include: Use the :term:`database profiler` to see if there are slow queries or long-running operations that correspond to the incidences of lag. +- **The Oplog Size is Too Small** + + As commands are sent to the primary, they are recorded in the oplog. + Secondaries update themselves by reading the oplog and applying the + commands. The oplog is a circular buffer. When full, it erases the + oldest commands to write new ones. The secondaries keep track of the + last oplog command that they read. Under times of heavy load, the + contents of the secondaries will lag behind the contents of the + primary. + + If the replication lag exceeds the amount of time buffered in the + oplog, then the replication cannot continue. Put another way, if the + primary overwrites that command before the secondary has a chance to + apply it, then the replication has failed – there are commands that + have been applied on the primary that the secondary is not able to + apply. + + + + +See http://docs.mongodb.org/manual/tutorial/change-oplog-size/ for more information. 
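The circular-buffer failure mode described in the bullet above can be made concrete with a toy model. Nothing below is MongoDB API; ``Oplog`` and ``secondaryIsStale`` are invented names, and operations are reduced to sequence numbers.

```javascript
// Toy model of the oplog as a circular buffer: ops are numbered
// sequentially and only the most recent `capacity` ops are retained.
class Oplog {
  constructor(capacity) {
    this.capacity = capacity;
    this.nextOp = 0; // sequence number of the next op the primary will write
  }
  write() { this.nextOp += 1; }
  // Oldest op still present; everything earlier has been overwritten.
  oldestAvailable() { return Math.max(0, this.nextOp - this.capacity); }
}

// A secondary remembers the last op it applied. If the op after that one
// has already been overwritten, the secondary can no longer catch up.
function secondaryIsStale(oplog, lastApplied) {
  return lastApplied + 1 < oplog.oldestAvailable();
}

const oplog = new Oplog(5);
for (let i = 0; i < 4; i++) oplog.write();
console.log(secondaryIsStale(oplog, 0)); // false: op 1 is still available

for (let i = 0; i < 10; i++) oplog.write(); // heavy write load: 14 ops total
console.log(secondaryIsStale(oplog, 0)); // true: the op the secondary needs is gone
```

Once the primary laps the secondary like this, no amount of waiting helps; the secondary must resync from scratch.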
+ + + + +- **Read Starvation** + +- **Write Starvation** + +- **Failure to Use Appropriate Write Concern in a High-Write Environment** + + + + Failover and Recovery ~~~~~~~~~~~~~~~~~~~~~ From 4f62796b8dd552acc5a8cee96226dd0114e4cb45 Mon Sep 17 00:00:00 2001 From: Bob Grabar Date: Mon, 27 Aug 2012 17:01:44 -0400 Subject: [PATCH 2/4] DOCS-403 ongoing edits, not yet complete --- source/administration/monitoring.txt | 24 +++--- source/administration/replica-sets.txt | 70 ++++++++++++---- source/core/replication-internals.txt | 2 +- source/reference/glossary.txt | 110 +++++++++++++------------ 4 files changed, 122 insertions(+), 84 deletions(-) diff --git a/source/administration/monitoring.txt b/source/administration/monitoring.txt index 9e8b7d00c48..c58bc679462 100644 --- a/source/administration/monitoring.txt +++ b/source/administration/monitoring.txt @@ -339,11 +339,11 @@ This returns all operations that lasted longer than 100 milliseconds. Ensure that the value specified here (i.e. ``100``) is above the :setting:`slowms` threshold. -.. seealso:: The ":wiki:`Optimization`" wiki page addresses strategies +.. seealso:: The :wiki:`Optimization` wiki page addresses strategies that may improve the performance of your database queries and operations. -.. STUB ":doc:`/applications/optimization`" +.. STUB :doc:`/applications/optimization` .. _replica-set-monitoring: @@ -355,7 +355,7 @@ replica sets, beyond the requirements for any MongoDB instance is "replication lag." This refers to the amount of time that it takes a write operation on the :term:`primary` to replicate to a :term:`secondary`. Some very small delay period may be acceptable; -however, as replication lag grows two significant problems emerge: +however, as replication lag grows, two significant problems emerge: - First, operations that have occurred in the period of lag are not replicated to one or more secondaries. 
If you're using replication @@ -363,22 +363,24 @@ however, as replication lag grows two significant problems emerge: integrity of your data set. - Second, if the replication lag exceeds the length of the operation - log (":term:`oplog`") then the secondary will have to resync all data + log (:term:`oplog`) then the secondary will have to resync all data from the :term:`primary` and rebuild all indexes. In normal circumstances this is uncommon given the typical size of the oplog, - but presents a major problem. + but it's an issue to be aware of. + +For causes of replication lag, see :ref:`Replication Lag `. Replication issues are most often the result of network connectivity -issues between members or a :term:`primary` instance that does not +issues between members or the result of a :term:`primary` that does not have the resources to support application and replication traffic. To -check the status of a replica use the :dbcommand:`replSetGetStatus` or +check the status of a replica, use the :dbcommand:`replSetGetStatus` or the following helper in the shell: .. code-block:: javascript rs.status() -See the ":doc:`/reference/replica-status`" document for a more in +See the :doc:`/reference/replica-status` document for a more in depth overview view of this output. In general watch the value of :status:`optimeDate`. Pay particular attention to the difference in time between the :term:`primary` and the :term:`secondary` members. @@ -393,7 +395,7 @@ option, :program:`mongod` will create an default sized oplog. By default the oplog is 5% of total available disk space on 64-bit systems. -.. seealso:: ":doc:`/tutorial/change-oplog-size`" +.. seealso:: :doc:`/tutorial/change-oplog-size` Sharding and Monitoring ----------------------- @@ -404,10 +406,10 @@ instances. Additionally, shard clusters require monitoring to ensure that data is effectively distributed among nodes and that sharding operations are functioning appropriately. -.. 
seealso:: See the ":wiki:`Sharding`" wiki page for more +.. seealso:: See the :wiki:`Sharding` wiki page for more information. -.. STUB ":doc:`/core/sharding`" +.. STUB :doc:`/core/sharding` Config Servers ~~~~~~~~~~~~~~ diff --git a/source/administration/replica-sets.txt b/source/administration/replica-sets.txt index 96d12a67694..0d08aa5e180 100644 --- a/source/administration/replica-sets.txt +++ b/source/administration/replica-sets.txt @@ -524,10 +524,6 @@ provide good places to start a troubleshooting investigation with .. _replica-set-replication-lag: - - - - Replication Lag ~~~~~~~~~~~~~~~ @@ -578,39 +574,77 @@ Possible causes of replication lag include: Use the :term:`database profiler` to see if there are slow queries or long-running operations that correspond to the incidences of lag. -- **The Oplog Size is Too Small** +- **Oplog Size is Too Small for the Data Load** + + If you perform a large number of writes for a large amount of data As commands are sent to the primary, they are recorded in the oplog. Secondaries update themselves by reading the oplog and applying the commands. The oplog is a circular buffer. When full, it erases the - oldest commands to write new ones. The secondaries keep track of the - last oplog command that they read. Under times of heavy load, the - contents of the secondaries will lag behind the contents of the - primary. - - If the replication lag exceeds the amount of time buffered in the - oplog, then the replication cannot continue. Put another way, if the - primary overwrites that command before the secondary has a chance to - apply it, then the replication has failed – there are commands that + oldest commands in order to write new ones. Under times of heavy load, + the contents of the secondaries will lag behind the contents of the + primary. If the replication lag exceeds the amount of time buffered in + the oplog, then the replication cannot continue. 
Put another way, if + the primary overwrites that command before the secondary has a chance + to apply it, then the replication has failed – there are commands that have been applied on the primary that the secondary is not able to apply. + See the documentation for :doc:`/tutorial/change-oplog-size` for more information. +- **Read Starvation** + The secondaries cannot are not able to read the oplog fast enough, and the + oplog writes over old data before the secondaries can read it. This + can happen if you are reading a large amount of data but have not + set the oplog large enough. 10gen recommends an oplog time of + primary was inundated with writes to the point where replication + (the secondaries running queries to get the changes from the oplog) + cannot keep up. This can lead to a lag on the secondaries that + ultimately becomes larger than the oplog on the primary. -See http://docs.mongodb.org/manual/tutorial/change-oplog-size/ for more information. +- **Failure to Use Appropriate Write Concern in a High-Write Environment** + If you perform very large data loads on a regular basis but fail to + set the appropriate write concern, the large volume of write traffic + on the primary will always take precedence over read requests from + secondaries. This will significantly slow replication by severely + reducing the numbers of reads that the secondaries can make on the + oplog in order to update themselves. + The oplog is circular. When it is full, it begins overwriting the + oldest data with the newest. If the secondaries have not caught up in + their reads, they reach a point where they no longer can access + certain updates. The secondaries become stale. + To prevent this, use "Write Concern" to tell Mongo to always perform a + safe write after a designated number of inserts, such as after every + 1,000 inserts. This provides a space for the secondaries to catch up with the + primary. 
Setting a write concern slightly slows down the data load, but it keeps your + secondaries from going stale. -- **Read Starvation** + See :ref:`replica-set-write-concern` for more information. -- **Write Starvation** +If you do this, and your driver supports it, I recommend that + you use a mode of 'majority'. + + The exact way you use Safe Mode depends on what driver you're using + for your data load program. You can read more about Safe Mode here: + + http://www.mongodb.org/display/DOCS/getLastError+Command + http://www.mongodb.org/display/DOCS/Verifying+Propagation+of+Writes+with+getLastError -- **Failure to Use Appropriate Write Concern in a High-Write Environment** +take precedence over requests from the secondaries to read the oplog and update themselves. + Write requests have priority over read requests. This will significantly +the read requests from the secondaries from reading the replication data + from the oplog. Secondaries must be able to and significantly slow + down replication to the point that the oplog overwrites commands that + the secondaries have not yet read. + You can monitor how fast replication occurs by watching the oplog time + in the "replica" graph in MMS. Failover and Recovery ~~~~~~~~~~~~~~~~~~~~~ diff --git a/source/core/replication-internals.txt b/source/core/replication-internals.txt index 657113b0dd0..1398e365a20 100644 --- a/source/core/replication-internals.txt +++ b/source/core/replication-internals.txt @@ -25,7 +25,7 @@ replicate this log by applying the operations to themselves in an asynchronous process. Under normal operation, :term:`secondary` members reflect writes within one second of the primary. However, various exceptional situations may cause secondaries to lag behind further. See -:term:`replication lag` for details. +:ref:`Replication Lag ` for details. All members send heartbeats (pings) to all other members in the set and can import operations to the local oplog from any other member in the set. 
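The monitoring section above notes that once replication lag exceeds the length of the oplog, the secondary must resync all data. That comparison can be sketched in a few lines; the function names and timestamps here are invented for illustration, and on a live system the inputs would come from ``rs.status()`` and the timestamps of the oplog's first and last entries.

```javascript
// The oplog covers a window of time between its first and last entries.
// Once a secondary's lag exceeds that window, the next op it needs has
// been overwritten and a full resync is required.
function oplogWindowSeconds(tFirst, tLast) {
  return (tLast - tFirst) / 1000;
}

function mustResync(lagSeconds, tFirst, tLast) {
  return lagSeconds > oplogWindowSeconds(tFirst, tLast);
}

// Hypothetical oplog spanning 24 hours of writes.
const tFirst = new Date("2012-08-27T00:00:00Z");
const tLast = new Date("2012-08-28T00:00:00Z");

console.log(mustResync(600, tFirst, tLast));   // false: 10 minutes behind
console.log(mustResync(90000, tFirst, tLast)); // true: ~25 hours behind
```

The window shrinks as write volume grows, which is why a lag that is harmless under light load can force a resync during a bulk import.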
diff --git a/source/reference/glossary.txt b/source/reference/glossary.txt index 1e28265830e..df5884570f4 100644 --- a/source/reference/glossary.txt +++ b/source/reference/glossary.txt @@ -14,27 +14,27 @@ Glossary of JSON (JavaScript Object Notation) documents. For a detailed spec, see `bsonspec.org `_. - .. seealso:: The ":ref:`bson-json-type-conversion-fidelity`" + .. seealso:: The :ref:`bson-json-type-conversion-fidelity` section. database command Any MongoDB operation other than an insert, update, remove, or query. MongoDB exposes commands as queries - query against the special ":term:`$cmd`" collection. For + query against the special :term:`$cmd` collection. For example, the implementation of :dbcommand:`count` for MongoDB is a command. - .. seealso:: ":doc:`/reference/commands`" for a full list of + .. seealso:: :doc:`/reference/commands` for a full list of database commands in MongoDB operator A keyword beginning with a ``$`` used to express a complex query, update, or data transformation. For example, ``$gt`` is the query language's "greater than" operator. - See the ":doc:`/reference/operators`" for more + See the :doc:`/reference/operators` for more information about the available operators. - .. seealso:: ":doc:`/reference/operators`." + .. seealso:: :doc:`/reference/operators`. MongoDB The document-based database server described in this manual. @@ -89,7 +89,7 @@ Glossary replication and automated failover. MongoDB's recommended replication strategy. - .. seealso:: ":doc:`/replication`" and ":doc:`/core/replication`." + .. seealso:: :doc:`/replication` and :doc:`/core/replication`. replication A feature allowing multiple database servers to share the same @@ -98,33 +98,33 @@ Glossary and replica sets. .. seealso:: :term:`replica set`, :term:`sharding`, - ":doc:`/replication`." and ":doc:`/core/replication`." + :doc:`/replication`. and :doc:`/core/replication`. shard A single replica set that stores some portion of a shard cluster's total data set. 
See :term:`sharding`. - .. seealso:: The ":wiki:`Sharding`" wiki page. + .. seealso:: The :wiki:`Sharding` wiki page. - .. STUB ":doc:`/core/sharding`." + .. STUB :doc:`/core/sharding`. sharding A database architecture that enables horizontal scaling by splitting data into key ranges among two or more replica sets. This architecture is also known as "range-based partitioning." See :term:`shard`. - .. seealso:: The ":wiki:`Sharding`" wiki page. + .. seealso:: The :wiki:`Sharding` wiki page. - .. STUB ":doc:`/core/sharding`." + .. STUB :doc:`/core/sharding`. shard cluster The set of nodes comprising a :term:`sharded ` MongoDB deployment. A shard cluster consists of three config processes, one or more replica sets, and one or more :program:`mongos` routing processes. - .. seealso:: The ":wiki:`Sharding`" wiki page. + .. seealso:: The :wiki:`Sharding` wiki page. - .. STUB ":doc:`/core/sharding`." + .. STUB :doc:`/core/sharding`. partition A distributed system architecture that splits data into ranges. @@ -138,14 +138,14 @@ Glossary The program implementing the MongoDB database server. This server typically runs as a :term:`daemon`. - .. seealso:: ":doc:`/reference/mongod`." + .. seealso:: :doc:`/reference/mongod`. mongos The routing and load balancing process that acts as an interface between an application and a MongoDB :term:`shard cluster`. - .. seealso:: ":doc:`/reference/mongos`." + .. seealso:: :doc:`/reference/mongos`. mongo The MongoDB Shell. ``mongo`` connects to :program:`mongod` and :program:`mongos` instances, enabling administration, management, and testing. :program:`mongo` has a JavaScript interface. - .. seealso:: ":doc:`/reference/mongo`" and ":doc:`/reference/javascript`." + .. seealso:: :doc:`/reference/mongo` and :doc:`/reference/javascript`. cluster A set of :program:`mongod` instances running in @@ -190,9 +190,9 @@ Glossary capped collections. Developers may also use capped collections in their applications. - .. seealso:: The ":wiki:`Capped Collections `" wiki page. + ..
seealso:: The :wiki:`Capped Collections ` wiki page. - .. STUB ":doc:`/core/capped-collections`." + .. STUB :doc:`/core/capped-collections`. BSON types The set of types supported by the :term:`BSON` serialization @@ -236,7 +236,7 @@ Glossary primary In a :term:`replica set`, the primary member is the current - ":term:`master`" instance, which receives all write operations. + :term:`master` instance, which receives all write operations. secondary In a :term:`replica set`, the ``secondary`` members are the current @@ -249,7 +249,7 @@ Glossary of the official MongoDB drivers support this convention, as does the ``mongofiles`` program. - .. seealso:: ":doc:`/reference/mongofiles`". + .. seealso:: :doc:`/reference/mongofiles`. md5 ``md5`` is a hashing algorithm used to efficiently provide @@ -262,8 +262,8 @@ Glossary methods in the ``mongo`` shell that provide a more concise syntax and improve the general interactive experience. - .. seealso:: ":doc:`/reference/mongo`" and - ":doc:`/reference/javascript`." + .. seealso:: :doc:`/reference/mongo` and + :doc:`/reference/javascript`. write-lock A lock on the database for a given writer. When a process @@ -273,9 +273,9 @@ Glossary index A data structure that optimizes queries. - .. seealso:: The ":wiki:`Indexes`" wiki page. + .. seealso:: The :wiki:`Indexes` wiki page. - .. STUB ":doc:`/core/indexing`" + .. STUB :doc:`/core/indexing` secondary index An index on any field that isn't the "primary key" for a @@ -284,9 +284,9 @@ Glossary compound index An :term:`index` consisting of two or more keys. - .. seealso:: The ":wiki:`Indexes`" wiki page. + .. seealso:: The :wiki:`Indexes` wiki page. - .. STUB ":doc:`/core/indexing`" + .. STUB :doc:`/core/indexing` btree A data structure used by most database management systems @@ -305,9 +305,9 @@ Glossary to the Journal every 100ms, but this is configurable using the :setting:`journalCommitInterval` runtime option. - .. seealso:: The ":wiki:`Journaling`" wiki page. + .. 
seealso:: The :wiki:`Journaling` wiki page. - .. STUB ":doc:`/core/journaling`." + .. STUB :doc:`/core/journaling`. pcap A packet capture format used by :program:`mongosniff` to record @@ -408,8 +408,8 @@ Glossary logical writes to a MongoDB database. The oplog is the basic mechanism enabling :term:`replication` in MongoDB. - .. seealso:: ":ref:`Oplog Sizes `" and - ":doc:`/tutorial/change-oplog-size`." + .. seealso:: :ref:`Oplog Sizes ` and + :doc:`/tutorial/change-oplog-size`. control script A simple shell script, typically located in the ``/etc/rc.d`` or @@ -440,8 +440,8 @@ Glossary default :setting:`dbpath` is ``/data/db``. Other common data paths include ``/srv/mongodb`` and ``/var/lib/mongodb``. - .. seealso:: ":setting:`dbpath`" or ":option:`--dbpath - `." + .. seealso:: :setting:`dbpath` or :option:`--dbpath + `. set name In the context of a :term:`replica set`, the ``set name`` refers to @@ -450,8 +450,8 @@ Glossary with the :setting:`replSet` setting (or :option:`--replSet ` option for :program:`mongod`.) - .. seealso:: :term:`replication`, ":doc:`/replication`" and - ":doc:`/core/replication`." + .. seealso:: :term:`replication`, :doc:`/replication` and + :doc:`/core/replication`. _id A field containing a unique ID, typically a BSON :term:`ObjectId`. @@ -492,7 +492,7 @@ Glossary operations in a database's ``system.profile`` collection. The profiler is most often used to diagnose slow queries. - .. seealso:: ":ref:`Monitoring Database Systems `." + .. seealso:: :ref:`Monitoring Database Systems `. shard key In a sharded collection, a shard key is the field that MongoDB @@ -589,23 +589,23 @@ Glossary returning. This often determines how many :term:`replica set` members should propagate a write before returning. - .. seealso:: ":ref:`Write Concern for Replica Sets `." + .. seealso:: :ref:`Write Concern for Replica Sets `. 
priority In the context of :term:`replica sets `, priority is a configurable value that helps determine which nodes in a replica set are most likely to become :term:`primary`. - .. seealso:: ":ref:`Replica Set Node Priority - `" + .. seealso:: :ref:`Replica Set Node Priority + ` election In the context of :term:`replica sets `, an election is the process by which members of a replica set select primary nodes on startup and in the event of failures. - .. seealso:: ":ref:`Replica Set Elections - `" and ":term:`priority`." + .. seealso:: :ref:`Replica Set Elections + ` and :term:`priority`. hidden member A member of a :term:`replica set` that cannot become primary and @@ -614,7 +614,7 @@ Glossary receiving read-only queries depending on :term:`read preference`. - .. seealso:: ":ref:`Hidden Member `," + .. seealso:: :ref:`Hidden Member `, :dbcommand:`isMaster`, :method:`db.isMaster`, and :data:`members[n].hidden`. @@ -625,13 +625,13 @@ Glossary deleted databases) or updates that have unforeseen effects on the production database. - .. seealso:: ":ref:`Delayed Members `" + .. seealso:: :ref:`Delayed Members ` arbiter A member of a :term:`replica set` that exists solely to vote in :term:`elections `. Arbiter nodes do not replicate data. - .. seealso:: ":ref:`Delayed Nodes `" + .. seealso:: :ref:`Delayed Nodes ` read preference A setting on the MongoDB :doc:`drivers ` @@ -642,7 +642,7 @@ Glossary direct reads to secondary nodes for :term:`eventually consistent ` reads. - .. seealso:: ":ref:`Read Preference `" + .. seealso:: :ref:`Read Preference ` replication lag The length of time between the last operation in the primary's @@ -650,12 +650,14 @@ Glossary :term:`secondary` or :term:`slave` node. In general, you want to keep replication lag as small as possible. + .. seealso:: :ref:`Replication Lag ` + driver A client implementing the communication protocol required for talking to a server. 
The MongoDB drivers provide language-idiomatic methods for interfacing with MongoDB. - .. seealso:: ":doc:`/applications/drivers`" + .. seealso:: :doc:`/applications/drivers` client The application layer that uses a database for data persistence @@ -667,7 +669,7 @@ Glossary :term:`replica set` to become :term:`primary` in the event of a failure. - .. seealso:: ":ref:`Replica Set Failover `." + .. seealso:: :ref:`Replica Set Failover `. data-center awareness A property that allows clients to address nodes in a system to @@ -676,8 +678,8 @@ Glossary :term:`Replica sets ` implement data-center awareness using :term:`tagging `. - .. seealso:: ":data:`members[n].tags`" and ":ref:`data center - awareness `." + .. seealso:: :data:`members[n].tags` and :ref:`data center + awareness `. tag One or more labels applied to a given replica set member that @@ -700,12 +702,12 @@ Glossary transforms the data. In MongoDB, you can run arbitrary aggregations over data using map-reduce. - .. seealso:: The ":wiki:`Map Reduce `" wiki page for + .. seealso:: The :wiki:`Map Reduce ` wiki page for more information regarding MongoDB's map-reduce - implementation, and ":doc:`/applications/aggregation`" for + implementation, and :doc:`/applications/aggregation` for another approach to data aggregation in MongoDB. - .. STUB ":doc:`/core/map-reduce`" + .. STUB :doc:`/core/map-reduce` RDBMS Relational Database Management System. A database management @@ -734,19 +736,19 @@ Glossary The MongoDB aggregation framework provides a means to calculate aggregate values without having to use :term:`map-reduce`. - .. seealso:: ":doc:`/applications/aggregation`." + .. seealso:: :doc:`/applications/aggregation`. pipeline The series of operations in the :term:`aggregation` process. - .. seealso:: ":doc:`/applications/aggregation`." + .. seealso:: :doc:`/applications/aggregation`. 
expression In the context of the :term:`aggregation framework`, expressions are the stateless transformations that operate on the data that passes through the :term:`pipeline`. - .. seealso:: ":doc:`/applications/aggregation`." + .. seealso:: :doc:`/applications/aggregation`. accumulator An :term:`expression` in the :term:`aggregation framework` that From f6b8b88ba06b816accc7b4b36875289f7cb104eb Mon Sep 17 00:00:00 2001 From: Bob Grabar Date: Mon, 27 Aug 2012 17:43:54 -0400 Subject: [PATCH 3/4] DOCS-403 added causes of replication lag --- source/administration/replica-sets.txt | 100 ++++++++----------------- 1 file changed, 33 insertions(+), 67 deletions(-) diff --git a/source/administration/replica-sets.txt b/source/administration/replica-sets.txt index 0d08aa5e180..4a4d4d01e74 100644 --- a/source/administration/replica-sets.txt +++ b/source/administration/replica-sets.txt @@ -540,6 +540,9 @@ Identify replication lag by checking the value of using the :method:`rs.status()` function in the :program:`mongo` shell. +Also, you can monitor how fast replication occurs by watching the oplog +time in the "replica" graph in MMS. + Possible causes of replication lag include: - **Network Latency** @@ -567,85 +570,48 @@ Possible causes of replication lag include: - **Concurrency** In some cases, long-running operations on the primary can block - replication on secondaries. You can use - :term:`write concern` to prevent write operations from returning - if replication cannot keep up with the write load. + replication on secondaries. You can use :term:`write concern` to + prevent write operations from returning if replication cannot keep up + with the write load. Use the :term:`database profiler` to see if there are slow queries or long-running operations that correspond to the incidences of lag. 
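The profiler check suggested in the concurrency bullet above amounts to filtering ``system.profile`` documents on their ``millis`` field. The sketch below applies that filter to hypothetical profiler documents in plain JavaScript; on a live system you would query ``db.system.profile`` directly, e.g. with ``{ millis: { $gt: 100 } }``.

```javascript
// Hypothetical examples of profiler output documents; real system.profile
// entries carry many more fields.
const slowms = 100;
const profileDocs = [
  { op: "query", ns: "test.users", millis: 12 },
  { op: "update", ns: "test.orders", millis: 340 },
  { op: "query", ns: "test.logs", millis: 2500 },
];

// Keep only operations slower than the threshold, slowest first.
const slowOps = profileDocs
  .filter((doc) => doc.millis > slowms)
  .sort((a, b) => b.millis - a.millis);

console.log(slowOps.map((doc) => `${doc.op} on ${doc.ns}: ${doc.millis}ms`));
// The 2500ms query and the 340ms update are the candidates to correlate
// with the observed incidences of lag.
```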
- **Oplog Size is Too Small for the Data Load** - If you perform a large number of writes for a large amount of data - - As commands are sent to the primary, they are recorded in the oplog. - Secondaries update themselves by reading the oplog and applying the - commands. The oplog is a circular buffer. When full, it erases the - oldest commands in order to write new ones. Under times of heavy load, - the contents of the secondaries will lag behind the contents of the - primary. If the replication lag exceeds the amount of time buffered in - the oplog, then the replication cannot continue. Put another way, if - the primary overwrites that command before the secondary has a chance - to apply it, then the replication has failed – there are commands that - have been applied on the primary that the secondary is not able to - apply. - - See the documentation for :doc:`/tutorial/change-oplog-size` for more information. - -- **Read Starvation** - - The secondaries cannot are not able to read the oplog fast enough, and the - oplog writes over old data before the secondaries can read it. This - can happen if you are reading a large amount of data but have not - set the oplog large enough. 10gen recommends an oplog time of - primary was inundated with writes to the point where replication - (the secondaries running queries to get the changes from the oplog) - cannot keep up. This can lead to a lag on the secondaries that - ultimately becomes larger than the oplog on the primary. - -- **Failure to Use Appropriate Write Concern in a High-Write Environment** + If you do not set your oplog large enough, the oplog overwrites old + data before the secondaries can read it. The oplog is a circular + buffer, and when full it erases the oldest commands in order to write + new ones. If your oplog size is too small, the secondaries reach a + point where they no longer can access certain updates. The secondaries + become stale. 
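Whether an oplog is "large enough" depends on its size relative to the write load. The arithmetic below is a back-of-the-envelope sketch: the 5% figure is the 64-bit default quoted in the monitoring section of this patch, while the disk and write-throughput numbers are assumed examples, not measurements.

```javascript
// Back-of-the-envelope oplog sizing. All inputs are assumed examples.
const freeDiskBytes = 500 * 1024 ** 3; // 500 GB of available disk
const defaultOplogBytes = freeDiskBytes / 20; // the 5% default on 64-bit systems

// Suppose the workload generates 2 GB of oplog entries per hour.
const oplogBytesPerHour = 2 * 1024 ** 3;
const hoursOfBuffer = defaultOplogBytes / oplogBytesPerHour;

console.log(defaultOplogBytes / 1024 ** 3); // 25 (GB of oplog)
console.log(hoursOfBuffer); // 12.5: a secondary can fall at most ~12.5 hours behind
```

If maintenance windows or bursts of load can exceed that buffer, a larger oplog is warranted; see :doc:`/tutorial/change-oplog-size`.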
- If you perform very large data loads on a regular basis but fail to - set the appropriate write concern, the large volume of write traffic - on the primary will always take precedence over read requests from - secondaries. This will significantly slow replication by severely - reducing the numbers of reads that the secondaries can make on the - oplog in order to update themselves. + To set oplog size, see :doc:`/tutorial/change-oplog-size`. - The oplog is circular. When it is full, it begins overwriting the - oldest data with the newest. If the secondaries have not caught up in - their reads, they reach a point where they no longer can access - certain updates. The secondaries become stale. +- **Failure to Use Appropriate Write Concern in a High-Write Environment** - To prevent this, use "Write Concern" to tell Mongo to always perform a - safe write after a designated number of inserts, such as after every - 1,000 inserts. This provides a space for the secondaries to catch up with the - primary. Setting a write concern slightly slows down the data load, but it keeps your - secondaries from going stale. + If the primary is making a very high number of writes and if you have + not set the appropriate write concern, the secondaries will not be + able to read the oplog fast enough to keep up with changes. Write + requests take precedence over read requests, and a very large number + of writes will significantly reduce the numbers of reads the + secondaries can make on the oplog in order to update themselves. + + The replication lag can grow to the point that the oplog overwrites + commands that the secondaries have not yet read. The oplog is a + circular buffer, and when full it erases the oldest commands in order + to write new ones. If the secondaries get too far behind in their + reads, they reach a point where they no longer can access certain + updates, and so the secondaries become stale. 
+ + To prevent this, use "write concern" to tell MongoDB to always perform + a safe write after a designated number of inserts, such as after every + 1,000 inserts. This provides a space for the secondaries to catch up + with the primary. Setting a write concern does slightly slow down the + data load, but it keeps your secondaries from going stale. See :ref:`replica-set-write-concern` for more information. -If you do this, and your driver supports it, I recommend that - you use a mode of 'majority'. - - The exact way you use Safe Mode depends on what driver you're using - for your data load program. You can read more about Safe Mode here: - - http://www.mongodb.org/display/DOCS/getLastError+Command - http://www.mongodb.org/display/DOCS/Verifying+Propagation+of+Writes+with+getLastError - - -take precedence over requests from the secondaries to read the oplog and update themselves. - Write requests have priority over read requests. This will significantly - -the read requests from the secondaries from reading the replication data - from the oplog. Secondaries must be able to and significantly slow - down replication to the point that the oplog overwrites commands that - the secondaries have not yet read. - - You can monitor how fast replication occurs by watching the oplog time - in the "replica" graph in MMS. 
- Failover and Recovery ~~~~~~~~~~~~~~~~~~~~~ From dd12929ebfbec5953a4c7d22a4e908f695c6098a Mon Sep 17 00:00:00 2001 From: Bob Grabar Date: Tue, 28 Aug 2012 10:30:05 -0400 Subject: [PATCH 4/4] DOCS-403 removed bullet on oplog size, and other edits --- source/administration/replica-sets.txt | 48 ++++++++++---------------- 1 file changed, 19 insertions(+), 29 deletions(-) diff --git a/source/administration/replica-sets.txt b/source/administration/replica-sets.txt index 4a4d4d01e74..affc8639860 100644 --- a/source/administration/replica-sets.txt +++ b/source/administration/replica-sets.txt @@ -577,38 +577,28 @@ Possible causes of replication lag include: Use the :term:`database profiler` to see if there are slow queries or long-running operations that correspond to the incidences of lag. -- **Oplog Size is Too Small for the Data Load** - - If you do not set your oplog large enough, the oplog overwrites old - data before the secondaries can read it. The oplog is a circular - buffer, and when full it erases the oldest commands in order to write - new ones. If your oplog size is too small, the secondaries reach a - point where they no longer can access certain updates. The secondaries - become stale. - - To set oplog size, see :doc:`/tutorial/change-oplog-size`. - -- **Failure to Use Appropriate Write Concern in a High-Write Environment** - - If the primary is making a very high number of writes and if you have - not set the appropriate write concern, the secondaries will not be - able to read the oplog fast enough to keep up with changes. Write - requests take precedence over read requests, and a very large number - of writes will significantly reduce the numbers of reads the - secondaries can make on the oplog in order to update themselves. - - The replication lag can grow to the point that the oplog overwrites - commands that the secondaries have not yet read. The oplog is a - circular buffer, and when full it erases the oldest commands in order - to write new ones. 
If the secondaries get too far behind in their - reads, they reach a point where they no longer can access certain - updates, and so the secondaries become stale. +- **Appropriate Write Concern** + + If you are performing a large data load that requires a very high + number of writes to the primary, and if you have not set the + appropriate write concern, the secondaries will not be able to read + the oplog fast enough to keep up with changes. Write requests take + precedence over read requests, and a very large number of writes will + significantly reduce the number of reads the secondaries can make + from the oplog in order to update themselves. + + The replication lag can grow to the point that the oplog overwrites + commands that the secondaries have not yet read. The oplog is a capped + collection, and when full it erases the oldest commands in order to + write new ones. If the secondaries get too far behind in their reads, + they reach a point where they no longer have access to certain + updates, and they become stale. To prevent this, use "write concern" to tell MongoDB to always perform a safe write after a designated number of inserts, such as after every - 1,000 inserts. This provides a space for the secondaries to catch up - with the primary. Setting a write concern does slightly slow down the - data load, but it keeps your secondaries from going stale. + 1,000 inserts. This provides a space for the secondaries to perform + reads and catch up with the primary. Using safe writes slightly slows + down the data load but keeps your secondaries from going stale. See :ref:`replica-set-write-concern` for more information.
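The periodic safe write recommended above can be sketched as a toy simulation. Nothing here is driver API; the oplog capacity and the secondary's apply rate are invented numbers chosen to show the mechanism: blocking every ``batchSize`` inserts bounds how far the load can run ahead, keeping the secondary inside the oplog window.

```javascript
// Toy simulation of a bulk load that performs a "safe write" after every
// `batchSize` inserts, blocking until the secondary has applied everything
// written so far.
const OPLOG_CAPACITY = 5000; // ops retained before the oldest are overwritten
const batchSize = 1000;      // safe write after every 1,000 inserts

let primaryOps = 0;   // ops written on the primary
let secondaryOps = 0; // ops applied on the secondary
let maxLag = 0;

for (let i = 0; i < 20000; i++) {
  primaryOps += 1;
  // Under heavy write traffic the secondary applies only ~1 op per 2 written.
  if (i % 2 === 0) secondaryOps += 1;
  maxLag = Math.max(maxLag, primaryOps - secondaryOps);
  // The safe write: block until the secondary has fully caught up.
  if (primaryOps % batchSize === 0) secondaryOps = primaryOps;
}

console.log(maxLag); // 500: the lag never exceeds about half a batch
console.log(maxLag < OPLOG_CAPACITY); // true: the secondary never falls off the oplog
```

Without the catch-up step, the lag in this model would grow to roughly 10,000 ops, double the oplog's capacity, and the secondary would go stale.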