@@ -540,6 +540,9 @@ Identify replication lag by checking the value of
using the :method:`rs.status()` function in the :program:`mongo`
shell.

+ Also, you can monitor how fast replication occurs by watching the oplog
+ time in the "replica" graph in MMS.
+
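+ For example, one way to check lag from the :program:`mongo` shell (a
+ minimal sketch; the exact fields in the :method:`rs.status()` output
+ vary by server version):
+
+ .. code-block:: javascript
+
+    // Compare each secondary's last applied optime with the primary's.
+    var status = rs.status();
+    var primary = status.members.filter(function (m) {
+        return m.stateStr === "PRIMARY";
+    })[0];
+    status.members.forEach(function (m) {
+        if (m.stateStr === "SECONDARY") {
+            // Subtracting Dates yields milliseconds in the shell.
+            print(m.name + ": " +
+                  (primary.optimeDate - m.optimeDate) / 1000 +
+                  " seconds behind");
+        }
+    });
+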
Possible causes of replication lag include:

- **Network Latency**
@@ -567,85 +570,48 @@ Possible causes of replication lag include:
- **Concurrency**

In some cases, long-running operations on the primary can block
- replication on secondaries. You can use
- :term:`write concern` to prevent write operations from returning
- if replication cannot keep up with the write load.
+ replication on secondaries. You can use :term:`write concern` to
+ prevent write operations from returning if replication cannot keep up
+ with the write load.
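+
+ For example, from the :program:`mongo` shell you can make a write wait
+ for replication before returning (a sketch; the collection name, ``w``
+ value, and timeout are illustrative):
+
+ .. code-block:: javascript
+
+    db.people.insert({ name: "test" });
+    // Block until at least 2 members have the write, or 5 seconds pass.
+    db.runCommand({ getLastError: 1, w: 2, wtimeout: 5000 });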

Use the :term:`database profiler` to see if there are slow queries
or long-running operations that correspond to the incidences of lag.
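+
+ For instance, you could enable profiling of slow operations and then
+ inspect the most recent entries (a sketch; the 100 millisecond
+ threshold is an arbitrary example):
+
+ .. code-block:: javascript
+
+    // Log operations slower than 100 milliseconds to system.profile.
+    db.setProfilingLevel(1, 100);
+    // Review the five most recent slow operations.
+    db.system.profile.find().sort({ ts: -1 }).limit(5).forEach(printjson);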

- **Oplog Size is Too Small for the Data Load**

- If you perform a large number of writes for a large amount of data
-
- As commands are sent to the primary, they are recorded in the oplog.
- Secondaries update themselves by reading the oplog and applying the
- commands. The oplog is a circular buffer. When full, it erases the
- oldest commands in order to write new ones. Under times of heavy load,
- the contents of the secondaries will lag behind the contents of the
- primary. If the replication lag exceeds the amount of time buffered in
- the oplog, then the replication cannot continue. Put another way, if
- the primary overwrites that command before the secondary has a chance
- to apply it, then the replication has failed – there are commands that
- have been applied on the primary that the secondary is not able to
- apply.
-
- See the documentation for :doc:`/tutorial/change-oplog-size` for more information.
-
- - **Read Starvation**
-
- The secondaries cannot are not able to read the oplog fast enough, and the
- oplog writes over old data before the secondaries can read it. This
- can happen if you are reading a large amount of data but have not
- set the oplog large enough. 10gen recommends an oplog time of
- primary was inundated with writes to the point where replication
- (the secondaries running queries to get the changes from the oplog)
- cannot keep up. This can lead to a lag on the secondaries that
- ultimately becomes larger than the oplog on the primary.
-
- - **Failure to Use Appropriate Write Concern in a High-Write Environment**
+ If you do not set your oplog large enough, the oplog overwrites old
+ data before the secondaries can read it. The oplog is a circular
+ buffer, and when full it erases the oldest commands in order to write
+ new ones. If your oplog size is too small, the secondaries reach a
+ point where they can no longer access certain updates and become
+ stale.
-
- If you perform very large data loads on a regular basis but fail to
- set the appropriate write concern, the large volume of write traffic
- on the primary will always take precedence over read requests from
- secondaries. This will significantly slow replication by severely
- reducing the numbers of reads that the secondaries can make on the
- oplog in order to update themselves.
+
+ To set oplog size, see :doc:`/tutorial/change-oplog-size`.
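+
+ To see how much time your current oplog covers, you can use the shell
+ helper below (a sketch; the exact output format varies by version):
+
+ .. code-block:: javascript
+
+    // On the primary: prints the configured oplog size and the time
+    // span ("log length start to end") currently held in the oplog.
+    db.printReplicationInfo();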
-
- The oplog is circular. When it is full, it begins overwriting the
- oldest data with the newest. If the secondaries have not caught up in
- their reads, they reach a point where they no longer can access
- certain updates. The secondaries become stale.
+
+ - **Failure to Use Appropriate Write Concern in a High-Write Environment**
+
- To prevent this, use "Write Concern" to tell Mongo to always perform a
- safe write after a designated number of inserts, such as after every
- 1,000 inserts. This provides a space for the secondaries to catch up with the
- primary. Setting a write concern slightly slows down the data load, but it keeps your
- secondaries from going stale.
+ If the primary performs a very high number of writes and you have not
+ set the appropriate write concern, the secondaries will not be able
+ to read the oplog fast enough to keep up with changes. Write requests
+ take precedence over read requests, and a very large number of writes
+ will significantly reduce the number of reads the secondaries can
+ make on the oplog in order to update themselves.
+
+ The replication lag can then grow to the point that the oplog, which
+ is a circular buffer, overwrites commands that the secondaries have
+ not yet read. If the secondaries get too far behind in their reads,
+ they reach a point where they can no longer access certain updates,
+ and so the secondaries become stale.
+
+ To prevent this, use "write concern" to tell MongoDB to always perform
+ a safe write after a designated number of inserts, such as after every
+ 1,000 inserts. This gives the secondaries space to catch up with the
+ primary. Setting a write concern slightly slows down the data load,
+ but it keeps your secondaries from going stale.
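+
+ As an illustration, a bulk-load script could issue a replicated safe
+ write after every 1,000 inserts (a sketch; ``docs`` is a placeholder
+ array, and ``w: "majority"`` requires a server and driver that support
+ it):
+
+ .. code-block:: javascript
+
+    for (var i = 0; i < docs.length; i++) {
+        db.people.insert(docs[i]);
+        if ((i + 1) % 1000 === 0) {
+            // Wait for a majority of members to acknowledge before
+            // continuing, giving the secondaries a chance to catch up.
+            db.runCommand({ getLastError: 1, w: "majority",
+                            wtimeout: 10000 });
+        }
+    }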

See :ref:`replica-set-write-concern` for more information.

- If you do this, and your driver supports it, I recommend that
- you use a mode of 'majority'.
-
- The exact way you use Safe Mode depends on what driver you're using
- for your data load program. You can read more about Safe Mode here:
-
- http://www.mongodb.org/display/DOCS/getLastError+Command
- http://www.mongodb.org/display/DOCS/Verifying+Propagation+of+Writes+with+getLastError
-
-
- take precedence over requests from the secondaries to read the oplog and update themselves.
- Write requests have priority over read requests. This will significantly
-
- the read requests from the secondaries from reading the replication data
- from the oplog. Secondaries must be able to and significantly slow
- down replication to the point that the oplog overwrites commands that
- the secondaries have not yet read.
-
- You can monitor how fast replication occurs by watching the oplog time
- in the "replica" graph in MMS.
-
Failover and Recovery
~~~~~~~~~~~~~~~~~~~~~