diff --git a/source/administration/replica-sets.txt b/source/administration/replica-sets.txt index 5a4ca794750..4b892cb749b 100644 --- a/source/administration/replica-sets.txt +++ b/source/administration/replica-sets.txt @@ -33,6 +33,7 @@ suggestions for administers of replica sets. - :doc:`/tutorial/change-hostnames-in-a-replica-set` - :doc:`/tutorial/convert-secondary-into-arbiter` - :doc:`/tutorial/reconfigure-replica-set-with-unavailable-members` + - :doc:`/tutorial/recover-data-following-unexpected-shutdown` .. _replica-set-node-configurations: .. _replica-set-member-configurations: @@ -365,7 +366,8 @@ the following to prepare the new member's :term:`data directory `: difference in the amount of time between the most recent operation and the most recent operation to the database exceeds the length of the :term:`oplog` on the existing members, then the new instance will have - to completely re-synchronize. + to completely resynchronize, as described in + :ref:`replica-set-resync-stale-member`. Use :method:`db.printReplicationInfo()` to check the current state of replica set members with regards to the oplog. @@ -558,6 +560,89 @@ the oplog. For a detailed procedure, see .. include:: /includes/procedure-change-oplog-size.rst +.. _replica-set-resync-stale-member: + +Resyncing a Member of a Replica Set +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +When a member's data falls too far behind the :term:`oplog` to catch up, +the member and its data are considered "stale". A member's data is too +far behind when the oplog on the :term:`primary` has overwritten its +entries before the member has copied them. When that occurs, you must +resync the member by removing its data and replacing it with up-to-date +data. + +To do so, use one of the following approaches: + +- Restart the :program:`mongod` with an empty data directory and let MongoDB's + automatic syncing feature restore the data. This approach requires + fewer steps but can take longer to replace the data. 
+ + See :ref:`replica-set-auto-resync-stale-member`. + +- Restart the machine with a copy of a recent data directory from + another member in the :term:`replica set`. This procedure can replace + the data more quickly but requires more manual steps. + + See :ref:`replica-set-resync-by-copying`. + +.. index:: replica set; resync +.. _replica-set-auto-resync-stale-member: + +Automatically Resync a Stale Member +``````````````````````````````````` + +This procedure relies on MongoDB's automatic syncing feature to restore +the data on the stale member. For an overview of how MongoDB syncs +:term:`replica sets `, see :ref:`replica-set-syncing`. + +To resync the stale member: + +1. Stop the member's :program:`mongod` instance using the + :option:`mongod --shutdown` option. Make sure to set + :option:`--dbpath ` to the member's data directory. + + .. code-block:: sh + + mongod --dbpath /data/db/ --shutdown + +#. Delete all data and subdirectories from the member's data directory + such that the directory is empty. + +#. Restart the :program:`mongod` instance on the member. Consider the + following example: + + .. code-block:: sh + + mongod --dbpath /data/db/ --replSet rsProduction + + MongoDB resyncs the member. Resyncing may take a long time, depending on + the size of the database and speed of the network. Also, + this puts a load on the member being synced from. That + member might not be able to keep a working set in memory. + +.. index:: replica set; resync +.. _replica-set-resync-by-copying: + +Resync by Copying Data from Another Member +`````````````````````````````````````````` + +This approach uses the data directory of an existing member to "seed" +the stale member. The data must be recent enough to allow the new member +to catch up with the :term:`oplog`. + +To resync by copying data from another member, use one of the following +approaches: + +- Create a snapshot of another member's data and then restore that + snapshot to the stale member. 
Use the snapshot procedures in + :doc:`/administration/backups`. + +- Lock another member's data with the :method:`db.fsyncLock()` + command, copy all of the data in the data directory, and then restore the data to the stale + member. Use the procedures for backup storage in + :doc:`/administration/backups`. + .. _replica-set-security: Replica Set Security diff --git a/source/core/replication-internals.txt b/source/core/replication-internals.txt index 7a624c2af0f..af174131f5d 100644 --- a/source/core/replication-internals.txt +++ b/source/core/replication-internals.txt @@ -4,9 +4,6 @@ Replication Internals .. default-domain:: mongodb -Synopsis --------- - This document provides a more in-depth explanation of the internals and operation of :term:`replica set` features. This material is not necessary for normal operation or application development but may be useful for @@ -77,11 +74,10 @@ the following collections: .. _replica-set-oplog: .. _replica-set-internals-oplog: -Oplog ------ +Oplog Internals +--------------- -For an explanation of the oplog, see the :ref:`replica-set-oplog-sizing` -topic in the :doc:`/core/replication` document. +For an explanation of the oplog, see :ref:`replica-set-oplog-sizing`. Under various exceptional situations, updates to a :term:`secondary's ` oplog might @@ -113,8 +109,8 @@ Data Integrity .. index:: replica set; read preferences -Read Preferences -~~~~~~~~~~~~~~~~ +Read Preference Internals +~~~~~~~~~~~~~~~~~~~~~~~~~ MongoDB uses :term:`single-master replication` to ensure that the database remains consistent. However, clients may modify the @@ -172,8 +168,8 @@ for your data set is crucial. .. index:: replica set; security -Security --------- +Security Internals +------------------ Administrators of replica sets also have unique :ref:`monitoring ` and :ref:`security ` @@ -188,8 +184,8 @@ modify the configuration of an existing replica set. .. index:: replica set; failover .. 
_replica-set-election-internals: -Elections ---------- +Election Internals +------------------ Elections are the process :term:`replica set` members use to select which member should become :term:`primary`. A primary is the only member in the replica @@ -297,6 +293,8 @@ and a majority of servers in one data center and one server in another. .. index:: replica set; sync +.. _replica-set-syncing: + Syncing ------- @@ -327,3 +325,5 @@ For example: alternate facility, and if you add another secondary to the alternate facility, the new secondary will likely sync from the existing secondary because it is closer than the primary. + +.. seealso:: :ref:`replica-set-resync-stale-member` diff --git a/source/core/replication.txt b/source/core/replication.txt index 32300a4a739..878ecb4748d 100644 --- a/source/core/replication.txt +++ b/source/core/replication.txt @@ -353,9 +353,13 @@ activity of your MongoDB-based application are reads and you are writing a small amount of data, you may find that you need a much smaller oplog. -For a further understanding of oplog behavior, see the -:ref:`replica-set-oplog` topic in the :doc:`/core/replication-internals` -document. +To view oplog status, including the size and the time range of +operations, issue the :method:`db.printReplicationInfo()` method. For +more information on oplog status, see +:ref:`replica-set-troubleshooting-check-oplog-size`. + +For an advanced understanding of oplog behavior, see +:ref:`replica-set-oplog` and :ref:`replica-set-syncing`. Replica Set Deployment ~~~~~~~~~~~~~~~~~~~~~~ diff --git a/source/replication.txt b/source/replication.txt index 8a9962f7cea..e2565b9f01f 100644 --- a/source/replication.txt +++ b/source/replication.txt @@ -56,6 +56,7 @@ operations in detail: tutorial/change-hostnames-in-a-replica-set tutorial/convert-secondary-into-arbiter tutorial/reconfigure-replica-set-with-unavailable-members + tutorial/recover-data-following-unexpected-shutdown .. 
_replication-reference: diff --git a/source/tutorial/reconfigure-replica-set-with-unavailable-members.txt b/source/tutorial/reconfigure-replica-set-with-unavailable-members.txt index 9aa44974d86..2801552f2fe 100644 --- a/source/tutorial/reconfigure-replica-set-with-unavailable-members.txt +++ b/source/tutorial/reconfigure-replica-set-with-unavailable-members.txt @@ -1,6 +1,6 @@ -================================================== -Reconfigure a Replica Set with Unavailable Members -================================================== +=============================================== +Reconfigure a Replica Set when Members are Down +=============================================== .. default-domain:: mongodb @@ -23,9 +23,6 @@ members can reach a majority. See :ref:`replica-set-elections-and-network-partitions` for more information on this situation. -This document provides the following options for reconfiguring a replica -set when a **majority** of members are accessible: - .. index:: replica set; reconfiguration .. _replica-set-force-reconfiguration: diff --git a/source/tutorial/recover-data-following-unexpected-shutdown.txt b/source/tutorial/recover-data-following-unexpected-shutdown.txt index c40db97a63e..4ac24cf21e4 100644 --- a/source/tutorial/recover-data-following-unexpected-shutdown.txt +++ b/source/tutorial/recover-data-following-unexpected-shutdown.txt @@ -9,21 +9,22 @@ representation of the data files will likely reflect an inconsistent state which could lead to data corruption. To prevent data inconsistency and corruption, always shut down the -database cleanly, and use the :ref:`durability journaling +database cleanly and use the :ref:`durability journaling `. The journal writes data to disk every 100 -milliseconds by default, and ensures that MongoDB will be able to +milliseconds by default and ensures that MongoDB can recover to a consistent state even in the case of an unclean shutdown due to power loss or other system failure. 
If you are *not* running as part of a :term:`replica set` **and** do -*not* have journaling enabled use the following procedure to recover +*not* have journaling enabled, use the following procedure to recover data that may be in an inconsistent state. If you are running as part of a replica set, you should *always* restore from a backup or restart the :program:`mongod` instance with an empty :setting:`dbpath` and allow MongoDB to resync the data. -.. seealso:: The ":doc:`/administration`" documents and the - documentation of the :setting:`repair`, :setting:`repairpath`, and +.. seealso:: The :doc:`/administration` documents, including + :ref:`replica-set-syncing`, and the + documentation on the :setting:`repair`, :setting:`repairpath`, and :setting:`journal` settings. .. [#clean-shutdown] To ensure a clean shut down, use the @@ -41,7 +42,7 @@ When you are aware of a :program:`mongod` instance running without journaling that stops unexpectedly **and** you're not running with replication, you should always run the repair operation before starting MongoDB again. If you're using replication, then restore from -a backup and allow replication to synchronize your data. +a backup and allow replication to :ref:`synchronize ` your data. If the ``mongod.lock`` file in the data directory specified by :setting:`dbpath`, ``/data/db`` by default, is *not* a zero-byte file, @@ -72,7 +73,7 @@ Overview Do not use this procedure to recover a member of a :term:`replica set`. Instead you should either restore from a :doc:`backup ` - or re-sync from an intact member of the set. + or resync from an intact member of the set, as described in :ref:`replica-set-resync-stale-member`. There are two processes to repair data files that result from an unexpected shutdown: @@ -171,4 +172,4 @@ If you are not running with journaling, and your database shuts down unexpectedly for *any* reason, you should always proceed *as if* your database is in an inconsistent and likely corrupt state. 
If at all possible restore from :doc:`backup ` or if running as a :term:`replica -set` re-sync from an intact member of the set. +set` resync from an intact member of the set, as described in :ref:`replica-set-resync-stale-member`.