From 861b3f4d04aeb6c7d3d3247fb499c37177f9b6fb Mon Sep 17 00:00:00 2001
From: Bob Grabar
Date: Mon, 17 Sep 2012 12:05:35 -0400
Subject: [PATCH 1/4] DOCS-437 added oplog error to rs troubleshooting

---
 source/administration/replica-sets.txt | 37 ++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/source/administration/replica-sets.txt b/source/administration/replica-sets.txt
index 8db70382a2d..a6cbfab36c1 100644
--- a/source/administration/replica-sets.txt
+++ b/source/administration/replica-sets.txt
@@ -652,6 +652,8 @@ Possible causes of replication lag include:
 Failover and Recovery
 ~~~~~~~~~~~~~~~~~~~~~
 
+.. TODO Revisit whether this belongs in troubleshooting. Perhaps this should be an H2 before troubleshooting.
+
 Replica sets feature automated failover. If the :term:`primary`
 goes offline or becomes unresponsive and a majority of the original
 set members can still connect to each other, the set will elect a new
@@ -695,3 +697,38 @@ You can prevent rollbacks by ensuring safe writes by using
 the appropriate :term:`write concern`.
 
 .. include:: /includes/seealso-elections.rst
+
+Oplog Entry Error
+~~~~~~~~~~~~~~~~~
+
+.. TODO link this topic to assertion 13290 once assertion guide exists.
+
+If you receive the following errors:
+
+.. code-block:: javascript
+
+   replSet error fatal couldn't query the local local.oplog.rs collection. Terminating mongod after 30 seconds.
+   bad replSet oplog entry?
+
+Then the errors might indicate that the ``ts`` field in the last oplog
+entry is of the wrong data type.
+
+You can check this with the following command:
+
+.. code-block:: javascript
+
+   db.oplog.rs.find({ts:{$type:17}}).({$natural:-1}).limit(1)
+
+If the ts field is of the wrong type, you receive the same error.
+
+To verify there are no other issues, run the following command:
+
+   db.oplog.rs.find().({$natural:-1}).limit(1)
+
+If there are no other issues, this returns without error.
+
+To fix the ``ts`` data type, run the following:
+
+.. code-block:: javascript
+
+   db.oplog.rs.update({ts:{t:1234567891000,i:1234}}, {$set:{ts:new Timestamp(1234567891000, 1234)}})

From 30980cc17d38725221e917fd8ae1c9501d5468a4 Mon Sep 17 00:00:00 2001
From: Bob Grabar
Date: Tue, 18 Sep 2012 13:49:14 -0400
Subject: [PATCH 2/4] DOCS-437 troubleshooting an oplog error

---
 source/administration/replica-sets.txt | 48 +++++++++++++++++++-------
 1 file changed, 36 insertions(+), 12 deletions(-)

diff --git a/source/administration/replica-sets.txt b/source/administration/replica-sets.txt
index a6cbfab36c1..58078db2eb3 100644
--- a/source/administration/replica-sets.txt
+++ b/source/administration/replica-sets.txt
@@ -698,36 +698,60 @@ the appropriate :term:`write concern`.
 
 .. include:: /includes/seealso-elections.rst
 
-Oplog Entry Error
-~~~~~~~~~~~~~~~~~
+Oplog Entry Timestamp Error
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 .. TODO link this topic to assertion 13290 once assertion guide exists.
 
-If you receive the following errors:
+If you receive the following error:
 
 .. code-block:: javascript
 
    replSet error fatal couldn't query the local local.oplog.rs collection. Terminating mongod after 30 seconds.
    bad replSet oplog entry?
 
-Then the errors might indicate that the ``ts`` field in the last oplog
-entry is of the wrong data type.
+Then the value for the ``ts`` field in the last oplog entry might be of
+the wrong data type. The correct data type is Timestamp.
 
-You can check this with the following command:
+You can check the data type by running the following two queries. If the
+data type is correct, the queries return the same document; if
+incorrect, they return different documents.
+
+First run a query to return the last document in the oplog, no matter
+its data type:
+
+.. code-block:: javascript
+
+   db.oplog.rs.find().sort({$natural:-1}).limit(1)
+
+Then run a query to return the last document in the oplog where the
+``ts`` value is a Timestamp. Use the :operator:`$type` operator to query
+for type ``17``, which is the Timestamp data type.
 
 .. code-block:: javascript
 
-   db.oplog.rs.find({ts:{$type:17}}).({$natural:-1}).limit(1)
+   db.oplog.rs.find({ts:{$type:17}}).sort({$natural:-1}).limit(1)
+
+.. example::
+
+   As an example, if the first query returns this as the last oplog entry:
+
+   .. code-block:: javascript
+
+      { "h" : NumberLong("8191276672478122996"), "ns" : "", "o" : { "msg" : "Reconfig set", "version" : 4 }, "op" : "n", "ts" : true }
 
-If the ts field is of the wrong type, you receive the same error.
+   And the second query returns this as the last entry where ``ts`` is a Timestamp:
 
-To verify there are no other issues, run the following command:
+   .. code-block:: javascript
 
-   db.oplog.rs.find().({$natural:-1}).limit(1)
+      { "ts" : Timestamp(1347982454000, 1), "h" : NumberLong("6188469075153256465"), "op" : "n", "ns" : "", "o" : { "msg" : "Reconfig set", "version" : 3 } }
 
-If there are no other issues, this returns without error.
+   Then the value for the ``ts`` field in the last oplog entry is of the
+   wrong data type.
 
-To fix the ``ts`` data type, run the following:
+To fix the ``ts`` data type, you can run the following update. Note,
+however, that this update scans the whole oplog and can take a lot of
+time to pull the oplog into memory:
 
 .. code-block:: javascript
 

From aaee068078c81ae80d9adddea9de1f717a0b9166 Mon Sep 17 00:00:00 2001
From: Bob Grabar
Date: Wed, 19 Sep 2012 10:05:40 -0400
Subject: [PATCH 3/4] DOCS-437 edit to the error message

---
 source/administration/replica-sets.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/source/administration/replica-sets.txt b/source/administration/replica-sets.txt
index 58078db2eb3..bc58863867e 100644
--- a/source/administration/replica-sets.txt
+++ b/source/administration/replica-sets.txt
@@ -708,7 +708,7 @@ If you receive the following error:
 .. code-block:: javascript
 
    replSet error fatal couldn't query the local local.oplog.rs collection. Terminating mongod after 30 seconds.
-   bad replSet oplog entry?
+   [rsStart] bad replSet oplog entry?
 
 Then the value for the ``ts`` field in the last oplog entry might be of
 the wrong data type. The correct data type is Timestamp.

From d4771f9b4326baee723a6201ad81d8f083ef7e8c Mon Sep 17 00:00:00 2001
From: Bob Grabar
Date: Wed, 19 Sep 2012 13:51:12 -0400
Subject: [PATCH 4/4] DOCS-437 troubleshooting an oplog error: final edits

---
 source/administration/replica-sets.txt | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/source/administration/replica-sets.txt b/source/administration/replica-sets.txt
index bc58863867e..512369d819f 100644
--- a/source/administration/replica-sets.txt
+++ b/source/administration/replica-sets.txt
@@ -713,12 +713,11 @@ If you receive the following error:
 Then the value for the ``ts`` field in the last oplog entry might be of
 the wrong data type. The correct data type is Timestamp.
 
-You can check the data type by running the following two queries. If the
+You can check the data type by running the following two queries against the oplog. If the
 data type is correct, the queries return the same document; if
 incorrect, they return different documents.
 
-First run a query to return the last document in the oplog, no matter
-its data type:
+First run a query to return the last document in the oplog:
 
 .. code-block:: javascript
 
@@ -732,13 +731,16 @@ for type ``17``, which is the Timestamp data type.
 
    db.oplog.rs.find({ts:{$type:17}}).sort({$natural:-1}).limit(1)
 
+If the queries don't return the same document, then the last document in
+the oplog has the wrong data type in the ``ts`` field.
+
 .. example::
 
    As an example, if the first query returns this as the last oplog entry:
 
    .. code-block:: javascript
 
-      { "h" : NumberLong("8191276672478122996"), "ns" : "", "o" : { "msg" : "Reconfig set", "version" : 4 }, "op" : "n", "ts" : true }
+      { "ts" : {t: 1347982456000, i: 1}, "h" : NumberLong("8191276672478122996"), "op" : "n", "ns" : "", "o" : { "msg" : "Reconfig set", "version" : 4 } }
 
    And the second query returns this as the last entry where ``ts`` is a Timestamp:
 
@@ -755,4 +757,4 @@ time to pull the oplog into memory:
 
 .. code-block:: javascript
 
-   db.oplog.rs.update({ts:{t:1234567891000,i:1234}}, {$set:{ts:new Timestamp(1234567891000, 1234)}})
+   db.oplog.rs.update({ts:{t:1347982456000,i:1}}, {$set:{ts:new Timestamp(1347982456000, 1)}})
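
The two-query comparison described in the final revision can be sketched outside the shell as follows. This is only a minimal illustration in plain JavaScript, not mongo shell code: the ``Timestamp`` class, the in-memory ``oplog`` array, and the helper names ``lastEntry``, ``lastTimestampEntry``, and ``lastTsHasWrongType`` are hypothetical stand-ins for ``db.oplog.rs`` and the ``$type: 17`` filter, and the sample documents are trimmed versions of the example entries above.

```javascript
// Hypothetical stand-in for the shell's Timestamp type (an assumption for
// this sketch, not the real BSON class).
class Timestamp {
  constructor(t, i) {
    this.t = t;
    this.i = i;
  }
}

// Mirrors find().sort({$natural:-1}).limit(1):
// the last document in insertion order, whatever the type of ts.
function lastEntry(oplog) {
  return oplog.length > 0 ? oplog[oplog.length - 1] : null;
}

// Mirrors find({ts:{$type:17}}).sort({$natural:-1}).limit(1):
// the last document whose ts value is a Timestamp.
function lastTimestampEntry(oplog) {
  for (let k = oplog.length - 1; k >= 0; k--) {
    if (oplog[k].ts instanceof Timestamp) {
      return oplog[k];
    }
  }
  return null;
}

// If the two lookups return different documents, the last entry's ts
// has the wrong data type.
function lastTsHasWrongType(oplog) {
  return lastEntry(oplog) !== lastTimestampEntry(oplog);
}

// Sample data modeled on the example entries in the patch.
const oplog = [
  { ts: new Timestamp(1347982454000, 1), op: "n", o: { msg: "Reconfig set", version: 3 } },
  { ts: { t: 1347982456000, i: 1 }, op: "n", o: { msg: "Reconfig set", version: 4 } },
];

console.log(lastTsHasWrongType(oplog));
```

With the sample data the two lookups return different documents, so the check reports a wrong ``ts`` type. Replacing the plain ``{t, i}`` object with a ``Timestamp`` instance, which is what the update in the patch does on the real collection, makes both lookups agree.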