This only affects sharded clusters, and could occur frequently for small sharded clusters.
So the current process is:

1. start tailed oplogs
2. start mongodumps of each replica set with `--oplog`
3. finish mongodumps
4. stop tailed oplogs
The consistent end point is then evaluated using `Resolver.get_consistent_end_ts()`, which looks at the tailed oplogs and appends the oplog entries from the tailed oplogs to the mongodump oplogs.
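As I understand it, that step is roughly equivalent to the following (a simplified sketch of the described behaviour, not the actual Resolver code; oplog entries are modelled as plain dicts with comparable `ts` values):

```python
def apply_tailed_oplog(dump_oplog, tailed_oplog, max_end_ts):
    """Append tailed entries that are newer than the dump's last oplog entry
    but not beyond max_end_ts, so every dump stops at the same timestamp."""
    last_dump_ts = dump_oplog[-1]["ts"]
    for entry in tailed_oplog:
        if last_dump_ts < entry["ts"] <= max_end_ts:
            dump_oplog.append(entry)
    return dump_oplog
```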
However, in its current state, it is common for the resulting backups to not be point-in-time consistent.
While testing, I found that the optime on the config servers updates much less frequently; in fact, it looks like it can be as infrequent as every 10 seconds:
```
configReplSet:PRIMARY> db.oplog.rs.find({}, {'ts': 1, 'op': 1, 'o': 1}).sort({$natural:-1})
{ "ts" : Timestamp(1496305850, 1), "op" : "u", "o" : { "$set" : { "ping" : ISODate("2017-06-01T08:30:50.770Z") } } }
{ "ts" : Timestamp(1496305848, 2), "op" : "u", "o" : { "$set" : { "ping" : ISODate("2017-06-01T08:30:48.833Z"), "waiting" : true } } }
{ "ts" : Timestamp(1496305848, 1), "op" : "u", "o" : { "$set" : { "ping" : ISODate("2017-06-01T08:30:48.830Z"), "up" : NumberLong(8444197), "waiting" : false } } }
{ "ts" : Timestamp(1496305839, 1), "op" : "u", "o" : { "$set" : { "ping" : ISODate("2017-06-01T08:30:39.157Z") } } }
{ "ts" : Timestamp(1496305838, 2), "op" : "u", "o" : { "$set" : { "ping" : ISODate("2017-06-01T08:30:38.823Z"), "waiting" : true } } }
{ "ts" : Timestamp(1496305838, 1), "op" : "u", "o" : { "$set" : { "ping" : ISODate("2017-06-01T08:30:38.817Z"), "up" : NumberLong(8444187), "waiting" : false } } }
{ "ts" : Timestamp(1496305828, 3), "op" : "u", "o" : { "$set" : { "ping" : ISODate("2017-06-01T08:30:28.961Z") } } }
{ "ts" : Timestamp(1496305828, 2), "op" : "u", "o" : { "$set" : { "ping" : ISODate("2017-06-01T08:30:28.809Z"), "waiting" : true } } }
{ "ts" : Timestamp(1496305828, 1), "op" : "u", "o" : { "$set" : { "ping" : ISODate("2017-06-01T08:30:28.804Z"), "up" : NumberLong(8444177), "waiting" : false } } }
{ "ts" : Timestamp(1496305820, 1), "op" : "u", "o" : { "$set" : { "ping" : ISODate("2017-06-01T08:30:20.765Z") } } }
{ "ts" : Timestamp(1496305818, 2), "op" : "u", "o" : { "$set" : { "ping" : ISODate("2017-06-01T08:30:18.800Z"), "waiting" : true } } }
{ "ts" : Timestamp(1496305818, 1), "op" : "u", "o" : { "$set" : { "ping" : ISODate("2017-06-01T08:30:18.796Z"), "up" : NumberLong(8444167), "waiting" : false } } }
{ "ts" : Timestamp(1496305809, 1), "op" : "u", "o" : { "$set" : { "ping" : ISODate("2017-06-01T08:30:09.151Z") } } }
{ "ts" : Timestamp(1496305808, 2), "op" : "u", "o" : { "$set" : { "ping" : ISODate("2017-06-01T08:30:08.789Z"), "waiting" : true } } }
{ "ts" : Timestamp(1496305808, 1), "op" : "u", "o" : { "$set" : { "ping" : ISODate("2017-06-01T08:30:08.782Z"), "up" : NumberLong(8444157), "waiting" : false } } }
{ "ts" : Timestamp(1496305798, 3), "op" : "u", "o" : { "$set" : { "ping" : ISODate("2017-06-01T08:29:58.955Z") } } }
{ "ts" : Timestamp(1496305798, 2), "op" : "u", "o" : { "$set" : { "ping" : ISODate("2017-06-01T08:29:58.776Z"), "waiting" : true } } }
{ "ts" : Timestamp(1496305798, 1), "op" : "u", "o" : { "$set" : { "ping" : ISODate("2017-06-01T08:29:58.771Z"), "up" : NumberLong(8444147), "waiting" : false } } }
{ "ts" : Timestamp(1496305790, 1), "op" : "u", "o" : { "$set" : { "ping" : ISODate("2017-06-01T08:29:50.764Z") } } }
{ "ts" : Timestamp(1496305788, 2), "op" : "u", "o" : { "$set" : { "ping" : ISODate("2017-06-01T08:29:48.770Z"), "waiting" : true } } }
This becomes a problem because...
The last mongodump to finish will have an oplog time near the current time. But when the dumps finish, the oplog tailers are killed off almost immediately, which means a config server oplog tailer may have a `last_ts` of up to 10 seconds ago.
When the evaluation for `get_consistent_end_ts()` is made, the `ResolverThread` then has a `max_end_ts` equal to the config server's `last_ts`. This means tailed oplog entries are appended to the mongodump oplogs only if they are prior to this `max_end_ts`; however, one of the mongodump oplogs already exceeds the `max_end_ts`. This leaves one mongodump with a higher oplog time than the others.
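To make the timing concrete, here is a small illustration with made-up numbers (the values and the way `max_end_ts` is derived are my interpretation of the behaviour described above, not taken from the code):

```python
# Illustrative numbers only: BSON Timestamps modelled as (seconds, increment)
# tuples, which compare the same way (seconds first, then increment).
config_tailer_last_ts = (1496305850, 1)  # config server only pings ~every 10s
shard_a_dump_end_ts   = (1496305855, 4)  # last mongodump finished "now"
shard_b_dump_end_ts   = (1496305849, 7)

# As described above, max_end_ts ends up being the config server's stale
# last_ts (effectively the lowest of the tailers' last timestamps).
max_end_ts = config_tailer_last_ts

# Tailed entries are only appended up to max_end_ts. Shard B can be topped up
# to max_end_ts from its tailed oplog, but shard A's dump already contains
# entries beyond it, so it ends up ahead of the rest.
print(shard_a_dump_end_ts > max_end_ts)  # True  -> not a single point in time
print(shard_b_dump_end_ts > max_end_ts)  # False -> capped at max_end_ts
```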
I was testing this by adding around 1000 documents per second against the mongos for one of the sharded collections, and then comparing the documents.
I can work around this by adding a sleep after the mongodumps finish, giving the tailed oplog for the config server a chance to write its next timestamp, after which a point-in-time backup is made (sketched below).
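A minimal sketch of that workaround; the 15-second value is just my choice to comfortably cover the roughly 10-second config server ping interval:

```python
import time

# Crude workaround (sketch): after the mongodumps return, give the config
# server oplog tailer time to record its next ping before the tailers are
# stopped and Resolver.get_consistent_end_ts() is evaluated.
time.sleep(15)
```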
The graceful fix for this would be: once the mongodumps finish, take the last oplog timestamp from the mongodumps, and then ensure the oplog tailers exceed it before they are stopped. That way, `get_consistent_end_ts()` would be correct.
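A rough sketch of what that wait could look like, assuming tailer objects that expose a comparable `last_ts` attribute (the function name, attributes, and polling approach are all illustrative, not the project's actual API):

```python
import time

def wait_for_tailers(tailers, newest_dump_end_ts, timeout=60, poll=0.5):
    """Block until every oplog tailer has reached (or passed) the newest
    mongodump oplog timestamp, so get_consistent_end_ts() can choose an
    end timestamp that covers every dump before the tailers are stopped."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        if all(t.last_ts >= newest_dump_end_ts for t in tailers):
            return True
        time.sleep(poll)
    return False  # caller decides whether to fail the backup or proceed
```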
Please feel free to ask if you need any further information.
Thanks,