.. _recover-appdb-forced-reconfig:

======================================================================
Recover the {+appdb+} if its Replica Set Loses Majority
======================================================================

.. facet::
   :name: genre
   :values: tutorial

.. default-domain:: mongodb

.. meta::
   :keywords: forced recovery, application database, majority, primary election
   :description: How to configure the application database in your Kubernetes deployment to recover and elect a primary if it loses majority.

.. contents:: On this page
   :local:
   :backlinks: none
   :depth: 1
   :class: singlecol

In the event that the |k8s| member clusters fail and the {+appdb+}
loses the majority of its replica set's nodes needed to elect a primary,
the |k8s-op-short| doesn't automatically trigger a forced replica set reconfiguration.
You must manually initiate a forced replica set reconfiguration to restore
the {+appdb+} replica set to a healthy state.

Overview
--------

In certain severe |k8s| cluster outages, your {+appdb+}\'s
replica set deployment could lose the majority of its nodes.
For example, if you have an {+appdb+} deployment with two nodes
in ``cluster 1`` and three nodes in ``cluster 2``, and ``cluster 2`` undergoes
an outage, your {+appdb+}\'s replica set loses the
node majority needed to elect a primary. Without a primary, the {+mdbagent+}
can't reconfigure the replica set.
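
To make the example concrete, the following sketch shows what such a
deployment's ``spec.applicationDatabase.clusterSpecList`` might look like.
The cluster names are placeholders, and the ``clusterName`` and ``members``
field names are shown here only for illustration.

.. code-block:: yaml

   # Illustrative excerpt only: an Application Database deployment with
   # two nodes in "cluster 1" and three nodes in "cluster 2".
   spec:
     applicationDatabase:
       clusterSpecList:
         - clusterName: cluster-1   # remains healthy during the outage
           members: 2
         - clusterName: cluster-2   # undergoes the outage
           members: 3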

To enable rescheduling the replica set's nodes, the |k8s-op-short| must forcibly
reconfigure the :opsmgr:`Automation Configuration </reference/api/automation-config/>`
for the {+mdbagent+} so that it can deploy replica set nodes on the remaining
healthy member clusters.
To achieve this, the |k8s-op-short| sets
the :opsmgr:`replicaSets[n].force </reference/api/automation-config/automation-config-parameters/#replica-sets>`
flag in the replica set configuration.
The flag instructs the {+mdbagent+} to force the replica set to use the
current (latest) :opsmgr:`Automation Configuration version </reference/api/automation-config/automation-config-parameters/#configuration-version>`.
Using the flag allows the |k8s-op-short| to reconfigure the replica set even
when a primary node isn't elected.
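
As a rough illustration, the fragment below sketches how such a forced flag
could appear inside the replica set entry of the automation configuration,
rendered as YAML for readability. The surrounding structure and the
``currentVersion`` value are assumptions; the |k8s-op-short| manages this
configuration for you, and you don't edit it directly.

.. code-block:: yaml

   # Hypothetical excerpt of one replica set entry in the automation
   # configuration. The Operator sets the "force" flag; you never
   # modify the automation configuration by hand.
   replicaSets:
     - _id: om-db              # replica set name (placeholder)
       force:
         currentVersion: -1    # assumed value meaning "apply the latest configuration version"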

.. important::

   Forced reconfiguration of the {+appdb+} can result in undesired behavior,
   including :manual:`rollback </core/replica-set-rollbacks/#std-label-replica-set-rollbacks>`
   of :manual:`"majority" </reference/write-concern/#mongodb-writeconcern-writeconcern.-majority->`
   committed writes, which could lead to unexpected data loss.

Recover the {+appdb+} through a Forced Reconfiguration
------------------------------------------------------------------

To perform a forced reconfiguration of the {+appdb+}\'s nodes:

1. Change the :opsmgrkube:`spec.applicationDatabase.clusterSpecList` configuration
   settings to reconfigure the {+appdb+}\'s deployment on healthy
   |k8s| clusters so that the replica set can form a majority of healthy nodes.

2. Remove failed |k8s| clusters from the :opsmgrkube:`spec.applicationDatabase.clusterSpecList`,
   or scale failed |k8s| member clusters down. This way, the replica set
   doesn't count the {+appdb+}\'s nodes hosted on those clusters
   as voting members of the replica set. For example, if you have two healthy
   nodes in ``cluster 1`` and a failed ``cluster 2`` that contains three nodes,
   only two of the five replica set members are healthy (2/5). Adding one
   node to ``cluster 1`` results in a 3/6 ratio of healthy nodes to replica
   set members, which still isn't a majority.
   To form a replica set majority, you have the following options:

   - Add at least two new replica set nodes to ``cluster 1`` or to a new
     healthy |k8s| cluster. This achieves a majority (4/7), with four healthy
     nodes in a seven-member replica set.
   - Scale down a failed |k8s| cluster to zero nodes, or remove the
     cluster from the :opsmgrkube:`spec.applicationDatabase.clusterSpecList`
     entirely, and add at least one node to ``cluster 1`` so that all
     three nodes (3/3) in the replica set's StatefulSet are healthy,
     as shown in the sketch after this list.
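
   The following minimal sketch illustrates the second option. It assumes
   placeholder cluster names and the ``clusterName`` and ``members`` fields
   of the ``clusterSpecList`` entries:

   .. code-block:: yaml

      # Sketch only: the failed cluster is removed (or scaled to 0 members)
      # and one node is added to the healthy cluster, giving 3/3 healthy nodes.
      spec:
        applicationDatabase:
          clusterSpecList:
            - clusterName: cluster-1   # healthy cluster, scaled from 2 to 3 nodes
              members: 3
            # cluster-2 is removed entirely, or kept with "members: 0"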

3. Add the annotation ``"mongodb.com/v1.forceReconfigure": "true"`` at
   the top level of the ``MongoDBOpsManager`` custom resource, and ensure
   that the value ``"true"`` is a string in quotes.
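
   For example, assuming a resource named ``ops-manager`` (a placeholder name),
   the annotated resource might look like the following sketch:

   .. code-block:: yaml

      apiVersion: mongodb.com/v1
      kind: MongoDBOpsManager
      metadata:
        name: ops-manager   # placeholder resource name
        annotations:
          mongodb.com/v1.forceReconfigure: "true"   # the value must be the quoted string "true"
      spec:
        # ...the rest of your existing configuration remains unchanged...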

   Based on this annotation, the |k8s-op-short| performs a forced reconfiguration
   of the replica set in the next reconciliation process and scales
   the {+appdb+}\'s replica set nodes according to the changed
   deployment configuration.

   The |k8s-op-short| can't determine whether the nodes in the
   failed |k8s| cluster are healthy. Therefore, if the |k8s-op-short|
   can't connect to the failed member |k8s| cluster's API server, the
   |k8s-op-short| ignores the cluster during the reconciliation process
   of the {+appdb+}\'s replica set nodes.

   This means that scaling down the {+appdb+} nodes removes the
   failed processes from the replica set configuration.
   If only the API server is down but the replica set's nodes
   are still running, the |k8s-op-short| doesn't remove the Pods from the failed
   |k8s| clusters.

   To indicate that it completed the forced reconfiguration, the |k8s-op-short|
   adds the ``"mongodb.com/v1.forceReconfigurePerformed"`` annotation
   with the current timestamp as its value.
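
   After a successful forced reconfiguration, the resource's annotations might
   resemble the following sketch. The timestamp shown is only an example, and
   its exact format is an assumption.

   .. code-block:: yaml

      metadata:
        annotations:
          mongodb.com/v1.forceReconfigure: "true"
          mongodb.com/v1.forceReconfigurePerformed: "2023-01-01T00:00:00Z"   # example timestamp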

   .. important::

      The |k8s-op-short| performs only one forced reconfiguration of the
      replica set. After the replica set reaches a running state, the
      |k8s-op-short| adds the ``"mongodb.com/v1.forceReconfigurePerformed"``
      annotation to prevent itself from forcing the reconfiguration again
      in the future. Therefore, to trigger a new forced reconfiguration,
      remove one or both of the following annotations from the
      :k8sdocs:`metadata.annotations </concepts/overview/working-with-objects/annotations/>`
      of the ``MongoDBOpsManager`` custom resource:

      - ``"mongodb.com/v1.forceReconfigurePerformed"``
      - ``"mongodb.com/v1.forceReconfigure"``

4. Reapply the configuration for the changed ``MongoDBOpsManager`` custom resource
   in the |k8s-op-short|.