
Commit 018e0ae

(DOCSP-49355): Add back two missing files causing errors (#3)

* add back deleted file
* add back in code snippet

1 parent 098b64c commit 018e0ae

File tree: 2 files changed (+174, -0 lines)
@@ -0,0 +1,47 @@
kubectl apply --context "${K8S_CLUSTER_0_CONTEXT_NAME}" -n "${MDB_NAMESPACE}" -f - <<EOF
apiVersion: mongodb.com/v1
kind: MongoDB
metadata:
  name: ${RESOURCE_NAME}
spec:
  shardCount: 3
  # we don't specify mongodsPerShardCount, mongosCount and configServerCount as they don't make sense for multi-cluster
  topology: MultiCluster
  type: ShardedCluster
  version: 8.0.3
  opsManager:
    configMapRef:
      name: mdb-org-project-config
  credentials: mdb-org-owner-credentials
  persistent: true
  externalAccess: {}
  security:
    certsSecretPrefix: cert-prefix
    tls:
      ca: ca-issuer
    authentication:
      enabled: true
      modes: ["SCRAM"]
  mongos:
    clusterSpecList:
      - clusterName: ${K8S_CLUSTER_0_CONTEXT_NAME}
        members: 2
  configSrv:
    clusterSpecList:
      - clusterName: ${K8S_CLUSTER_0_CONTEXT_NAME}
        members: 3 # config server will have 3 members in main cluster
      - clusterName: ${K8S_CLUSTER_1_CONTEXT_NAME}
        members: 1 # config server will have additional non-voting, read-only member in this cluster
        memberConfig:
          - votes: 0
            priority: "0"
  shard:
    clusterSpecList:
      - clusterName: ${K8S_CLUSTER_0_CONTEXT_NAME}
        members: 3 # each shard will have 3 members in this cluster
      - clusterName: ${K8S_CLUSTER_1_CONTEXT_NAME}
        members: 1 # each shard will have additional non-voting, read-only member in this cluster
        memberConfig:
          - votes: 0
            priority: "0"
EOF
Lines changed: 127 additions & 0 deletions
@@ -0,0 +1,127 @@
.. _recover-appdb-forced-reconfig:

======================================================================
Recover the {+appdb+} if its Replica Set Loses Majority
======================================================================

.. facet::
   :name: genre
   :values: tutorial

.. default-domain:: mongodb

.. meta::
   :keywords: forced recovery, application database, majority, primary election
   :description: How to configure the application database in your Kubernetes deployment to recover and elect a primary if it loses majority.

.. contents:: On this page
   :local:
   :backlinks: none
   :depth: 1
   :class: singlecol

If the |k8s| member clusters fail and the {+appdb+} loses the majority
of replica set nodes needed to elect a primary, the |k8s-op-short|
doesn't automatically trigger a forced replica set reconfiguration.
You must manually initiate a forced replica set reconfiguration to restore
the {+appdb+} replica set to a healthy state.

Overview
--------

In certain severe |k8s| cluster outages, your {+appdb+}\'s
replica set deployment could lose the majority of the replica set's nodes.
For example, if you have an {+appdb+} deployment with two nodes
in ``cluster 1`` and three nodes in ``cluster 2``, and ``cluster 2`` undergoes
an outage, your {+appdb+}\'s replica set deployment loses the
node majority needed to elect a primary. Without a primary, the {+mdbagent+}
can't reconfigure the replica set.
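For illustration only, the scenario above could correspond to an
``applicationDatabase`` definition similar to the following sketch. The
cluster names and every field not shown are assumptions, not a complete
``MongoDBOpsManager`` resource.

.. code-block:: yaml

   # Illustrative sketch: cluster names and omitted fields are assumptions.
   spec:
     applicationDatabase:
       topology: MultiCluster
       clusterSpecList:
         - clusterName: cluster-1   # two application database nodes stay healthy here
           members: 2
         - clusterName: cluster-2   # an outage here removes three of the five voting members
           members: 3

With a layout like this, losing ``cluster 2`` leaves only two of five voting
members, which matches the scenario described above.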
To enable rescheduling the replica set's nodes, the |k8s-op-short| must forcibly
reconfigure the :opsmgr:`Automation Configuration </reference/api/automation-config/>`
for the {+mdbagent+} so that it can deploy replica set nodes on the remaining
healthy member clusters.
To achieve this, the |k8s-op-short| sets
the :opsmgr:`replicaSets[n].force </reference/api/automation-config/automation-config-parameters/#replica-sets>`
flag in the replica set configuration.
The flag instructs the {+mdbagent+} to force the replica set to use the
current (latest) :opsmgr:`Automation Configuration version </reference/api/automation-config/automation-config-parameters/#configuration-version>`.
Using the flag allows the |k8s-op-short| to reconfigure the replica set even
when a primary node isn't elected.
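Conceptually, this corresponds to the replica set entry in the automation
configuration carrying a ``force`` option. The fragment below is a YAML-style
sketch for illustration only; the field names and layout shown here are
assumptions rather than a schema reference.

.. code-block:: yaml

   # Illustration only: the real automation configuration is JSON, and the
   # exact layout of the force option shown here is an assumption.
   replicaSets:
     - _id: <appdb-replica-set-name>
       force:
         currentVersion: <latest-automation-config-version>

The |k8s-op-short| manages this configuration itself; the recovery procedure
below only requires editing the ``MongoDBOpsManager`` custom resource.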
.. important::

   Forced reconfiguration of the {+appdb+} can result in undesired behavior,
   including :manual:`rollback </core/replica-set-rollbacks/#std-label-replica-set-rollbacks>`
   of :manual:`"majority" </reference/write-concern/#mongodb-writeconcern-writeconcern.-majority->`
   committed writes, which could lead to unexpected data loss.

Recover the {+appdb+} through a Forced Reconfiguration
------------------------------------------------------------------

To perform a forced reconfiguration of the {+appdb+}\'s nodes, complete
the following steps. An example of the edited resource follows this procedure.

1. Change the :opsmgrkube:`spec.applicationDatabase.clusterSpecList` configuration
   settings to reconfigure the {+appdb+}\'s deployment on healthy
   |k8s| clusters so that the replica set can form a majority of healthy nodes.

2. Remove failed |k8s| clusters from the :opsmgrkube:`spec.applicationDatabase.clusterSpecList`,
   or scale failed |k8s| member clusters down. This way, the replica set
   doesn't count the {+appdb+}\'s nodes hosted on those clusters
   as voting members of the replica set. For example, with two healthy
   nodes in ``cluster 1`` and a failed ``cluster 2`` containing three nodes,
   you have two healthy nodes out of a total of five replica set members
   (2/5 healthy). Adding one node to ``cluster 1`` only results in a 3/6
   ratio of healthy nodes to replica set members, which is still not a majority.
   To form a replica set majority, you have the following options:

   - Add at least two new replica set nodes to ``cluster 1`` or to a new
     healthy |k8s| cluster. This achieves a majority (4/7), with four healthy
     nodes in a seven-member replica set.
   - Scale a failed |k8s| cluster down to zero nodes, or remove the
     cluster from the :opsmgrkube:`spec.applicationDatabase.clusterSpecList`
     entirely, and add at least one node to ``cluster 1`` so that all 3/3
     nodes in the replica set's StatefulSet are healthy.

3. Add the annotation ``"mongodb.com/v1.forceReconfigure": "true"`` at
   the top level of the ``MongoDBOpsManager`` custom resource and ensure
   that the value ``"true"`` is a string in quotes.

   Based on this annotation, the |k8s-op-short| performs a forced reconfiguration
   of the replica set in the next reconciliation process and scales
   the {+appdb+}\'s replica set nodes according to the changed
   deployment configuration.

   The |k8s-op-short| has no way to determine whether the nodes in a
   failed |k8s| cluster are healthy. Therefore, if the |k8s-op-short|
   can't connect to the failed member |k8s| cluster's API server, it
   ignores that cluster when reconciling the {+appdb+}\'s replica
   set nodes.

   This means that scaling down the {+appdb+} nodes removes the
   failed processes from the replica set configuration.
   If only the API server is down but the replica set's nodes are still
   running, the |k8s-op-short| doesn't remove the Pods from the failed
   |k8s| clusters.

   To indicate that it completed the forced reconfiguration, the |k8s-op-short|
   adds the annotation key ``"mongodb.com/v1.forceReconfigurePerformed"``
   with the current timestamp as the value.

   .. important::

      The |k8s-op-short| performs only one forced reconfiguration of the
      replica set. After the replica set reaches a running state, the
      |k8s-op-short| adds the ``"mongodb.com/v1.forceReconfigurePerformed"``
      annotation to prevent itself from forcing the reconfiguration again
      in the future. Therefore, to re-trigger a new forced reconfiguration
      event, remove one or both of the following annotations from the
      :k8sdocs:`metadata.annotations </concepts/overview/working-with-objects/annotations/>`
      of the ``MongoDBOpsManager`` custom resource:

      - ``"mongodb.com/v1.forceReconfigurePerformed"``
      - ``"mongodb.com/v1.forceReconfigure"``

4. Reapply the configuration for the changed ``MongoDBOpsManager`` custom
   resource so that the |k8s-op-short| can process it.
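The following sketch shows what the edited ``MongoDBOpsManager`` resource from
the earlier example scenario might look like after steps 1 through 3, before
you reapply it in step 4. The resource name, cluster names, and all omitted
fields (including the other required ``spec`` and ``applicationDatabase``
settings) are assumptions for illustration, not a complete or authoritative
configuration.

.. code-block:: yaml

   apiVersion: mongodb.com/v1
   kind: MongoDBOpsManager
   metadata:
     name: ops-manager                      # assumed resource name
     annotations:
       # Step 3: triggers one forced reconfiguration on the next reconciliation.
       # The value must be the quoted string "true".
       "mongodb.com/v1.forceReconfigure": "true"
   spec:
     # ... other Ops Manager settings omitted ...
     applicationDatabase:
       # ... other required applicationDatabase settings omitted ...
       topology: MultiCluster
       clusterSpecList:
         # Steps 1 and 2: the failed cluster-2 entry is removed (or scaled to
         # zero members) and cluster-1 is scaled from two to three members so
         # that all 3/3 replica set members are healthy.
         - clusterName: cluster-1
           members: 3

After editing the resource, reapply it (step 4), for example with
``kubectl apply -f <modified-resource>.yaml`` against the cluster where the
|k8s-op-short| runs, so that the next reconciliation picks up the change.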
