MongoDB provides high availability by having multiple copies of data in replica sets.
Members of the same replica set don't share the same resources. For example, members
of the same replica set don't share the same physical hosts and disks.
|service| satisfies this requirement by default: it deploys nodes in
different availability zones, on different physical hosts and disks.

Ensure that replica sets don't share resources by :ref:`distributing data across data centers <arch-center-distribute-data>`.

Use an Odd Number of Replica Set Members
````````````````````````````````````````

To elect a :manual:`primary </core/replica-set-members>`, you need a majority of :manual:`voting </core/replica-set-elections>` replica set members available. We recommend that you create replica sets with an
odd number of voting replica set members. There is no benefit in having
an even number of voting replica set members. |service| satisfies this
requirement by default, as |service| requires having 3, 5, or 7 nodes.

Fault tolerance is the number of replica set members that can become
unavailable with enough members still available for a primary election.
The fault tolerance of a four-member replica set is the same as for a three-member replica set because both can withstand a single-node
outage.
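
The majority and fault-tolerance arithmetic above can be sketched in a few
lines of Python. This is an illustrative model only; the helper functions
are hypothetical, not part of any MongoDB driver or |service| API:

```python
# Illustrative sketch: majority and fault tolerance of a replica set.
# These helpers are hypothetical, not part of any MongoDB driver or API.

def majority(voting_members: int) -> int:
    """Votes required to elect a primary."""
    return voting_members // 2 + 1

def fault_tolerance(voting_members: int) -> int:
    """Members that can fail while still leaving a majority."""
    return voting_members - majority(voting_members)

# An even number of voting members adds no fault tolerance:
assert fault_tolerance(3) == fault_tolerance(4) == 1
assert fault_tolerance(5) == fault_tolerance(6) == 2
assert fault_tolerance(7) == 3
```

Because a fourth member raises the majority requirement from two votes to
three, it adds cost without adding fault tolerance.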

To learn more about replica set members, see :manual:`Replica Set Members </core/replica-set-members>`. To learn more about replica set elections and
voting, see :manual:`Replica Set Elections </core/replica-set-elections>`.

.. _arch-center-distribute-data:

Distribute Data Across At Least Three Data Centers in Different Availability Zones
``````````````````````````````````````````````````````````````````````````````````

To guarantee that a replica set can elect a primary if a data center
becomes unavailable, you must distribute nodes across at least three
data centers, but we recommend that you use five data centers.

If you choose a region for your data centers that supports
availability zones, you can distribute nodes in data centers in different
availability zones. This way you can have multiple separate physical data
centers, each in its own availability zone and in the same region.

This section aims to illustrate the need for a deployment with five data centers.
To begin, consider deployments with two and three data centers.

Consider the following diagram, which shows data distributed across
two data centers:

.. figure:: /includes/images/two-data-centers.png
   :figwidth: 750px
   :alt: An image showing two data centers: Data Center 1, with a primary and a secondary node, and Data Center 2, with only a secondary node

In the previous diagram, if Data Center 2 becomes unavailable, a majority of replica set members remain available and {+service+} can elect a primary. However, if you lose Data Center 1, you have only one
out of three replica set members available, no majority, and the system degrades into read-only mode.

Consider the following diagram, which shows data distributed across
three data centers:

.. figure:: /includes/images/three-data-centers.png
   :figwidth: 750px
   :alt: An image showing three data centers: Data Center 1, with a primary node, Data Center 2, with a secondary node, and Data Center 3, with a secondary node

When you distribute nodes across three data centers, if one data
center becomes unavailable, you still have two out of three replica set
members available, which maintains a majority to elect a primary.

In addition to ensuring high availability, we recommend that you ensure the continuity of write
operations. For this reason, we recommend that you deploy five data centers
to achieve the 2+2+1 topology required for ``majority`` write concern.
See the following section on :ref:`majority write concern <arch-center-majority-write-concern>` in this topic for detailed explanations of this
requirement.

You can distribute data across at least three data centers within the same region by choosing a region with at least three availability zones. Each
availability zone contains one or more discrete data centers, each with redundant power, networking, and connectivity, often housed in separate
facilities.

{+service+} uses availability zones for all cloud providers
automatically when you deploy a dedicated cluster to a region that supports availability zones. |service| splits the cluster's nodes across
availability zones. For example, for a three-node replica set {+cluster+} deployed to a three-availability-zone region, {+service+} deploys one node
in each zone. A local failure in the data center hosting one node doesn't
impact the operation of data centers hosting the other nodes because MongoDB
performs automatic failover and leader election. Applications automatically
recover in the event of local failures.

We recommend that you deploy replica sets to the following regions because they support at least three availability zones:

When a client connects to a sharded {+cluster+}, we recommend that you include multiple :manual:`mongos </reference/program/mongos/>`
processes, separated by commas, in the connection URI. To learn more,
see :manual:`MongoDB Connection String Examples </reference/connection-string-examples/#self-hosted-replica-set-with-members-on-different-machines>`.
This setup allows operations to route to different ``mongos`` instances
for load balancing, but it is also important for disaster recovery.
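
As a sketch, a client-side connection URI listing multiple ``mongos``
hosts can be assembled as follows. The hostnames and port are
placeholders, not real endpoints:

```python
# Build a connection URI that lists multiple mongos processes,
# separated by commas. Hostnames below are placeholders.
mongos_hosts = [
    "mongos-dc1.example.net:27017",
    "mongos-dc2.example.net:27017",
    "mongos-dc3.example.net:27017",
]
uri = "mongodb://" + ",".join(mongos_hosts) + "/?retryWrites=true"
print(uri)
```

The ``retryWrites=true`` query parameter is the standard connection string
option that enables retryable writes.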

Consider the following diagram, which shows a sharded {+cluster+}
spread across three data centers. The application connects to the {+cluster+} from a remote location. If Data Center 3 becomes unavailable, the application can still connect to the ``mongos``
instances in the remaining data centers.

:alt: An image showing three data centers: Data Center 1, with primary shards and two mongos, Data Center 2, with secondary shards and two mongos, and Data Center 3, with secondary shards and two mongos. The application connects to all six mongos instances.

You can use
:manual:`retryable reads </core/retryable-reads/>` and
:manual:`retryable writes </core/retryable-writes/>` to simplify the required error handling for the ``mongos``
configuration.

.. _arch-center-majority-write-concern:

Use ``majority`` Write Concern
``````````````````````````````

MongoDB allows you to specify the level of acknowledgment requested
for write operations by using :manual:`write concern
</reference/write-concern/>`. For example, if
you have a three-node replica set and a write concern of
``majority``, every write operation must be persisted on
two nodes before an acknowledgment of completion is sent to the driver
that issued the write operation. For the best protection from a regional
node outage, we recommend that you set the write concern to ``majority``.

Even though using ``majority`` write concern increases latency compared
with a write concern of ``1``, we recommend that you use ``majority`` write
concern because it allows write operations to continue even if the
replica set loses its primary.

To understand the importance of ``majority`` write concern, consider a
five-node replica set spread across three separate regions with a 2-2-1
topology (two regions with two nodes and one region with one node),
with a write concern of ``4``. If one of the regions with two nodes becomes
unavailable due to an outage and only three nodes are available, no write
operations complete and the operation hangs because it is unable to persist data on four nodes.
In this scenario, despite the availability of the majority of nodes in the replica set, the database
behaves the same as if a majority of the nodes in the replica set were
unavailable. Using ``majority`` write concern rather than a numeric value prevents this scenario.
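
The 2-2-1 scenario above can be sketched with a small Python simulation.
This is a hypothetical model of acknowledgment counting, not actual
server or driver logic:

```python
# Hypothetical model of write acknowledgment, not actual MongoDB logic.
# A write is acknowledged only if enough nodes are available to satisfy
# the write concern.

def can_acknowledge(available_nodes: int, write_concern, total_nodes: int) -> bool:
    if write_concern == "majority":
        required = total_nodes // 2 + 1  # majority of all members
    else:
        required = write_concern         # fixed numeric write concern
    return available_nodes >= required

# Five-node replica set in a 2-2-1 topology; a two-node region is lost,
# leaving three nodes available:
available = 3
assert not can_acknowledge(available, 4, total_nodes=5)       # w=4 hangs
assert can_acknowledge(available, "majority", total_nodes=5)  # majority (3) succeeds
```

With ``majority``, the required count tracks the replica set size, so any
partition that can still elect a primary can also acknowledge writes.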

Consider Backup Configuration
`````````````````````````````

Frequent data backups are critical for business continuity and disaster recovery. Frequent backups ensure that data loss and downtime are
minimal if a disaster or cyber attack disrupts normal operations.

We recommend that you:

- Set your backup frequency to meet your desired business continuity
  objectives. Continuous backups may be needed for some systems, while less frequent snapshots may be desirable for others.
- Store backups in a different physical location than the
  source data.
- Test your backup recovery process to ensure that you can restore
  backups in a repeatable and timely manner.
- Confirm that your {+clusters+} run the same MongoDB versions for
  compatibility during restore.
- Configure a :atlas:`backup compliance policy
  </backup/cloud-backup/backup-compliance-policy/#std-label-backup-compliance-policy>` to prevent deleting backup
  snapshots, prevent decreasing the snapshot retention time, and more.

For more backup recommendations, see :ref:`arch-center-backups`.

Plan Your Resource Utilization
``````````````````````````````

To avoid resource capacity issues, we recommend that you monitor
resource utilization and hold regular capacity planning sessions.
MongoDB Professional Services offers these sessions.

Over-utilized clusters can fail, causing a disaster.
Scale clusters up to higher tiers if your utilization regularly triggers alerts at a steady state,
such as above 60% utilization for system CPU and system memory.
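
As an illustration of the steady-state guideline, the check below flags a
cluster whose recent samples all exceed the threshold. The helper and the
sampling window are assumptions for this sketch, not an |service| feature:

```python
# Hypothetical helper: flag a cluster for scale-up when a metric stays
# above the threshold across an entire window of samples (steady state),
# as opposed to a brief spike.

def needs_scale_up(samples, threshold=0.60):
    return len(samples) > 0 and min(samples) > threshold

cpu_samples = [0.72, 0.68, 0.81, 0.75]     # steadily above 60%: scale up
memory_samples = [0.40, 0.90, 0.35]        # a spike, not steady state
assert needs_scale_up(cpu_samples)
assert not needs_scale_up(memory_samples)
```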

To view your resource utilization, see :atlas:`Monitor Real-Time Performance </real-time-performance-panel>`. To view metrics with the {+atlas-admin-api+}, see :oas-atlas-tag:`Monitoring and Logs </Monitoring-and-Logs>`.

To learn best practices for alerts and monitoring for resource
utilization, see :ref:`arch-center-monitoring-alerts`.

If you encounter resource capacity issues, see :ref:`arch-center-resource-capacity`.

Plan Your MongoDB Version Changes
`````````````````````````````````

We recommend that you run the latest MongoDB version as it allows you to
take advantage of new features and provides improved security guarantees
compared with previous versions.

Ensure that you perform MongoDB major version upgrades well before your
current version reaches `end of life <https://www.mongodb.com/legal/support-policy/lifecycles>`__.

You can't downgrade your MongoDB version using the {+atlas-ui+}. Because of this,
when planning and executing a major version upgrade, we recommend that you
work directly with MongoDB Professional or Technical Services to help you
avoid any issues that might occur during the upgrade process.