diff --git a/.ext/mongodb_domain.py b/.ext/mongodb_domain.py index 980e3cdcf75..c6899769e2b 100644 --- a/.ext/mongodb_domain.py +++ b/.ext/mongodb_domain.py @@ -124,6 +124,8 @@ def get_index_text(self, objectname, name_obj): return _('%s (shell output)') % (name) elif self.objtype == 'function': return _('%s (shell method)') % (name) + elif self.objtype == 'readmode': + return _('%s (read preference mode)') % (name) return '' def run(self): @@ -195,6 +197,7 @@ class MongoDBDomain(Domain): 'setting': ObjType(l_('setting'), 'setting'), 'status': ObjType(l_('status'), 'status'), 'stats': ObjType(l_('stats'), 'stats'), + 'readmode': ObjType(l_('readmode'), 'readmode'), 'function': ObjType(l_('function'), 'func'), 'data': ObjType(l_('data'), 'data'), 'aggregator': ObjType(l_('aggregator'), 'aggregator'), @@ -209,6 +212,7 @@ class MongoDBDomain(Domain): 'setting': MongoDBCallable, 'status': MongoDBCallable, 'stats': MongoDBCallable, + 'readmode': MongoDBCallable, 'function': MongoDBCallableComplex, 'data': MongoDBCallable, 'aggregator': MongoDBCallable, @@ -222,6 +226,7 @@ class MongoDBDomain(Domain): 'setting': MongoDBXRefRole(), 'status': MongoDBXRefRole(), 'stats': MongoDBXRefRole(), + 'readmode': MongoDBXRefRole(), 'func': MongoDBXRefRole(), 'data': MongoDBXRefRole(), 'aggregator': MongoDBXRefRole(), diff --git a/source/applications/replication.txt b/source/applications/replication.txt index 6e7ce9d5f06..e329ea30760 100644 --- a/source/applications/replication.txt +++ b/source/applications/replication.txt @@ -34,7 +34,7 @@ issues a :dbcommand:`getLastError` command following write operations to ensure that they succeed. "Safe mode," provides confirmation of write operations to clients, which is often the expected method of operation, and is particularly useful when -using standalone nodes. +using standalone instances. However, safe writes can take longer to return and are not required in all applications. Using the ``w: "majority"`` option for @@ -42,7 +42,7 @@ all applications. Using the ``w: "majority"`` option for return only after a write operation has replicated to a majority of the members of the set. At the :program:`mongo` shell, use the following command to ensure that writes have propagated to a majority -of the nodes in the replica set: +of the replica set members: .. code-block:: javascript @@ -50,7 +50,7 @@ of the nodes in the replica set: db.getLastError("majority") You may also specify ``w: 2`` so that the write operation replicates -to a second node before the command returns. +to a second member before the command returns. .. note:: @@ -70,7 +70,7 @@ replica set configuration. For instance: When the new configuration is active, the effect of the :dbcommand:`getLastError` operation will wait until the write -operation has succeeded on a majority of the nodes before writing. By +operation has succeeded on a majority of the set members before writing. By specifying ``fsync: false`` and ``j: true`` a successful commit of the operation to the journal is all that :dbcommand:`getLastError` requires to return succesullly, rather than a full flush to disk. Use this the @@ -88,32 +88,40 @@ arguments. Read Preference --------------- -By default, applications direct queries (i.e. read operations) to the :term:`primary` member in -a :term:`replica set`. To distribute reads to :term:`secondary` members, issue the -following command in the :program:`mongo` shell to enable secondary -reads: - -.. code-block:: javascript - - rs.slaveOk() - -MongoDB :term:`drivers ` allow client applications to configure a -:term:`read preference` on a per-connection or per-operation -basis. See ":func:`rs.slaveOk()`" for more information about secondary -read operations in the :program:`mongo` shell, as well as the -appropriate :ref:`driver` API documentation for more information about read -preference configuration. - -MongoDB does not guarantee that secondary members will be consistent with the state -of the primary member in a replica set. As a result, setting a :term:`read preference` -that allows reading from secondary members, accepts the possibility of -:term:`eventually consistent ` read -operations. Do not allow secondary reads, unless you can accept -eventual consistency for these operations. - -In many cases, your application will require the :term:`strict consistency` -that primary reads provide. However, typically secondary reads are -useful for: +Read preference describes how MongoDB clients route read operations to +:term:`secondary` members of a :term:`replica set`. The +:ref:`background ` section +introduces read preference, while the :ref:`mode semantics +` and :ref:`behavior +` sections address the way that +MongoDB clients and drivers can handle read preferences. Finally, the +:ref:`tag sets ` section +outlines read preference tagging for custom read preference +configuration, which provides a component of "data center awareness." + +.. index:: read preference; background +.. _replica-set-read-preference-background: + +Background +~~~~~~~~~~ + +By default, applications direct read operations to the +:term:`primary` member in a :term:`replica set`. When reading from the +the primary, all read operations reflect the latest version of the document. +However, you can improve read +throughput by distributing some or all reads to secondary members of +the replica set as needed. While secondary reads may return stale data +this is acceptable for applications that do not require fully up to date data. + +Reads from secondaries may be stale MongoDB can +not guarantee that secondary members will reflect the state +of the primary, which receives all incoming write operations. Do not +allow secondary reads, unless your application can handle receiving +stale data. + +In many cases, your application will require the most current document +that primary reads provide. However, the following use cases outline +several potetial use cases for other read preferences: - running systems operations including backups and reports without impacting the front-end application. @@ -123,24 +131,410 @@ useful for: than the primary or the rest of the set, you may see better performance for that application if you use secondary reads. -Secondary reads also provide graceful degradation in -:ref:`failover ` situations. During failover, replica sets may take -10 seconds or more to new primary member. Because there is no primary, -applications will not be unable to perform primary reads during this period. -With read preference set to read from secondary -members, your application can continue read operations using the -remaining secondary members. - -Read preferences only control the consistency of query results from a -replica set. Use appropriate :ref:`write concern ` policies to ensure -proper data replication and constancy. +- providing graceful degradation in :ref:`failover + ` situations, when a set can have *no* primary + for 10 seconds or more. If a set does not have a primary, + applications with :readmode:`primary` cannot perform reads. Use the + :readmode:`primaryPrefered` read preference in this situation. + +MongoDB :term:`drivers ` allow client applications to +configure a :term:`read preference` on a per-connection, per-collection or +per-operation basis. See the :func:`rs.slaveOk()` method for more +information about secondary read operations in the :program:`mongo` +shell, as well as the appropriate :ref:`driver` API documentation for +more information about read preference configuration. The +:ref:`semantics ` and +:ref:`behavior ` sections also +address read preference configuration and operations. .. note:: - Read preference have no affect on write performance or write - operations. If a large portion of your applications traffic are - read operations, distributing reads to secondary members may - provide some performance improvement. However, in most cases - :doc:`sharding ` will provide better support for - larger scale operations, because shard clusters can distribute read - and write operations across a group of machines. + Read preferences affect how the application selects a member of the + replica set to use for read operations. As a result read + preferences dictate if the application will receive stale or + current data from MongoDB. Use appropriate :ref:`write concern + ` policies to ensure proper data + replication and constancy. + + If read operations account for a large percentage of your + application's traffic, distributing reads to secondary members may + improve read throughput. However, in most cases :doc:`sharding + ` will provide better support for larger scale + operations, because shard clusters can distribute read and write + operations across a group of machines. + +.. index:: read preference; semantics +.. _replica-set-read-preference-semantics: +.. index:: read preference; modes +.. _replica-set-read-preference-modes: + +Read Preference Modes +~~~~~~~~~~~~~~~~~~~~~ + +.. versionadded:: 2.2 + +All MongoDB drivers :doc:`drivers ` support the +following read preference modes. These semantics make it possible to +specify read preference on a per-collection or per-operation +basis. The member of the :term:`replica set` that the client reads from +can affect how current or stale the result set is. See the :ref:`read +preference background ` and +:ref:`read preference behavior ` +for more information. Also see the :api:`documentation for your driver +<>` for specific read preference use. :program:`mongos` also supports +all read-preference modes in its connections to the replica sets that +provide each :term:`shard` in a :term:`shard cluster`. + +All MongoDB drivers provide five read +preference modes, which are constants set in the drivers +themselves. While the names are the same, the exact syntax depends the idioms +of the host language. In the :program:`mongo` shell, the +:func:`readPreference() ` cursor method +provides access to read preferences, which have the following names. + +.. readmode:: primary + + With :readmode:`primary`, all read operations from the client will + use the :term:`primary` member only. + + This is the default read preference. + + If the primary is unavailable, all operations with this preference + produce an error or throw an exception. :readmode:`primary` read + preference modes are not compatible with read preferences modes + that use :ref:`tag sets ` If + you specify a tag set with :readmode:`primary`, the driver will + produce an error. + + The :readmode:`primary` mode sacrifices availability for + consistency, in terms of the :term:`CAP Theorm`. + +.. readmode:: primaryPrefered + + With the :readmode:`primaryPrefered` read preference mode, + operations will read from the :term:`primary` member of the set in + most situations. However, if the primary is unavailable, as is the + case during :term:`failover` situations, then these read operations + can read from secondary members. + + When the read preference includes a :ref:`tag set `, + the client will first read from the primary, if it is available, and + then from :term:`secondaries ` that match the specified + tags. If there are no secondaries with tags that match the + specified tags, this read operation will produce an error. + + The :readmode:`primaryPrefered` mode sacrifices consistency for + greater availability, in terms of the :term:`CAP Theorm`. + +.. readmode:: secondary + + With the :readmode:`secondary` read preference mode, operations + will read from the :term:`secondary` member of the set if available. + However, if there are no secondaries available, then these + operations will produce an error or exception. + + Most sets have at least one secondary, but there are situations + where there may not be an available secondary. For example, a set + with a primary, a secondary, and an :term:`arbiter` may not have + any secondaries if a member is ever in recovering mode. + + When the read preference includes a :ref:`tag set `, + the client will attempt to find a secondary members that match the + specified tag set and directs reads to a random secondary from + among the :ref:`nearest group `. + If there are no secondaries with tags that match the specified tag + set, this read operation will produce an error. + + The :readmode:`secondary` mode sacrifices consistency for + greater availability, in terms of the :term:`CAP Theorm`. + +.. readmode:: secondaryPrefered + + With the :readmode:`secondaryPrefered`, operations will read from + :term:`secondary` members, but in situations where the set *only* + has a :term:`primary` instance, the read operation will use the + set's primary. + + When :readmode:`secondaryPrefered` reads from a secondary and the + read preference includes a :ref:`tag set `, + the client will attempt to find a secondary members that match the + specified tag set and directs reads to a random secondary from + among the :ref:`nearest group `. + If there are no secondaries with tags that match the specified tag + set, this read operation will produce an error. + + The :readmode:`secondaryPrefered` mode sacrifices consistency for + greater availability, in terms of the :term:`CAP Theorm`. + +.. readmode:: nearest + + With the :readmode:`nearest`, the driver will read from the + *nearest* member of the :term:`set ` according to + the :ref:`member selection ` + process :readmode:`nearest` read operations will not have any + consideration for the *type* of the set member. Reads in + :readmode:`nearest` mode may read from both primaries and + secondaries. + + Set this mode when you want minimize the effect of network latency + on read operations without preference for current or stale data. + + If you specify a :ref:`tag set `, + the client will attempt to find a secondary members that match the + specified tag set and directs reads to a random secondary from + among the :ref:`nearest group `. + + .. note:: + + All operations read from the nearest member of the replica set + that matches the specified read preference mode. + :readmode:`nearest` prefers low latency reads over a + member's :term:`primary` or :term:`secondary` status. + + .. For I/O-bound users who want to distribute reads across all + members evenly regardless of ping time, set + secondaryAcceptableLatencyMS very high. + +.. The :func:`readPreference() ` reference + above will error until DOCS-364 is complete. + +.. index:: tag sets +.. index:: read preference; tag sets +.. _replica-set-read-preference-tag-sets: + +Tag Sets +~~~~~~~~ + +Tag sets allow you to specify custom :ref:`read preferences +`, so that your application can target +read operations to specific members based on custom parameters. Tag +sets make it possible to ensure that read operations target members of +the set in a particular data center, or :program:`mongod` instances +designated for a particular class of operations, such as reporting or +analytics. The :doc:`/reference/replica-configuration` document +contains a section on :ref:`tag set configuration +`. + +Read preference :ref:`modes ` interact +with tag sets, as specified in the documentation of each read +preference mode. The :readmode:`primary` read preference mode is +incompatible with tag sets, but you may specify a tag set with one of +each of the following read preference modes: + +- :readmode:`primaryPrefered` +- :readmode:`secondary` +- :readmode:`secondaryPrefered` +- :readmode:`nearest` + +Tags only apply when :ref:`selecting ` +a :term:`secondary` member of the set, *except* for the +:readmode:`nearest` mode. See the documentation of each read preference mode for more +information on the interaction of the read preference mode and tag +sets. All interfaces use the same :ref:`member selection logic +` to choose a +member to direct read operations to based on read preference mode and +tag sets. + +.. index:: read preference; behavior +.. _replica-set-read-preference-behavior: + +Behavior +~~~~~~~~ + +.. versionchanged:: 2.2 + +.. _replica-set-read-preference-behavior-retry: + +Auto-Retry +`````````` + +Connection between MongoDB drivers and :program:`mongod` instances in +a :term:`replica set` must balance two concerns: + +#. The client should attempt to prefer current results and any + connection should read from the same member of the replica set as + much as possible. + +#. The client should minimize the amount of time that the database is + inaccessible as the result of a connection issue, networking + problem, or :term:`failover` in a replica set. + +As a result, MongoDB drivers and :program:`mongos` will: + +- reuse a connection to specific :program:`mongod` for as long as + possible after establishing a connection to that instance. This + connection is *pinned* to this :program:`mongod`. + +- attempt to reconnect to a new member, obeying existing :ref:`read + preference mode ` when it lose a + connection to its :program:`mongod`. + + These reconnections are transparent to the application itself. If + the connection permits reads from :term:`secondary` members, after + reconnecting, the application can receive two sequential reads + returning from different secondaries. Depending on the state of the + individual secondary member's replication, the documents can reflect + the state of your database at different moments. + +- return an error *only* after attempting to connect to three members + of the set that match the :term:`read preference mode ` + and :ref:`tag set `. + If there are fewer than three members of the set, the + client will error after connecting to all existing members of the + set. + + After this error, the driver will select a new member using the + specified read preference mode, or in absence of a specified read + preference :readmode:`PRIMARY`. + +- after detecting a failover situation, [#fn-failover]_ the driver will + attempt to refresh its state of the replica set as quickly as + possible. + +.. [#fn-failover] When a :term:`failover` occurs, all members of the set + closes all client connections producing a socket error in the + driver. This behavior prevents or minimized :term:`rollback`. + +.. _replica-set-read-preference-behavior-requests: + +Request Length +`````````````` + +MongoDB assumes that connections between the client and database are +long-lived and that the client will reuse a single connection and +corresponding thread for many operations. When you set a :ref:`read +preference mode `, that mode will +persist for all operations that use this thread. A read preference +mode will persist on a per-connection basis until: + +- the application sets a new read preference mode, in a new + operation. + + This behavior permits clients to set read preferences on a per + operation basis. + +- the application (i.e. driver) connection thread goes away, as + the result of normal application processes. + + Typically this triggers a :ref:`retry + `, which may be + transparent to the application. + +- the client receives a socket exception, as is the case when + there's a connection error, or when the :program:`mongod` closes + connections during a :term:`failover`. + +As a result, unless you explicitly set read preference modes on your +connections or operations, read operations may run with unexpected +read preference modes. + +.. index:: read preference; ping time +.. index:: read preference; nearest +.. index:: read preference; member selection +.. _replica-set-read-preference-behavior-ping-time: +.. _replica-set-read-preference-behavior-nearest: +.. _replica-set-read-preference-behavior-member-selection: + +Member Selection +```````````````` + +Clients by way of their drivers, and :program:`mongos` instances for +shard clusters, send periodic "ping," messages to all member of the +replica set to determine latency from the application to each +:program:`mongod` instance. + +For any operation that will target a member *other* than the +:term:`primary`, the driver will: + +#. assemble a list of suitable members, taking account member type + (i.e. secondary, primary, or all members.) + +#. determine which of these suitable members is the closest to the + client in absolute terms. + +.. defined fix + +#. build a list of members that are within a defined ping distance (in + milliseconds) of the "absolute nearest" member. [#acceptable-secondary-latency]_ + +#. select a member to perform the read operation on from these hosts + at random. + +Once the application selects a member of the set to use for read +operations, the driver will continue to use this connection for read +preference until the application specifies a new read preference or +something interrupts the connection. See ":ref:`replica-set-read-preference-behavior-requests`" +for more information. + +.. [#acceptable-secondary-latency] Applications can configure the + threshold used in this stage. The default "acceptable latency" is + 15 milliseconds. For :program:`mongos` you can use the + :option:`--localThreshold ` or + :setting:`localThreshold` runtime options to set this value. + +.. index:: read preference; sharding +.. index:: read preference; mongos +.. _replica-set-read-preference-behavior-sharding: +.. _replica-set-read-preference-behavior-mongos: + +Sharding and ``mongos`` +``````````````````````` + +.. versionchanged:: 2.2 + Before version 2.2, :program:`mongos` did not support the + :ref:`read preference mode semantics `. + +In most :term:`shard clusters `, a :term:`replica set` +provides each shard, where read preferences are also applicable. Read +operations in a shard cluster, with regard to read preference, are +identical to unsharded replica sets. + +Unlike simple replica sets, in shard clusters, all interactions with +the shards pass from the clients to the :program:`mongos` instances +that are actually connected to the set members. :program:`mongos` is +responsible for the application of the read preferences, which is +transparent to applications. + +There are no configuration changes required for full support of read +preference modes in sharded environments, as long as the +:program:`mongos` is at least version 2.2. All :program:`mongos` +maintain their own connection pool to the replica set members, as a +result: + +- a request without a specified preference will have + :readmode:`primary`, the default, unless, the :program:`mongos` + reuses and existing connection that had a different mode set. + + Always explicitly set your read preference mode to prevent + confusion. + +- all :readmode:`nearest` and latency calculations reflect the + connection between the :program:`mongos` and the :program:`mongod` + instances, not the client and the :program:`mongod` instances. + + This produces the desired result, because all results must pass + through the :program:`mongos` before returning to the client. + +Database Commands +````````````````` + +Because some :term:`database commands ` read and +return data from the database, all of the official drivers support +full :`read preference mode semantics +` for the following commands: + +- :dbcommand:`group` +- :dbcommand:`mapReduce` [#inline-map-reduce]_ +- :dbcommand:`aggregate` +- :dbcommand:`collStats` +- :dbcommand:`dbStats` +- :dbcommand:`count` +- :dbcommand:`distinct` +- :dbcommand:`geoNear` +- :dbcommand:`geoSearch` +- :dbcommand:`geoWalk` + +.. [#inline-map-reduce] Only "inline" :dbcommand:`mapReduce` + operations that do not write data support read preference, + otherwise these operations must run on the :term:`primary` + members. diff --git a/source/reference/configuration-options.txt b/source/reference/configuration-options.txt index 0bdb9d3050c..09f551f5607 100644 --- a/source/reference/configuration-options.txt +++ b/source/reference/configuration-options.txt @@ -464,8 +464,8 @@ Settings Use the :option:`mongod --repair` option to access this functionality. - - .. note:: + + .. note:: Because :program:`mongod` rewrites all of the database files during the repair routine, if you do not run :setting:`repair` @@ -724,3 +724,36 @@ Sharding Cluster Options later, the new value will have no effect. See the ":ref:`sharding-balancing-modify-chunk-size`" procedure if you need to change the chunk size on an existing shard cluster. + +.. setting:: localThreshold + + .. versionadded:: 2.2 + + :setting:`localThreshold` affects the logic that program:`mongos` + uses when selecting :term:`replica set` members to pass reads + operations to from clients. Specify a value to + :setting:`localThreshold` in milliseconds. The default value is + ``15``, which corresponds to the default value in all of the client + :doc:`drivers `. + + This setting only affects :program:`mongos` processes. + + When :program:`mongos` receives a request that permits reads to + :term:`secondary` members, the :program:`mongos` will: + + - find the nearest suitable member of the set, in terms of ping time. + + - construct a list of replica set members that is within a ping + time of 15 milliseconds of the nearest suitable member of the + set. + + If you specify a value for :setting:`localThreshold`, + :program:`mongos` will construct the list of replica members + that are within the latency allowed by this value. + + - The :program:`mongos` will select a member to read from at + random from this list. + + See the :ref:`replica-set-read-preference-behavior-member-selection` + section of the :ref:`read preference ` + documentation for more information. diff --git a/source/reference/glossary.txt b/source/reference/glossary.txt index 463bcd98607..1420d1c11b4 100644 --- a/source/reference/glossary.txt +++ b/source/reference/glossary.txt @@ -814,3 +814,9 @@ Glossary Geohash A value is a binary representation of the location on a coordinate grid. + + CAP Theorm + Given three properties of computing systems, consistency, + availability, and partition tolerance, a distributed computing + system can provide any two of these features, but never all + three. diff --git a/source/reference/mongos.txt b/source/reference/mongos.txt index 14447dfaa11..1d3a77be5c5 100644 --- a/source/reference/mongos.txt +++ b/source/reference/mongos.txt @@ -227,3 +227,34 @@ Options .. versionadded:: 2.1.2 Disables the HTTP interface. + +.. option:: --localThreshold + + .. versionadded:: 2.2 + + :option:`--localThreshold` affects the logic that program:`mongos` + uses when selecting :term:`replica set` members to pass reads + operations to from clients. Specify a value to + :option:`--localThreshold` in milliseconds. The default value is + ``15``, which corresponds to the default value in all of the client + :doc:`drivers `. + + When :program:`mongos` receives a request that permits reads to + :term:`secondary` members, the :program:`mongos` will: + + - find the nearest suitable member of the set, in terms of ping time. + + - construct a list of replica set members that is within a ping + time of 15 milliseconds of the nearest suitable member of the + set. + + If you specify a value for :option:`--localThreshold`, + :program:`mongos` will construct the list of replica members + that are within the latency allowed by this value. + + - The :program:`mongos` will select a member to read from at + random from this list. + + See the :ref:`replica-set-read-preference-behavior-member-selection` + section of the :ref:`read preference ` + documentation for more information. diff --git a/source/reference/replica-configuration.txt b/source/reference/replica-configuration.txt index 7f0758f9b9a..53618a92ca2 100644 --- a/source/reference/replica-configuration.txt +++ b/source/reference/replica-configuration.txt @@ -249,8 +249,8 @@ replica set configuration document. Square brackets (e.g. ``[`` and .. _replica-set-reconfiguration-usage: -Usage ------ +Use +--- Most modifications of replica set configuration use the :program:`mongo` shell. Consider the following example: @@ -303,3 +303,91 @@ use the following form: disconnect. This is by design. While this typically takes 10-20 seconds, attempt to make these changes during scheduled maintenance periods. + +.. index:: replica set; tag sets +.. index:: read preference; tag sets +.. index:: tag sets; configuration +.. _replica-set-configuration-tag-sets: + +Tag Sets +-------- + +Tag sets provide custom and configurable :ref:`write concern +` and :ref:`read preferences +`. This section will outline the process +for specifying tags for a replica set, for more information see the +full documentation of the behavior of :ref:`tags sets write concern +` and :ref:`tag sets for read preference +`. + +Configure tag sets by adding fields and values to the document stored +in the :data:`members[n].tags`. Consider the following example: + +.. example:: + + Given the following replica set configuration: + + .. code-block:: javascript + + { + "_id" : "rs0", + "version" : 1, + "members" : [ + { + "_id" : 0, + "host" : "mongodb0.example.net:27017", + }, + { + "_id" : 1, + "host" : "mongodb1.example.net:27017" + }, + { + "_id" : 2, + "host" : "mongodb2.example.net:27017", + } + ] + } + + And the following tag set: + + .. code-block:: javascript + + { "dc": "east", "use": "reporting" } + + You could add the tag set, to the ``members[1]`` member of the + replica set, with the following command sequence in the + :program:`mongo` shell: + + .. code-block:: javascript + + conf = rs.conf() + conf.members[1].tags = { "dc": "east", "use": "reporting" } + rs.reconfig(conf) + + After this operation the output of :func:`rs.conf()`, would + resemble the following: + + .. code-block:: javascript + + { + "_id" : "rs0", + "version" : 2, + "members" : [ + { + "_id" : 0, + "host" : "mongodb0.example.net:27017", + }, + { + "_id" : 1, + "host" : "mongodb1.example.net:27017" + "tags" : { + "dc": "east", + "use": "reporting" + } + }, + { + "_id" : 2, + "host" : "mongodb2.example.net:27017", + } + ] + }