From 748081b72c472d74a00c95bbfb239478f2d58c0a Mon Sep 17 00:00:00 2001 From: Bob Grabar Date: Thu, 25 Oct 2012 17:32:38 -0400 Subject: [PATCH 1/5] DOCS-249 write operations: first draft --- draft/core/write-operations.txt | 223 +++++++++++++++++++++---- source/administration/replica-sets.txt | 12 ++ source/administration/sharding.txt | 8 + 3 files changed, 214 insertions(+), 29 deletions(-) diff --git a/draft/core/write-operations.txt b/draft/core/write-operations.txt index 6058d6ca7f1..007454083fe 100644 --- a/draft/core/write-operations.txt +++ b/draft/core/write-operations.txt @@ -4,23 +4,29 @@ Write Operations .. default-domain:: mongodb -Synopsis --------- +Write operations create, update, and delete data in MongoDB databases. +MongoDB databases store data as :term:`documents ` in +:term:`collections `. + +This section of the manual describes how MongoDB performs write +operations and how different factors affect the efficiency of those +operations. + +.. index:: write operators +.. _write-operations-operators: -Operations ----------- +Write Operators +--------------- -The :doc:`/crud` section of this manual contains specific -documentation for the major classes of write operations for MongoDB -databases. Read the following pages for additional examples and -documentation: +For information on write operators and how to write data to a MongoDB +database, see the following: -:doc:`/applications/create` -:doc:`/applications/delete` -:doc:`/applications/update` +- :doc:`/applications/create` +- :doc:`/applications/update` +- :doc:`/applications/delete` -Also consider the following methods in the :program:`mongo` JavaScript -shell that allow you to write or change data in a MongoDB database. +For information specific :program:`mongo` shell methods used in write +operations, see the following: - :method:`db.collection.insert()` - :method:`db.collection.update()` @@ -29,43 +35,202 @@ shell that allow you to write or change data in a MongoDB database. 
- :method:`db.collection.remove()` - :method:`db.collection.delete()` -Consider the documentation for your client library or :doc:`driver -` for more information on how to access this -functionality from within your application. +For information on how to perform write operations from within an +application, see the :doc:`/applications/drivers` documentation or the +documentation for your client library. + +.. index:: write concern +.. _write-operations-write-concern: + +Write Concern +------------- + +.. DONE The information on the replica-set page has now been split between + this page and the replica-set page. + +:term:`Write concern ` confirms the success of write +operations to MongoDB databases by returning an object indicating +operational success. Beginning with version 2.4, the :program:`mongo` +shell enables write concern by default. In previous versions the shell +disabled write concern by default. + +Many drivers also enable write concern by default. For information on +your driver, see the :doc:`/applications/drivers` documentation. + +.. todo add note about all drivers after `date` will have w:1 write + concern for all operations by default. + +Write concern issues the :dbcommand:`getLastError` command after write +operations, and the command returns an object with success information. +The returned object's ``err`` field contains either a value of ``null`` +to indicate no errors, in which case the write operations have completed +successfully, or contains a description of the last error encountered. + +A successful write operation means the :program:`mongod` instance +received the write operation and has committed the operation to the +in-memory representation of the database. This provides a simple and +low-latency level of write concern and will allow your application to +detect situations where the :program:`mongod` instance becomes +inaccessible or insertion errors caused by :ref:`duplicate key errors +`. 
+ +You can modify the level of write concern returned by issuing the +:dbcommand:`getLastError` command with one or both of following options: + +- ``j`` or "journal" option + + In addition to the default confirmation, this option confirms that the + :program:`mongod` instance has written the data to the on-disk + journal. This ensures the data is durable if :program:`mongod` or the + server itself crashes or shuts down unexpectedly. + +- ``w`` option + + This option is used either to configure write concern on members of + :term:`replica sets ` or to disable write concern. By + default, the ``w`` option is set to ``1``, which enables write + concern on a single :program:`mongod` instance. + + In the case of replica sets, the value of ``1`` enables write concern + on the :term:`primary` only. To configure write concern to confirm + that writes have replicated to a specified number of replica set + members, see :ref:`Write Concern for Replica Sets + `. + + To disable write concern, set the ``w`` option to ``0``, as shown in + the following example: + + .. code-block:: javascript -Write Concern and Write Safety ------------------------------- + db.runCommand( { getLastError: 1, w: 0 } ) -.. todo:: import and tweak section from the replica-set page. When we - publish this document we'll have to do a quick deletion/reduction - of the replica-set section, but during the editorial process the - content can be duplicated. + .. note:: Write concern provides confirmation of write operations but also adds + to performance costs. In situations where confirmation is + unnecessary, it can be advantageous to disable write concern. + +.. _write-operations-bulk-insert: Bulk Inserts ------------ -:issue:`SERVER-2395` +Bulk inserts let you insert many documents in a single database call. + +Bulk inserts allow MongoDB to distribute the performance penalty when +performing inserts to a large number of documents at once. 
Bulk inserts +let you pass multiple events to the :method:`insert()` method at once. +All write concern options apply to bulk inserts. + +You perform bulk inserts through your driver. See the +:doc:`/applications/drivers` documentation for your driver for how to do +bulk inserts. + +Beginning with version 2.2, you also can perform bulk inserts through +the :program:`mongo` shell. + +Beginning with version 2.0, you can set the ``ContinueOnError`` flag for +bulk inserts to signal inserts should continue even if one or more from +the batch fails. In that case, if multiple errors occur, only the most +recent is reported by the :dbcommand:`getLastError` command. For a +:term:`sharded collection`, ``ContinueOnError`` is implied and cannot be +disabled. + +If you insert data without write concern, the bulk insert gain might be +insignificant. But if you insert data with write concern configured, +bulk insert can bring significant performance gains by distributing the +penalty over the group of inserts. + +MongoDB is quite fast at a series of singleton inserts. Thus one often +does not need to use this specialized version of insert. + +Bulk inserts are often used with :term:`sharded collections ` and are more effective when the collection is already +populated and MongoDB has already determined the key distribution. For +more information on bulk inserts into sharded collections, see +:ref:`sharding-bulk-inserts`. + +If possible, consider using bulk inserts to insert event data. -.. todo:: import the best content from: http://www.mongodb.org/display/DOCS/Bulk+Inserts sl - split between this section and the sharded clusters section. +For more information, see :ref:`write-operations-sharded-clusters`, +:ref:`sharding-bulk-inserts`, and :doc:`/administration/import-export`. + +.. _write-operations-indexing: Indexing -------- -.. todo:: short section on the impact of indexes and index maintenance - on write operations. 
+After every insert, update, or delete operation, MongoDB updates not +only the collection but also *every* index associated with the +collection. Therefore, every index on a collection adds some amount of +write-performance penalty. + +In general, the performance gains that indexes realize for *read +operations* are worth the insertion penalty. But if your application is +write-heavy, be careful when creating new indexes. + +For more information, see :doc:`/source/applications/indexes`. + +.. _write-operations-isolation: Isolation --------- -- atomicity -- :doc:`/tutorial/perform-two-phase-commits` +All operations inside of a MongoDB document are atomic. An update +operation may modify more than one document at more than one level +(nesting) in a single operation that will either succeed or fail and +cannot leave the document in an in-between state. + +For more information see :doc:`Isolated write operations +` and +:doc:`/tutorial/perform-two-phase-commits`. Architecture ------------ +.. _write-operations-replica-sets: + Replica Sets ~~~~~~~~~~~~ +In :term:`replica sets `, all write operations go to the +set's :term:`primary`. MongoDB applies the write operations to the +primary and then records the operations on the primary's :term:`oplog`. +The :term:`secondary` members then replicate the oplog and apply the +operations to themselves in an asynchronous process. + +If you are performing a large data ingestion or bulk load operation that +requires a large number of writes to the primary, the secondaries might +not be able to read the oplog fast enough to keep up with changes. The +oplog is a :term:`capped collection` and overwrites its oldest entries +when it reaches a certain size. If the secondaries have not yet applied +those entries because a large write operation has prevented them from +reading the oplog, the secondaries will have fallen too far behind to +catch up and will have become stale. 
+ +To prevent this, use :ref:`write concern +` to return write confirmation every +100, 1,000, or other designated number of operations. This provides an +opportunity for secondaries to catch up with the primary. Write concern +can slow the overall progress of write operations but prevents the +secondaries from falling too far behind. + +For more information on replica sets and write operations, see +:ref:`replica-set-write-concern`, :ref:`replica-set-oplog-sizing`, +:ref:`replica-set-oplog`, +:ref:`replica-set-procedure-change-oplog-size`, and +:ref:`replica-set-resync-stale-member`. + +.. _write-operations-sharded-clustsers: + Sharded Clusters ~~~~~~~~~~~~~~~~ + +In a :term:`sharded cluster`, MongoDB directs a given write operation to +a :term:`shard` and then performs the write on a particular +:term:`chunk` on that shard. Shards and chunks are range-based. +:term:`Shard keys ` affect how MongoDB distributes documents +among shards. Choosing the correct shard key can have a great impact on +the performance, capability, and functioning of your database and +cluster. + +For more information, see :doc:`/administration/sharding` and +:ref:`write-operations-bulk-insert`. \ No newline at end of file diff --git a/source/administration/replica-sets.txt b/source/administration/replica-sets.txt index fd97da14d4c..7707d463ee9 100644 --- a/source/administration/replica-sets.txt +++ b/source/administration/replica-sets.txt @@ -558,6 +558,18 @@ the oplog. For a detailed procedure, see .. include:: /includes/procedure-change-oplog-size.rst +.. _replica-set-resync-stale-member: + +Resyncing a Member of a Replica Set +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +When a member's data falls too far behind the :term:`oplog` to catch up, +the member and it's data are considered "stale". A member's data is too +far behind when the oplog on the :term:`primary` has overwritten its +entries before the member has copied them. 
When that occurs, you must +resync the member by removing its data and replacing it with up-to-date +data. + .. _replica-set-security: Replica Set Security diff --git a/source/administration/sharding.txt b/source/administration/sharding.txt index 15c2435042f..aaaa6f67467 100644 --- a/source/administration/sharding.txt +++ b/source/administration/sharding.txt @@ -960,6 +960,14 @@ run this operation from a driver that does not have helper functions: db.settings.update( { _id: "balancer" }, { $set : { stopped: false } } , true ); +.. index:: bulk insert +.. _sharding-bulk-inserts: + +Bulk Inserts and Sharding +------------------------- + +Bulk inserts let you insert many documents in a single database call. + .. index:: config servers; operations .. _sharding-procedure-config-server: From 995d1afff10a4d9f9cf8a76387b839f73f5aa91f Mon Sep 17 00:00:00 2001 From: Bob Grabar Date: Thu, 25 Oct 2012 17:44:22 -0400 Subject: [PATCH 2/5] DOCS-249 minor edits --- draft/core/write-operations.txt | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/draft/core/write-operations.txt b/draft/core/write-operations.txt index 007454083fe..12a1bc7e0c5 100644 --- a/draft/core/write-operations.txt +++ b/draft/core/write-operations.txt @@ -25,7 +25,7 @@ database, see the following: - :doc:`/applications/update` - :doc:`/applications/delete` -For information specific :program:`mongo` shell methods used in write +For information on specific :program:`mongo` shell methods used in write operations, see the following: - :method:`db.collection.insert()` @@ -50,7 +50,7 @@ Write Concern :term:`Write concern ` confirms the success of write operations to MongoDB databases by returning an object indicating -operational success. Beginning with version 2.4, the :program:`mongo` +operational success. Beginning with version 2.2.x, the :program:`mongo` shell enables write concern by default. In previous versions the shell disabled write concern by default. 
@@ -71,7 +71,7 @@ received the write operation and has committed the operation to the in-memory representation of the database. This provides a simple and low-latency level of write concern and will allow your application to detect situations where the :program:`mongod` instance becomes -inaccessible or insertion errors caused by :ref:`duplicate key errors +inaccessible or detect insertion errors caused by :ref:`duplicate key errors `. You can modify the level of write concern returned by issuing the From 452811cabaab80a42e01b662f235d2817066a133 Mon Sep 17 00:00:00 2001 From: Bob Grabar Date: Fri, 26 Oct 2012 14:39:57 -0400 Subject: [PATCH 3/5] DOCS-249 write operations: draft 2 --- draft/core/write-operations.txt | 132 ++++++++++++++++++-------------- 1 file changed, 76 insertions(+), 56 deletions(-) diff --git a/draft/core/write-operations.txt b/draft/core/write-operations.txt index 12a1bc7e0c5..f8b86719b6e 100644 --- a/draft/core/write-operations.txt +++ b/draft/core/write-operations.txt @@ -45,26 +45,23 @@ documentation for your client library. Write Concern ------------- -.. DONE The information on the replica-set page has now been split between - this page and the replica-set page. - :term:`Write concern ` confirms the success of write operations to MongoDB databases by returning an object indicating operational success. Beginning with version 2.2.x, the :program:`mongo` -shell enables write concern by default. In previous versions the shell -disabled write concern by default. +shell and MongoDB drivers enable write concern by default. Prior to +2.2.x, the shell disables write concern by default, while the behavior +for drivers varies. For your driver's behavior, see the +:doc:`/applications/drivers` documentation. -Many drivers also enable write concern by default. For information on -your driver, see the :doc:`/applications/drivers` documentation. .. todo add note about all drivers after `date` will have w:1 write concern for all operations by default. 
Write concern issues the :dbcommand:`getLastError` command after write -operations, and the command returns an object with success information. -The returned object's ``err`` field contains either a value of ``null`` -to indicate no errors, in which case the write operations have completed -successfully, or contains a description of the last error encountered. +operations to return an object with success information. The returned +object's ``err`` field contains either a value of ``null``, which +indicates write operations have completed successfully, or contains a +description of the last error encountered. A successful write operation means the :program:`mongod` instance received the write operation and has committed the operation to the @@ -74,73 +71,85 @@ detect situations where the :program:`mongod` instance becomes inaccessible or detect insertion errors caused by :ref:`duplicate key errors `. +Write concern provides confirmation of write operations but also adds to +performance costs. In situations where confirmation is unnecessary, it +can be advantageous to disable write concern. + You can modify the level of write concern returned by issuing the :dbcommand:`getLastError` command with one or both of following options: - ``j`` or "journal" option - In addition to the default confirmation, this option confirms that the - :program:`mongod` instance has written the data to the on-disk - journal. This ensures the data is durable if :program:`mongod` or the - server itself crashes or shuts down unexpectedly. + This option confirms that the :program:`mongod` instance has written + the data to the on-disk journal and ensures data is not lost if the + :program:`mongod` instance shuts down unexpectedly. Set to ``true`` to + enable, as shown in the following example: + + .. 
code-block:: javascript + + db.runCommand( { getLastError: 1, j: true } ) - ``w`` option This option is used either to configure write concern on members of - :term:`replica sets ` or to disable write concern. By - default, the ``w`` option is set to ``1``, which enables write - concern on a single :program:`mongod` instance. + :term:`replica sets ` *or* to disable write concern + entirely. By default, the ``w`` option is set to ``1``, which enables + write concern on a single :program:`mongod` instance or on the + :term:`primary` in a replica set. - In the case of replica sets, the value of ``1`` enables write concern - on the :term:`primary` only. To configure write concern to confirm - that writes have replicated to a specified number of replica set - members, see :ref:`Write Concern for Replica Sets - `. + The ``w`` option takes the following values: - To disable write concern, set the ``w`` option to ``0``, as shown in - the following example: + - ``-1`` - .. code-block:: javascript + Turns off reporting of network errors. - db.runCommand( { getLastError: 1, w: 0 } ) + - ``0`` - .. note:: Write concern provides confirmation of write operations but also adds - to performance costs. In situations where confirmation is - unnecessary, it can be advantageous to disable write concern. + Disables write concern. -.. _write-operations-bulk-insert: + .. note:: If you disable write concern but enable the journal + option, as shown here: -Bulk Inserts ------------- + .. code-block:: javascript -Bulk inserts let you insert many documents in a single database call. + { getLastError: 1, w: 0, j: true } -Bulk inserts allow MongoDB to distribute the performance penalty when -performing inserts to a large number of documents at once. Bulk inserts -let you pass multiple events to the :method:`insert()` method at once. -All write concern options apply to bulk inserts. + The setting with the ``j`` option prevails. Write concern is + enabled with journaling. 
-You perform bulk inserts through your driver. See the -:doc:`/applications/drivers` documentation for your driver for how to do -bulk inserts. + - ``1`` -Beginning with version 2.2, you also can perform bulk inserts through -the :program:`mongo` shell. + Enables write concern on a standalone :program:`mongod` or the + :term:`primary` in a replica set. -Beginning with version 2.0, you can set the ``ContinueOnError`` flag for -bulk inserts to signal inserts should continue even if one or more from -the batch fails. In that case, if multiple errors occur, only the most -recent is reported by the :dbcommand:`getLastError` command. For a -:term:`sharded collection`, ``ContinueOnError`` is implied and cannot be -disabled. + - *A number greater than 1* + + Confirms that write operations have replicated to the specified + number of replica set members, including the primary. If you set + ``w`` to a number that is greater than the number of set members + that hold data, MongoDB waits for the non-existent members to become + available, which means MongoDB blocks indefinitely. + + - ``majority`` + + Confirms that write operations have replicated to the majority of + set members. + +For more information on write concern and replica sets, see :ref:`Write +Concern for Replica Sets `. + +.. _write-operations-bulk-insert: + +Bulk Inserts +------------ -If you insert data without write concern, the bulk insert gain might be -insignificant. But if you insert data with write concern configured, -bulk insert can bring significant performance gains by distributing the -penalty over the group of inserts. +Bulk inserts let you insert many documents at once in a single database +call by letting you pass multiple documents to a single insert +operation. -MongoDB is quite fast at a series of singleton inserts. Thus one often -does not need to use this specialized version of insert. +Bulk insert can significantly increase performance by distributing +:ref:`write concern ` costs. 
Beginning +in version 2.2.x, write concern is enabled by default. Bulk inserts are often used with :term:`sharded collections ` and are more effective when the collection is already @@ -148,7 +157,18 @@ populated and MongoDB has already determined the key distribution. For more information on bulk inserts into sharded collections, see :ref:`sharding-bulk-inserts`. -If possible, consider using bulk inserts to insert event data. +When performing bulk inserts through a driver, you can use the +``ContinueOnError`` option in your driver's insert command to continue +to insert remaining documents even if an insert fails. This option is +available in MongoDB versions 2.0 and higher. If errors occur, only the +most recent is reported. For a :term:`sharded collection`, +``ContinueOnError`` is implied and cannot be disabled. For details on +performing bulk inserts through your driver, see the +:doc:`/applications/drivers` documentation for your driver. + +Beginning with version 2.2, you can perform bulk inserts through the +:program:`mongo` shell by passing an array of documents to the +:method:`insert() ` method. For more information, see :ref:`write-operations-sharded-clusters`, :ref:`sharding-bulk-inserts`, and :doc:`/administration/import-export`. From d2af78dd6521c6007369e5f3ab4dd6209e9cfdcd Mon Sep 17 00:00:00 2001 From: Bob Grabar Date: Fri, 26 Oct 2012 17:01:40 -0400 Subject: [PATCH 4/5] DOCS-249 removed edit to Shard Admin doc --- draft/core/write-operations.txt | 6 +++++- source/administration/sharding.txt | 8 -------- 2 files changed, 5 insertions(+), 9 deletions(-) diff --git a/draft/core/write-operations.txt b/draft/core/write-operations.txt index f8b86719b6e..9ed72f038e2 100644 --- a/draft/core/write-operations.txt +++ b/draft/core/write-operations.txt @@ -155,7 +155,11 @@ Bulk inserts are often used with :term:`sharded collections ` and are more effective when the collection is already populated and MongoDB has already determined the key distribution. 
For more information on bulk inserts into sharded collections, see -:ref:`sharding-bulk-inserts`. +:doc:`/source/administration/sharding`. + +.. todo Change the above link from :doc:`/source/administration/sharding` + to :ref:`sharding-bulk-inserts` once the Write Operations document + goes live When performing bulk inserts through a driver, you can use the ``ContinueOnError`` option in your driver's insert command to continue diff --git a/source/administration/sharding.txt b/source/administration/sharding.txt index aaaa6f67467..15c2435042f 100644 --- a/source/administration/sharding.txt +++ b/source/administration/sharding.txt @@ -960,14 +960,6 @@ run this operation from a driver that does not have helper functions: db.settings.update( { _id: "balancer" }, { $set : { stopped: false } } , true ); -.. index:: bulk insert -.. _sharding-bulk-inserts: - -Bulk Inserts and Sharding -------------------------- - -Bulk inserts let you insert many documents in a single database call. - .. index:: config servers; operations .. _sharding-procedure-config-server: From 60ba4d569b978b33c56da4b8b0975f07858df3a2 Mon Sep 17 00:00:00 2001 From: Bob Grabar Date: Mon, 29 Oct 2012 13:58:55 -0400 Subject: [PATCH 5/5] DOCS-249 removed edit to Rep Set Admin doc --- draft/core/write-operations.txt | 6 +++++- source/administration/replica-sets.txt | 12 ------------ 2 files changed, 5 insertions(+), 13 deletions(-) diff --git a/draft/core/write-operations.txt b/draft/core/write-operations.txt index 9ed72f038e2..93cfa158693 100644 --- a/draft/core/write-operations.txt +++ b/draft/core/write-operations.txt @@ -241,7 +241,11 @@ For more information on replica sets and write operations, see :ref:`replica-set-write-concern`, :ref:`replica-set-oplog-sizing`, :ref:`replica-set-oplog`, :ref:`replica-set-procedure-change-oplog-size`, and -:ref:`replica-set-resync-stale-member`. + +.. 
todo add this :ref:`replica-set-resync-stale-member` WHEN + the "Resyncing a Member of a Replica Set" topic is added to + source/administration/replica-sets.txt. + (See pull request "DOCS-449 resync stale replica set member") .. _write-operations-sharded-clustsers: diff --git a/source/administration/replica-sets.txt b/source/administration/replica-sets.txt index 7707d463ee9..fd97da14d4c 100644 --- a/source/administration/replica-sets.txt +++ b/source/administration/replica-sets.txt @@ -558,18 +558,6 @@ the oplog. For a detailed procedure, see .. include:: /includes/procedure-change-oplog-size.rst -.. _replica-set-resync-stale-member: - -Resyncing a Member of a Replica Set -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -When a member's data falls too far behind the :term:`oplog` to catch up, -the member and it's data are considered "stale". A member's data is too -far behind when the oplog on the :term:`primary` has overwritten its -entries before the member has copied them. When that occurs, you must -resync the member by removing its data and replacing it with up-to-date -data. - .. _replica-set-security: Replica Set Security