diff --git a/source/applications/map-reduce.txt b/source/applications/map-reduce.txt
index 8ee37896f0f..d2d73b40433 100644
--- a/source/applications/map-reduce.txt
+++ b/source/applications/map-reduce.txt
@@ -169,8 +169,9 @@ Run the first map-reduce operation as follows:
 Subsequent Incremental Map-Reduce
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-Later when, the ``sessions`` collection grows, by adding the following
-documents, you can run additional map-reduce operations:
+Later as the ``sessions`` collection grows, you can run additional
+map-reduce operations. For example, add new documents to the
+``sessions`` collection:
 
 .. code-block:: javascript
 
@@ -208,6 +209,50 @@ periodically with the same target collection name without affecting
 the intermediate states. Use this mode when generating statistical
 output collections on a regular basis.
 
+.. _map-reduce-concurrency:
+
+Concurrency
+-----------
+
+The map-reduce operation is composed of many tasks, including:
+
+- reads from the input collection,
+
+- executions of the ``map`` function,
+
+- executions of the ``reduce`` function,
+
+- writes to the output collection.
+
+These various tasks take the following locks:
+
+- The read phase takes a read lock. It yields every 100 documents.
+
+- The JavaScript code (i.e. ``map``, ``reduce``, ``finalize``
+  functions) is executed in a single thread, taking a JavaScript lock;
+  however, most JavaScript tasks in map-reduce are very short and
+  yield the lock frequently.
+
+- The insert into the temporary collection takes a write lock for a
+  single write.
+
+  If the output collection does not exist, the creation of the output
+  collection takes a write lock.
+
+  If the output collection exists, then the output actions (i.e.
+  ``merge``, ``replace``, ``reduce``) take a write lock.
+
+Although single-threaded, the map-reduce tasks interleave and appear to
+run in parallel.
+
+.. note::
+
+   The final write lock during post-processing makes the results appear
+   atomically. However, output actions ``merge`` and ``reduce`` may
+   take minutes to process. For the ``merge`` and ``reduce``, the
+   ``nonAtomic`` flag is available. See the
+   :method:`db.collection.mapReduce()` reference for more information.
+
 .. _map-reduce-sharded-cluster:
 
 Sharded Cluster
@@ -271,10 +316,10 @@ In MongoDB 2.0:
 
 .. warning::
 
-   For best results only use the sharded output options for
+   For best results, only use the sharded output options for
    :dbcommand:`mapReduce` in version 2.2 or later.
 
-Troubleshooting Map Reduce Operations
+Troubleshooting Map-Reduce Operations
 -------------------------------------
 
 You can troubleshoot the ``map`` function and the ``reduce`` function
diff --git a/source/includes/parameters-map-reduce.rst b/source/includes/parameters-map-reduce.rst
index ed946a0801c..7996797f499 100644
--- a/source/includes/parameters-map-reduce.rst
+++ b/source/includes/parameters-map-reduce.rst
@@ -182,12 +182,13 @@
         .. versionadded:: 2.1
 
         Optional. Specify output operation as non-atomic and is
-        valid *only* for ``merge`` and ``reduce`` output modes.
+        valid *only* for ``merge`` and ``reduce`` output modes which
+        may take minutes to execute.
 
         If ``nonAtomic`` is ``true``, the post-processing step will
         prevent MongoDB from locking the database; however, other
         clients will be able to read intermediate states of the
-        output database. Otherwise the map reduce operation must
+        output collection. Otherwise the map reduce operation must
         lock the database during post-processing.
 
   - **Output inline**. Perform the map-reduce operation in memory