From f691bec21e5dba95ac559ea8d6948812fdc7a820 Mon Sep 17 00:00:00 2001 From: Bob Grabar Date: Wed, 5 Dec 2012 15:02:05 -0500 Subject: [PATCH 1/5] DOCS-665 new optimization page --- source/administration/configuration.txt | 4 +- source/administration/monitoring.txt | 4 +- source/applications/aggregation.txt | 2 + source/applications/optimization.txt | 143 ++++++++++++++++++++++++ source/core/indexes.txt | 3 +- source/faq/indexes.txt | 3 +- source/reference/glossary.txt | 6 +- source/reference/operator/hint.txt | 2 +- 8 files changed, 156 insertions(+), 11 deletions(-) create mode 100644 source/applications/optimization.txt diff --git a/source/administration/configuration.txt b/source/administration/configuration.txt index d90cfbd1490..9d8c0e7f8d9 100644 --- a/source/administration/configuration.txt +++ b/source/administration/configuration.txt @@ -309,11 +309,9 @@ needed: - :setting:`slowms` configures the threshold for the :term:`database profiler` to consider a query "slow." The default value is 100 milliseconds. Set a lower value if the database profiler does not - return useful results. See the ":wiki:`Optimization`" wiki page + return useful results. See :doc:`/applications/optimization` for more information on optimizing operations in MongoDB. - .. STUB ":doc:`/applications/optimization`" - - :setting:`profile` sets the :term:`database profiler` level. The profiler is not active by default because of the possible impact on the profiler itself on performance. Unless this setting diff --git a/source/administration/monitoring.txt b/source/administration/monitoring.txt index 52785930664..94eae9cfc8a 100644 --- a/source/administration/monitoring.txt +++ b/source/administration/monitoring.txt @@ -405,12 +405,10 @@ This returns all operations that lasted longer than 100 milliseconds. Ensure that the value specified here (i.e. ``100``) is above the :setting:`slowms` threshold. -.. seealso:: The :wiki:`Optimization` wiki page addresses strategies +.. seealso:: :doc:`/applications/optimization` addresses strategies that may improve the performance of your database queries and operations. -.. STUB :doc:`/applications/optimization` - .. _replica-set-monitoring: Replication and Monitoring diff --git a/source/applications/aggregation.txt b/source/applications/aggregation.txt index 6dfb01c2767..2beafac692c 100644 --- a/source/applications/aggregation.txt +++ b/source/applications/aggregation.txt @@ -171,6 +171,8 @@ The aggregation operation in the previous section returns a As a document, the result is subject to the :ref:`BSON Document size ` limit, which is currently 16 megabytes. +.. _aggregation-optimize-performance: + Optimizing Performance ---------------------- diff --git a/source/applications/optimization.txt b/source/applications/optimization.txt new file mode 100644 index 00000000000..dcaeefd3b1e --- /dev/null +++ b/source/applications/optimization.txt @@ -0,0 +1,143 @@ +============ +Optimization +============ + +.. default-domain:: mongodb + +This section describes techniques for optimizing database performance. + +.. seealso:: :ref:`aggregation-optimize-performance` + +Use Indexes +----------- + +For a regularly issued query, create an index on the query's fields so +that MongoDB searches the index, not the collection. Searching an index +is much faster than searching a collection. + +For example, if you have a ``posts`` database containing blog posts, and if +you regularly issue a query that sorts on the ``timestamp`` field: + +.. code-block:: javascript + + db.posts.find().sort( { timestamp : -1 } ) + +Then your first optimization should be to create an index on the key +used for the sorting: + +.. code-block:: javascript + + db.posts.ensureIndex( { timestamp : 1 } ) + +With the new index, MongoDB can sort based on index information, rather than by accessing +each document in the collection directly. + +Indexes speed performance for any field that is part of the query +specification, including fields used by :doc:`aggregation operators +`. + +For more information on using indexes to improve performance, see +:ref:`indexes-create-to-match-queries`. + +Limit Results +------------- + +MongoDB :term:`cursors ` return results in groups of multiple +documents. If you know the number of results you want, you can reduce +the demand on network and database resources by issuing the +:method:`cursor.limit()` method. + +For example, if you need only 10 results from your previous query to the +``posts`` database, you would issue the following command: + +.. code-block:: javascript + + db.posts.find().sort( { timestamp : -1 } ).limit(10) + +For more information on limiting results, see :method:`cursor.limit()` + +Use Projections to Return Only Necessary Data +--------------------------------------------- + +When you need only certain fields from documents, you can achieve better +performance by returning only the fields you need: + +For example, if in your previous query to the ``posts`` database, you +need only the ``timestamp``, ``title``, ``author``, and ``abstract`` +fields, you would issue the following command: + +.. code-block:: javascript + + db.posts.find( {}, { timestamp : 1 , title : 1 , author : 1 , abstract : 1} ).sort( { timestamp : -1 } ).limit(10) + +For more information on using projections, see +:ref:`read-operations-projection`. + +Use the Database Profiler to Evaluate Performance +------------------------------------------------- + +.. todo Add link below to Database Profiler when that doc is migrated. + The link below to `database-profiling` is NOT the link to the DP doc. + +MongoDB includes a database profiler that shows performance +characteristics of each operation against the database. Use the profiler +to locate any queries or write operations that are running slow. You can +used this information, for example, to determine what indexes to create. + +For more information, see :ref:`database-profiling`. + +Use :method:`db.currentOp()` to Evaluate Performance +---------------------------------------------------- + +The :method:`db.currentOp()` method reports on current operations +running on a :program:`mongod` instance. For more information, see +:doc:`/reference/current-op`. + +Use :operator:`explain <$explain>` to Evaluate Performance +---------------------------------------------------------- + +Use :method:`explain ` to get more information on the +performance of your queries. + +Use :operator:`hint <$hint>` to Select a Particular Index +--------------------------------------------------------- + +In most cases the :ref:`query optimizer +` selects the best index. But +:method:`hint ` might help when you query on multiple +fields that are indexed in separate indexes. You can :method:`hint +` to select the index to use, potentially improving +performance. + +Use the Increment Operator to Perform Operations on the Server Side +------------------------------------------------------------------- + +Use MongoDB's :operator:`$inc` operator to increment fields. The +operator increments fields on the server side, which can be much faster +than updating a document on the client side. + +Perform Server Side Code Execution +---------------------------------- + +When appropriate, perform an operation on the database server to +eliminate client/server network turnarounds. For more information, see +:wiki:`Server-side+Code+Execution`. + +Use Capped Collections +---------------------- + +:doc:`/core/capped-collections` are circular fixed-size collections that +keep documents well-ordered, even without the use of an index. This +means that capped collections can receive very high-speed reads and +writes. + +These collections are particularly useful for keeping log files but are +not limited to that purpose. Use capped collections where appropriate. + +Use Natural Order +----------------- + +To return documents in the order they exist on disk, use the +:operator:`$natural` operator. :term:`Natural order ` +does not use indexes but can be fast for operations where the first or +last items on disk are required. diff --git a/source/core/indexes.txt b/source/core/indexes.txt index 9d09f7ed419..10d2ffd0fac 100644 --- a/source/core/indexes.txt +++ b/source/core/indexes.txt @@ -40,7 +40,8 @@ MongoDB indexes have the following core features: these representation of the data to optimize query responses. - Every query, including update operations, use one and only one - index. The query optimizer selects the index empirically by + index. The :ref:`query optimizer ` + selects the index empirically by occasionally running alternate query plans and by selecting the plan with the best response time for each query type. You can override the query optimizer using the :method:`cursor.hint()` method. diff --git a/source/faq/indexes.txt b/source/faq/indexes.txt index a6f5ae8f7f8..96b00a07844 100644 --- a/source/faq/indexes.txt +++ b/source/faq/indexes.txt @@ -74,7 +74,8 @@ documentation on choosing which fields to index, see :doc:`/applications/indexes`. .. todo:: FAQ How do I guarantee a query uses an index? - MongoDB's query optimizer always looks for the most advantageous + MongoDB's :ref:`query optimizer ` + always looks for the most advantageous index to use. You cannot guarantee use of a particular index, but you can write indexes with your queries in mind. For detailed documentation on creating optimal indexes, see diff --git a/source/reference/glossary.txt b/source/reference/glossary.txt index 7f3f51818cf..67b5fd35e53 100644 --- a/source/reference/glossary.txt +++ b/source/reference/glossary.txt @@ -878,8 +878,10 @@ Glossary For each query, the MongoDB query optimizer generates a query plan that matches the query to the index that produces the fastest results. The optimizer then uses the query plan each time the - :program:`mongod` receives the query. If a collection changes significantly, the optimizer - creates a new query plan. + :program:`mongod` receives the query. If a collection changes + significantly, the optimizer creates a new query plan. + + .. seealso:: :ref:`read-operations-query-optimization` diagnostic log :program:`mongod` can create a verbose log of operations with diff --git a/source/reference/operator/hint.txt b/source/reference/operator/hint.txt index 2590785e977..be120b270d3 100644 --- a/source/reference/operator/hint.txt +++ b/source/reference/operator/hint.txt @@ -6,7 +6,7 @@ $hint .. operator:: $hint - Use the :operator:`$hint` operator to force the query optimizer to + Use the :operator:`$hint` operator to force the :ref:`query optimizer ` to use a specific index to fulfill the query. Use :operator:`$hint` for testing query performance and indexing strategies. Consider the following form: From 2ee846adc7ea1a2e42a8f882945fee1cff61dc57 Mon Sep 17 00:00:00 2001 From: Bob Grabar Date: Fri, 7 Dec 2012 18:46:00 -0500 Subject: [PATCH 2/5] DOCS-665 review edits to optimization page --- source/applications/optimization.txt | 97 +++++++++++++++++----------- 1 file changed, 60 insertions(+), 37 deletions(-) diff --git a/source/applications/optimization.txt b/source/applications/optimization.txt index dcaeefd3b1e..874b9613002 100644 --- a/source/applications/optimization.txt +++ b/source/applications/optimization.txt @@ -11,44 +11,57 @@ This section describes techniques for optimizing database performance. Use Indexes ----------- -For a regularly issued query, create an index on the query's fields so -that MongoDB searches the index, not the collection. Searching an index -is much faster than searching a collection. +For a regularly issued query, create an index based on the query's +fields so that when you issue the query MongoDB can scan through the +documents on the index instead of those in the collection. Searching an +index is much faster than searching a collection. Documents on an index +are smaller, easier to parse, and already ordered. -For example, if you have a ``posts`` database containing blog posts, and if -you regularly issue a query that sorts on the ``timestamp`` field: +.. example:: If you have a ``posts`` collection containing blog posts, + and if you regularly issue a query that sorts on the ``author_name`` + field, then you can optimize the query by creating an index on the + ``author_name`` field: -.. code-block:: javascript + .. code-block:: javascript - db.posts.find().sort( { timestamp : -1 } ) + db.posts.ensureIndex( { author_name : 1 } ) -Then your first optimization should be to create an index on the key -used for the sorting: +Indexes are pre-sorted, so also improve efficiency on queries that routinely +sort on a given field. -.. code-block:: javascript +.. example:: If in your ``posts`` collection you regularly issue a query + that sorts on the ``timestamp`` field: + + .. code-block:: javascript + + db.posts.find().sort( { timestamp : -1 } ) + + Then you can optimize the query by creating an index on the + ``timestamp`` field: + + .. code-block:: javascript - db.posts.ensureIndex( { timestamp : 1 } ) + db.posts.ensureIndex( { timestamp : 1 } ) -With the new index, MongoDB can sort based on index information, rather than by accessing -each document in the collection directly. +A single query can use only one index. If you commonly query on a +combination of fields, create a :ref:`compound index +`. Indexes speed performance for any field that is part of the query specification, including fields used by :doc:`aggregation operators `. -For more information on using indexes to improve performance, see -:ref:`indexes-create-to-match-queries`. - Limit Results ------------- MongoDB :term:`cursors ` return results in groups of multiple documents. If you know the number of results you want, you can reduce -the demand on network and database resources by issuing the -:method:`cursor.limit()` method. +the demand on network resources by issuing the :method:`cursor.limit()` +method. -For example, if you need only 10 results from your previous query to the -``posts`` database, you would issue the following command: +This is typically used in conjunction with sort operations. For example, +if you need only 10 results from your query to the ``posts`` +database, you would issue the follOwing command: .. code-block:: javascript @@ -59,12 +72,12 @@ For more information on limiting results, see :method:`cursor.limit()` Use Projections to Return Only Necessary Data --------------------------------------------- -When you need only certain fields from documents, you can achieve better +When you need only a subset of fiElds from documents, you can achieve better performance by returning only the fields you need: -For example, if in your previous query to the ``posts`` database, you -need only the ``timestamp``, ``title``, ``author``, and ``abstract`` -fields, you would issue the following command: +For example, if in your query to the ``posts`` database, you need only +the ``timestamp``, ``title``, ``author``, and ``abstract`` fields, you +would issue the following command: .. code-block:: javascript @@ -79,28 +92,31 @@ Use the Database Profiler to Evaluate Performance .. todo Add link below to Database Profiler when that doc is migrated. The link below to `database-profiling` is NOT the link to the DP doc. -MongoDB includes a database profiler that shows performance +MongoDB provides a :doc:`database profiler ` that shows performance characteristics of each operation against the database. Use the profiler to locate any queries or write operations that are running slow. You can used this information, for example, to determine what indexes to create. -For more information, see :ref:`database-profiling`. +For more information, see :doc:`/tutorial/manage-the-database-profiler` +and :ref:`database-profiling`. -Use :method:`db.currentOp()` to Evaluate Performance ----------------------------------------------------- +Use db.currentOp() to Evaluate Performance +------------------------------------------ The :method:`db.currentOp()` method reports on current operations running on a :program:`mongod` instance. For more information, see :doc:`/reference/current-op`. -Use :operator:`explain <$explain>` to Evaluate Performance ----------------------------------------------------------- +Use $explain to Evaluate Performance +------------------------------------ + +Use :method:`explain ` to to return statistics on the +query, including what index MongoDB selected to fulfill the query. -Use :method:`explain ` to get more information on the -performance of your queries. +.. todo Link to Kay's new explain doc -Use :operator:`hint <$hint>` to Select a Particular Index ---------------------------------------------------------- +Use $hint to Select a Particular Index +-------------------------------------- In most cases the :ref:`query optimizer ` selects the best index. But @@ -116,6 +132,12 @@ Use MongoDB's :operator:`$inc` operator to increment fields. The operator increments fields on the server side, which can be much faster than updating a document on the client side. +Typically, these operations are also fast because they don't require: + +- Sending data between the clients. + +- Moving the record in the data store. + Perform Server Side Code Execution ---------------------------------- @@ -128,8 +150,8 @@ Use Capped Collections :doc:`/core/capped-collections` are circular fixed-size collections that keep documents well-ordered, even without the use of an index. This -means that capped collections can receive very high-speed reads and -writes. +means that capped collections can receive very high-speed writes and +sequential reads. These collections are particularly useful for keeping log files but are not limited to that purpose. Use capped collections where appropriate. @@ -140,4 +162,5 @@ Use Natural Order To return documents in the order they exist on disk, use the :operator:`$natural` operator. :term:`Natural order ` does not use indexes but can be fast for operations where the first or -last items on disk are required. +last items on disk are required, particularly to operations on capped +collections. From 6e503805d8777acdfb1eaa4f5b1ccac7e185aab2 Mon Sep 17 00:00:00 2001 From: Bob Grabar Date: Mon, 10 Dec 2012 13:47:27 -0500 Subject: [PATCH 3/5] DOCS-665 tech review edits --- source/applications/aggregation.txt | 2 + source/applications/optimization.txt | 95 ++++++++++++++++------------ 2 files changed, 56 insertions(+), 41 deletions(-) diff --git a/source/applications/aggregation.txt b/source/applications/aggregation.txt index 2beafac692c..e5edd8a06c6 100644 --- a/source/applications/aggregation.txt +++ b/source/applications/aggregation.txt @@ -181,6 +181,8 @@ Because you will always call :method:`aggregate` on a the aggregation pipeline, you may want to optimize the operation by avoiding scanning the entire collection whenever possible. +.. _aggregation-pipeline-operators-and-performance: + Pipeline Operators and Indexes ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/source/applications/optimization.txt b/source/applications/optimization.txt index 874b9613002..22da3b6bacd 100644 --- a/source/applications/optimization.txt +++ b/source/applications/optimization.txt @@ -11,11 +11,11 @@ This section describes techniques for optimizing database performance. Use Indexes ----------- -For a regularly issued query, create an index based on the query's -fields so that when you issue the query MongoDB can scan through the -documents on the index instead of those in the collection. Searching an -index is much faster than searching a collection. Documents on an index -are smaller, easier to parse, and already ordered. +For commonly issued queries, create :doc:`indexes `. If a +query searches multiple fields, create a :ref:`compound index +`. Scanning an index is much faster than scanning a +collection. Items in an index are ordered and smaller than the documents +they summarize. .. example:: If you have a ``posts`` collection containing blog posts, and if you regularly issue a query that sorts on the ``author_name`` @@ -26,30 +26,32 @@ are smaller, easier to parse, and already ordered. db.posts.ensureIndex( { author_name : 1 } ) -Indexes are pre-sorted, so also improve efficiency on queries that routinely -sort on a given field. +Indexes also improve efficiency on queries that routinely sort on a +given field. -.. example:: If in your ``posts`` collection you regularly issue a query - that sorts on the ``timestamp`` field: +.. example:: If you regularly issue a query that sorts on the + ``timestamp`` field, then you can optimize the query by creating an + index on the ``timestamp`` field: + + Creating this index: .. code-block:: javascript - db.posts.find().sort( { timestamp : -1 } ) + db.posts.ensureIndex( { timestamp : 1 } ) - Then you can optimize the query by creating an index on the - ``timestamp`` field: + Optimizes this query: .. code-block:: javascript - db.posts.ensureIndex( { timestamp : 1 } ) + db.posts.find().sort( { timestamp : -1 } ) -A single query can use only one index. If you commonly query on a -combination of fields, create a :ref:`compound index -`. +Direction on a single-key index does not matter. You can store the index +in either direction. -Indexes speed performance for any field that is part of the query -specification, including fields used by :doc:`aggregation operators -`. +In certain cases, indexes speed performance for fields used by +aggregation operators. See +:ref:`aggregation-pipeline-operators-and-performance` for more +information. Limit Results ------------- @@ -61,7 +63,7 @@ method. This is typically used in conjunction with sort operations. For example, if you need only 10 results from your query to the ``posts`` -database, you would issue the follOwing command: +collection, you would issue the following command: .. code-block:: javascript @@ -72,16 +74,16 @@ For more information on limiting results, see :method:`cursor.limit()` Use Projections to Return Only Necessary Data --------------------------------------------- -When you need only a subset of fiElds from documents, you can achieve better +When you need only a subset of fields from documents, you can achieve better performance by returning only the fields you need: -For example, if in your query to the ``posts`` database, you need only +For example, if in your query to the ``posts`` collection, you need only the ``timestamp``, ``title``, ``author``, and ``abstract`` fields, you would issue the following command: .. code-block:: javascript - db.posts.find( {}, { timestamp : 1 , title : 1 , author : 1 , abstract : 1} ).sort( { timestamp : -1 } ).limit(10) + db.posts.find( {}, { timestamp : 1 , title : 1 , author : 1 , abstract : 1} ).sort( { timestamp : -1 } ) For more information on using projections, see :ref:`read-operations-projection`. @@ -89,16 +91,16 @@ For more information on using projections, see Use the Database Profiler to Evaluate Performance ------------------------------------------------- -.. todo Add link below to Database Profiler when that doc is migrated. - The link below to `database-profiling` is NOT the link to the DP doc. +.. todo Add link below: :doc:`database profiler ` -MongoDB provides a :doc:`database profiler ` that shows performance +MongoDB provides a database profiler that shows performance characteristics of each operation against the database. Use the profiler to locate any queries or write operations that are running slow. You can used this information, for example, to determine what indexes to create. -For more information, see :doc:`/tutorial/manage-the-database-profiler` -and :ref:`database-profiling`. +.. todo Add below: , see :doc:`/tutorial/manage-the-database-profiler` and ... + +For more information, see :ref:`database-profiling`. Use db.currentOp() to Evaluate Performance ------------------------------------------ @@ -110,7 +112,7 @@ running on a :program:`mongod` instance. For more information, see Use $explain to Evaluate Performance ------------------------------------ -Use :method:`explain ` to to return statistics on the +Use :method:`explain ` to return statistics on the query, including what index MongoDB selected to fulfill the query. .. todo Link to Kay's new explain doc @@ -122,33 +124,44 @@ In most cases the :ref:`query optimizer ` selects the best index. But :method:`hint ` might help when you query on multiple fields that are indexed in separate indexes. You can :method:`hint -` to select the index to use, potentially improving +` to specify the index to use, potentially improving performance. -Use the Increment Operator to Perform Operations on the Server Side -------------------------------------------------------------------- +Use the Increment Operator to Perform Operations Server-Side +------------------------------------------------------------ Use MongoDB's :operator:`$inc` operator to increment fields. The operator increments fields on the server side, which can be much faster -than updating a document on the client side. +than updating a document on the client side. Specifically, using +:operator:`$inc` is much faster than selecting a document, incrementing +a field in your application, and then writing the entire document back +to the server. -Typically, these operations are also fast because they don't require: +.. note:: Two threads using :operator:`$inc` to increment the same value + can cause a race condition. -- Sending data between the clients. +Typically these operations also are fast because they don't require +sending data between the clients. -- Moving the record in the data store. +.. DELETED: nor moving the record in the data store, as would occur if incrementing + the value increased the size of the document on disk. -Perform Server Side Code Execution +Perform Server-Side Code Execution ---------------------------------- -When appropriate, perform an operation on the database server to -eliminate client/server network turnarounds. For more information, see -:wiki:`Server-side+Code+Execution`. +Occasionally, for maximum performance, you might want to perform an +operation on the database server to eliminate client/server network +turnarounds. For example, if you want to remove a field from all +documents in a collection, performing the operation directly on the +server is more efficient than transmitting the collection to your client +and back again. + +For more information, see :wiki:`Server-side+Code+Execution`. Use Capped Collections ---------------------- -:doc:`/core/capped-collections` are circular fixed-size collections that +:doc:`/core/capped-collections` are circular, fixed-size collections that keep documents well-ordered, even without the use of an index. This means that capped collections can receive very high-speed writes and sequential reads. From a906cab8d17d17be3047204fc601448e379f7960 Mon Sep 17 00:00:00 2001 From: Bob Grabar Date: Wed, 12 Dec 2012 14:21:45 -0500 Subject: [PATCH 4/5] DOCS-665 minor --- source/applications/optimization.txt | 13 ++++--------- 1 file changed, 4 insertions(+), 9 deletions(-) diff --git a/source/applications/optimization.txt b/source/applications/optimization.txt index 22da3b6bacd..c3201698325 100644 --- a/source/applications/optimization.txt +++ b/source/applications/optimization.txt @@ -137,14 +137,9 @@ than updating a document on the client side. Specifically, using a field in your application, and then writing the entire document back to the server. -.. note:: Two threads using :operator:`$inc` to increment the same value - can cause a race condition. - -Typically these operations also are fast because they don't require -sending data between the clients. - -.. DELETED: nor moving the record in the data store, as would occur if incrementing - the value increased the size of the document on disk. +The :operator:`$inc` operator also avoids race conditions, as would +occur if two threads simultaneously queried for a document, manually +incremented a field, and saved the entire document back. Perform Server-Side Code Execution ---------------------------------- @@ -156,7 +151,7 @@ documents in a collection, performing the operation directly on the server is more efficient than transmitting the collection to your client and back again. -For more information, see :wiki:`Server-side+Code+Execution`. +For more information, see :wiki:`Server-side Code Execution `. Use Capped Collections ---------------------- From 0989755cbdadc324d8ff318bd04546dca4f5d1d5 Mon Sep 17 00:00:00 2001 From: Bob Grabar Date: Thu, 13 Dec 2012 16:13:32 -0500 Subject: [PATCH 5/5] DOCS-665 type --- source/applications/optimization.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/source/applications/optimization.txt b/source/applications/optimization.txt index c3201698325..3fa8dc9018f 100644 --- a/source/applications/optimization.txt +++ b/source/applications/optimization.txt @@ -96,7 +96,7 @@ Use the Database Profiler to Evaluate Performance MongoDB provides a database profiler that shows performance characteristics of each operation against the database. Use the profiler to locate any queries or write operations that are running slow. You can -used this information, for example, to determine what indexes to create. +use this information, for example, to determine what indexes to create. .. todo Add below: , see :doc:`/tutorial/manage-the-database-profiler` and ...