diff --git a/source/administration/monitoring.rst b/source/administration/monitoring.rst index 9ac40fa5592..70b2f7583e9 100644 --- a/source/administration/monitoring.rst +++ b/source/administration/monitoring.rst @@ -30,6 +30,8 @@ shard clusters. - :doc:`/reference/database-statistics` - :doc:`/reference/collection-statistics` + TODO: Do we want to include printShardingStatus? + MongoDB provides a :ref:`REST interface ` that displays an overview of this data in web-access able interface. @@ -74,7 +76,7 @@ that activity and use match expectations. .. seealso:: ":doc:`/reference/mongotop`." -:program:`monostat` +:program:`mongostat` ``````````````````` :program:`mongostat` captures and returns counters of database @@ -96,7 +98,8 @@ and monitoring information in a simple web page. Enable this by setting :setting:`rest` to ``true``, and access this page via the local host interface using the port numbered 1000 more than that the database port. In default configurations the REST interface is -accessible on ``28017``. +accessible on ``28017``. For example, to access the REST interface on a +locally running mongod instance: http://localhost:28017 Statistics ~~~~~~~~~~ @@ -114,8 +117,8 @@ serverStatus Access :doc:`serverStatus data ` by way of the :dbcommand:`serverStatus` command. This :term:`JSON document` contains a general overview of the state of the database, including -disk usage, memory use, connection, journaling, access. The command -returns quickly and does not impact MongoDB performance. +disk usage, memory use, connection, journaling, index accesses. The +command returns quickly and does not impact MongoDB performance. While this output contains a (nearly) complete account of the state of a MongoDB instance, in most cases you will not run this command @@ -146,7 +149,9 @@ a document that contains data reflecting the amount of storage used and data contained in the database, as well as object, collection, and index counters among other relevant information. 
Use this data to track the state and size of a specific database, to compare -utilization between databases, or to determine average object size. +utilization between databases, to determine average object size. + +TODO: clarify the last sentence. .. seealso:: ":func:`db.stats()`" and ":doc:`/reference/database-statistics`." @@ -179,6 +184,10 @@ other situations, performance issues may indicate that the database may be operating at capacity and that it's time to add additional capacity to the database. +TODO: what about mentioning poor/inappropriate indexing, bad schema +design or data access patterns? + + Locks ~~~~~ @@ -189,7 +198,7 @@ related slow downs can be intermittent, look to the data in the :ref:`globalLock` section of the :dbcommand:`serverStatus` response to asses if the lock has been a challenge to your performance. If :status:`globalLock.currentQueue.total` is consistently high, then -there is a chance that a large number of requests waiting for a +there is a chance that a large number of requests are waiting for a lock. This indicates a possible concurrency issue that might effect performance. @@ -209,8 +218,8 @@ Memory Usage Because MongoDB uses memory mapped files to store data, given a data set of sufficient size, the MongoDB process will allocate all memory available on the system for its use. Because of the way operating -systems, the amount of allocated RAM is not a useful reflection of -MongoDB's state. +systems function, the amount of allocated RAM is not a useful reflection +of MongoDB's state. While this is part of the design, and affords MongoDB superior performance, the memory mapped files make it difficult to determine if @@ -219,12 +228,12 @@ the amount of RAM is sufficient for the data set. Consider MongoDB's memory utilization. Check the resident memory use (i.e.
:status:`mem.resident`:) if this exceeds the amount of system memory *and* there's a significant amount of data on disk that isn't -in RAM, you have exceeded the capacity of your system. +in RAM, you may have exceeded the capacity of your system. Also check the amount of mapped memory (i.e. :status:`mem.mapped`.) If this value is greater than the amount of system memory, some -operations will require disk access to read data from virtual memory -with deleterious effects on performance. +operations will require disk access :term:`page faults` to read data +from virtual memory with deleterious effects on performance. .. _administration-monitoring-page-faults: @@ -237,11 +246,13 @@ check for page faults, see the :status:`extra_info.page_faults` value in the :dbcommand:`serverStatus` command. This data is only available on Linux systems. -Alone page faults minor and complete quickly; however, in aggregate, +Alone, page faults are minor and complete quickly; however, in aggregate, large numbers of page fault typically indicate that MongoDB is reading too much data from disk and can indicate a number of underlying causes and recommendations. +TODO: mention MongoDB's use of yield on fault to alleviate this concurrency issue + If possible, increasing the amount of RAM accessible to MongoDB may help reduce the number of page faults. If this is not possible, for some deployments consider increasing the size of your :term:`replica @@ -250,11 +261,13 @@ the replica sets; for other deployments, add one or more :term:`shards ` to a :term:`shard cluster` to distribute load among MongoDB instances. +TODO: Not sure we want to mention secondary reads as a solution for page faults. + Number of Connections ~~~~~~~~~~~~~~~~~~~~~ In some cases, the number of connections between the application layer -(i.e. clients) and the database, this can overwhelm the ability of the +(i.e. 
clients) and the database can overwhelm the ability of the server to handle requests which can produce performance irregularities. Check the following fields in the :doc:`serverStatus ` document: @@ -270,10 +283,13 @@ irregularities. Check the following fields in the :doc:`serverStatus - :status:`connections.available` the total number of unused collections available for new clients. +TODO: Mention the ulimit setting which can limit the max number of conns +TODO: Mention the max number of conns is 20k + If requests are high because there are many concurrent application requests, the database may have trouble keeping up with demand. If -this is the case, then you will need increase the capacity of your -deployment. For read-heavy applications Increase the size of your +this is the case, then you will need to increase the capacity of your +deployment. For read-heavy applications, increase the size of your :term:`replica set` and distribute read operations to :term:`secondary` members. For write heavy applications, deploy :term:`sharding` and add one or more :term:`shards ` to a @@ -285,6 +301,8 @@ application or driver errors. Extremely high numbers of connections, particularly without corresponding workload is often indicative of a driver or other configuration error. +TODO: Mention connection pooling in the 10gen official drivers. + .. _database-profiling: Database Profiling @@ -299,8 +317,13 @@ inefficient queries and operations. Enable the profiler by setting the db.setProfilingLevel(1) +TODO: Add examples of setting profile level with ms:w +TODO: mention ability to getProfilingLevel() and that it doesn't return the ms... + The following profiling levels are available: +TODO: Can you make this chart render with one line per row? + ========= ================================== **Level** **Setting** --------- ---------------------------------- @@ -309,6 +332,8 @@ The following profiling levels are available: 2 On. Includes all operations.
========= ================================== +TODO: make this note have a line break b/w chart and note + .. note:: Because the database profiler can have an impact on the @@ -319,6 +344,8 @@ The following profiling levels are available: setting will not propagate across a :term:`replica set` or :term:`shard cluster`. +TODO: mention that profiling does not persist b/w restarts for any mongod + See the output of the profiler in the ``system.profile`` collection of your database. You can specify the :setting:`slowms` to set a threshold above which the profiler considers operations "slow" and @@ -343,6 +370,8 @@ Ensure that the value specified here (i.e. ``100``) is above the .. _replica-set-monitoring: +TODO: Is there more info on profiling? + Replication and Monitoring -------------------------- @@ -359,9 +388,10 @@ however, as replication lag grows two significant problems emerge: integrity of your data set. - Second, if the replication lag exceeds the length of the operation - log (":term:`oplog`") then secondary will have to resync from the - master. In normal circumstances this is uncommon given the typical - size of the oplog, but presents a major problem. + log (":term:`oplog`") then the secondary will have to resync all data + from the :term:`primary` and rebuild all indexes. In normal + circumstances this is uncommon given the typical size of the oplog, + but presents a major problem. Replication issues are most often the result of network connectivity issues between members or a :term:`primary` instance that does not @@ -378,10 +408,12 @@ depth overview view of this output. In general watch the value of :status:`optimeDate`. Pay particular attention to the difference in time between the :term:`primary` and the :term:`secondary` members. +TODO: This needs to be reworked. 
It's not configurable at runtime after +the first run: http://www.mongodb.org/display/DOCS/Replication+Oplog+Length The size of the operation log is configurable at runtime using the :option:`--oplogSize ` argument to the :program:`mongod` command, or preferably the :setting:`oplogSize` in -the MongoDB configuration file. The default size, is typically 5% of +the MongoDB configuration file. The default size is 5% of total available disk space on 64-bit systems. .. seealso:: ":doc:`/tutorial/change-oplog-size`" @@ -417,10 +449,10 @@ Balancing and Chunk Distribution ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The most effective :term:`shard clusters ` require that -:term:`chunks ` migrate between the shards. MongoDB has a background -:term:`balancer` process that distributes data such that chunks are -always optimally distributed among the :term:`shards `. Issue -the :func:`db.printShardingStatus()` or :func:`sh.status()` +:term:`chunks ` are evenly balanced between the shards. MongoDB +has a background :term:`balancer` process that distributes data such that +chunks are always optimally distributed among the :term:`shards `. +Issue the :func:`db.printShardingStatus()` or :func:`sh.status()` command to the :program:`mongos` by way of the :program:`mongo` shell. This returns an overview of the shard cluster including the database name, and a list of the chunks. @@ -433,16 +465,15 @@ released when they become stale. However, because any long lasting lock can block future balancing, it's important to insure that all locks are legitimate. To check the lock status of the database, connect to a :program:`mongos` instance using the :program:`mongo` -shell connected to one of the configuration server. Issue the -following command sequence to switch to the ``config`` database and -display all outstanding locks on the shard database: +shell. Issue the following command sequence to switch to the +``config`` database and display all outstanding locks on the shard database: ..
code-block:: javascript use config db.locks.find() -For active deployments, the above query might return an useful result +For active deployments, the above query might return a useful result set. The balancing process, which originates on a randomly selected :program:`mongos`, takes a special "balancer" lock that prevents other balancing activity from transpiring. Use the following command, also diff --git a/source/reference/database-statistics.rst b/source/reference/database-statistics.rst index 9f159e9fa78..7b69cdf07af 100644 --- a/source/reference/database-statistics.rst +++ b/source/reference/database-statistics.rst @@ -8,17 +8,17 @@ Database Statistics Reference Synopsis -------- -MongoDB can report data reflecting the current state of the current -database. In this context "database," refers to a single MongoDB -database. To run :dbcommand:`dbStats` issue a command in the shell that -resembles the following: +TODO: not so sure about my working change below... +MongoDB can report data reflecting the current state of the currently +"active" database. In this context "database," refers to a single MongoDB +database. To run :dbcommand:`dbStats` issue this command in the shell: .. code-block:: javascript db.runCommand( { dbStats: 1 } ) -The :program:`mongo` shell provides the :func:`db.stats()` as a -helper. Use the following form: +The :program:`mongo` shell provides the helper function :func:`db.stats()`. +Use the following form: .. code-block:: javascript @@ -66,7 +66,7 @@ Fields .. stats:: avgObjSize - The average size of each object. The scale factor affects this + The average size of each object. The scaling factor affects this value. .. stats:: dataSize @@ -104,6 +104,7 @@ Fields .. stats:: nsSizeMB +TODO: this definition isn't right The total size of the data database files (i.e. that end with ``.ns``). This includes preallocated space and the :term:`padding factor`. 
diff --git a/source/reference/server-status.rst b/source/reference/server-status.rst index fe0da7e88eb..4a64086fc82 100644 --- a/source/reference/server-status.rst +++ b/source/reference/server-status.rst @@ -11,6 +11,9 @@ catalogs each datum included in the output of this command and provides context for using this data to more effectively administer your database. +TODO: Maybe mention that much of this information is displayed in a dynamic +manner by the mongostat command + Basic Information ----------------- @@ -54,17 +57,20 @@ Basic Information globalLock ---------- +TODO: How will this section change in 2.2 with concurrency changes? All global +lock will fork with version + .. status:: globalLock The :status:`globalLock` data structure contains information regarding the database's current lock state, historical lock status, current operation queue, and the number of active clients. -.. status:: globalLock.toalTime +.. status:: globalLock.totalTime The value of :status:`globalLock.totalTime` represents the time, in - microseconds, since the database last started, that the - :status:`globalLock` has existed. + microseconds, since the database last started and the + :status:`globalLock` was created. Larger values indicate that the database has been unavailable for more time; however, :status:`uptime` provides context for this @@ -262,12 +268,12 @@ extra_info .. status:: extra_info.heap_usage_bytes The :status:`extra_info.heap_usage_bytes` field is only available on - Linux systems, and relates the total size in bytes of heap space + Unix/Linux systems, and relates the total size in bytes of heap space used by the database process. .. status:: extra_info.page_faults - The :status:`extra_info.page_faults` field is only available on Linux + The :status:`extra_info.page_faults` field is only available on Unix/Linux systems, and relates the total number of page faults that require disk operations.
Page faults refer to operations that require the database server to access data which isn't available in active @@ -497,10 +503,10 @@ repl See :doc:`/core/replication` for more information on replication. -optcounters +opcounters ----------- -.. status:: optcounters +.. status:: opcounters The :status:`opcounters` data structure provides an overview of database operations by type and makes it possible to analyze the @@ -509,38 +515,38 @@ optcounters These numbers will grow over time and in response to database use. Analyze these values over time to track database utilization. -.. status:: optcounters.insert +.. status:: opcounters.insert :status:`opcounters.insert` provides a counter of the total number of insert operations since the :program:`mongod` instance last started. -.. status:: optcounters.query +.. status:: opcounters.query :status:`opcounters.query` provides a counter of the total number of queries since the :program:`mongod` instance last started. -.. status:: optcounters.update +.. status:: opcounters.update :status:`opcounters.update` provides a counter of the total number of update operations since the :program:`mongod` instance last started. -.. status:: optcounters.delete +.. status:: opcounters.delete :status:`opcounters.delete` provides a counter of the total number of delete operations since the :program:`mongod` instance last started. -.. status:: optcounters.getmore +.. status:: opcounters.getmore :status:`opcounters.getmore` provides a counter of the total number of "getmore" operations since the :program:`mongod` instance last - started. On a primary node, this counter can be high even if the - query count is low. Secondary nodes send ``getMore`` operations to - the primary node as part of the replication process. + started. This counter can be high even if the query count is low. + Secondary nodes send ``getMore`` operations as part of the replication + process. -.. status:: optcounters.command +.. 
status:: opcounters.command :status:`opcounters.command` provides a counter of the total number of commands issued to the database since the :program:`mongod` @@ -713,3 +719,4 @@ Other Statuses The value of :status:`writeBacksQueued` is "``true``" when there are operations from a :program:`mongos` instance queued for retrying. Typically this option is false. + TODO: should we have a glossary entry for writeBacks?