DOCS-139 #9

Merged
merged 3 commits into from
Mar 9, 2012
85 changes: 58 additions & 27 deletions source/administration/monitoring.rst
@@ -30,6 +30,8 @@ shard clusters.
- :doc:`/reference/database-statistics`
- :doc:`/reference/collection-statistics`

TODO: Do we want to include printShardingStatus?

MongoDB provides a :ref:`REST interface <rest-interface>` that
displays an overview of this data in a web-accessible interface.

@@ -74,7 +76,7 @@ that activity and use match expectations.

.. seealso:: ":doc:`/reference/mongotop`."

:program:`monostat`
:program:`mongostat`
```````````````````

:program:`mongostat` captures and returns counters of database
@@ -96,7 +98,8 @@ and monitoring information in a simple web page. Enable this by
setting :setting:`rest` to ``true``, and access this page via the
local host interface using the port numbered 1000 more than the
database port. In default configurations the REST interface is
accessible on ``28017``.
accessible on ``28017``. For example, to access the REST interface on a
locally running mongod instance: http://localhost:28017
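
As a small illustration of the port arithmetic described above, the following sketch derives the REST interface URL from a database port. The helper name ``restInterfaceUrl`` and its ``host`` parameter are hypothetical and exist only for this example; the "database port plus 1000" rule and the ``28017`` default come from the text.

```javascript
// Hypothetical helper (not part of MongoDB): the REST interface listens
// on the port numbered 1000 more than the database port.
function restInterfaceUrl(host, dbPort) {
  const REST_PORT_OFFSET = 1000; // offset described in the text above
  return "http://" + host + ":" + (dbPort + REST_PORT_OFFSET);
}

// With the default database port of 27017, the REST interface is on 28017.
console.log(restInterfaceUrl("localhost", 27017));
```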

Statistics
~~~~~~~~~~
@@ -114,8 +117,8 @@ serverStatus
Access :doc:`serverStatus data </reference/server-status/>` by way of
the :dbcommand:`serverStatus` command. This :term:`JSON document`
contains a general overview of the state of the database, including
disk usage, memory use, connection, journaling, access. The command
returns quickly and does not impact MongoDB performance.
disk usage, memory use, connection, journaling, index accesses. The
command returns quickly and does not impact MongoDB performance.

While this output contains a (nearly) complete account of the state of
a MongoDB instance, in most cases you will not run this command
@@ -146,7 +149,9 @@ a document that contains data reflecting the amount of storage used
and data contained in the database, as well as object, collection, and
index counters among other relevant information. Use this data to
track the state and size of a specific database, to compare
utilization between databases, or to determine average object size.
utilization between databases, to determine average object size.

TODO: clarify the last sentence.

.. seealso:: ":func:`db.stats()`" and
":doc:`/reference/database-statistics`."
@@ -179,6 +184,10 @@ other situations, performance issues may indicate that the database
may be operating at capacity and that it's time to add additional
capacity to the database.

TODO: what about mentioning poor/inappropriate indexing, bad schema
design or data access patterns


Locks
~~~~~

@@ -189,7 +198,7 @@ related slow downs can be intermittent, look to the data in the
:ref:`globalLock` section of the :dbcommand:`serverStatus` response to
assess if the lock has been a challenge to your performance. If
:status:`globalLock.currentQueue.total` is consistently high, then
there is a chance that a large number of requests waiting for a
there is a chance that a large number of requests are waiting for a
lock. This indicates a possible concurrency issue that might affect
performance.
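
One way to judge whether the queue is "consistently high" is to poll the status document several times and compare every sample against a threshold. The sketch below is a minimal illustration under assumptions: the function name, the sample data, and the threshold of ``10`` are all hypothetical, and each sample is assumed to be shaped like a :dbcommand:`serverStatus` response.

```javascript
// Hypothetical helper: returns true only if every polled sample shows
// globalLock.currentQueue.total above the (illustrative) threshold.
function isLockQueueConsistentlyHigh(samples, threshold) {
  if (samples.length === 0) {
    return false; // no data, so nothing can be concluded
  }
  return samples.every(function (status) {
    return status.globalLock.currentQueue.total > threshold;
  });
}

// Made-up samples standing in for repeated serverStatus responses.
const lockSamples = [
  { globalLock: { currentQueue: { total: 42 } } },
  { globalLock: { currentQueue: { total: 57 } } },
  { globalLock: { currentQueue: { total: 49 } } }
];

console.log(isLockQueueConsistentlyHigh(lockSamples, 10));
```

A single high reading is not treated as a problem here, because the text notes that lock-related slowdowns can be intermittent.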

@@ -209,8 +218,8 @@ Memory Usage
Because MongoDB uses memory mapped files to store data, given a data
set of sufficient size, the MongoDB process will allocate all memory
available on the system for its use. Because of the way operating
systems, the amount of allocated RAM is not a useful reflection of
MongoDB's state.
systems function, the amount of allocated RAM is not a useful reflection
of MongoDB's state.

While this is part of the design, and affords MongoDB superior
performance, the memory mapped files make it difficult to determine if
@@ -219,12 +228,12 @@ the amount of RAM is sufficient for the data set. Consider
MongoDB's memory utilization. Check the resident memory use
(i.e. :status:`mem.resident`): if this exceeds the amount of system
memory *and* there's a significant amount of data on disk that isn't
in RAM, you have exceeded the capacity of your system.
in RAM, you may have exceeded the capacity of your system.

Also check the amount of mapped memory (i.e. :status:`mem.mapped`.) If
this value is greater than the amount of system memory, some
operations will require disk access to read data from virtual memory
with deleterious effects on performance.
operations will require disk access (i.e. :term:`page faults`) to read
data from virtual memory, with deleterious effects on performance.
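
The two comparisons above can be sketched as follows. This is an illustration only: ``assessMemory`` is a hypothetical helper, the figures are made up, and the ``mem`` values are assumed to be reported in megabytes, so the system memory figure must use the same unit.

```javascript
// Hypothetical helper: compares the mem section of a serverStatus-shaped
// document against the amount of system memory (both in megabytes).
function assessMemory(mem, systemMemoryMB) {
  return {
    // Resident memory above system memory suggests capacity may be exceeded.
    residentExceedsSystem: mem.resident > systemMemoryMB,
    // Mapped memory above system memory means some reads will page fault.
    mappedExceedsSystem: mem.mapped > systemMemoryMB
  };
}

// Made-up figures: 8 GB of RAM, 3.5 GB resident, 12 GB mapped.
const memoryCheck = assessMemory({ resident: 3500, mapped: 12000 }, 8192);
console.log(memoryCheck);
```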

.. _administration-monitoring-page-faults:

@@ -237,11 +246,13 @@ check for page faults, see the :status:`extra_info.page_faults` value
in the :dbcommand:`serverStatus` command. This data is only available
on Linux systems.

Alone page faults minor and complete quickly; however, in aggregate,
Alone, page faults are minor and complete quickly; however, in aggregate,
large numbers of page faults typically indicate that MongoDB is reading
too much data from disk, which can have a number of underlying causes
and remedies.

TODO: mention MongoDB's use of yield on fault to alleviate this concurrency issue

If possible, increasing the amount of RAM accessible to MongoDB may
help reduce the number of page faults. If this is not possible, for
some deployments consider increasing the size of your :term:`replica
@@ -250,11 +261,13 @@ the replica sets; for other deployments, add one or more :term:`shards
<shard>` to a :term:`shard cluster` to distribute load among MongoDB
instances.

TODO: Not sure we want to mention secondary reads as a solution for page faults.

Number of Connections
~~~~~~~~~~~~~~~~~~~~~

In some cases, the number of connections between the application layer
(i.e. clients) and the database, this can overwhelm the ability of the
(i.e. clients) and the database can overwhelm the ability of the
server to handle requests which can produce performance
irregularities. Check the following fields in the :doc:`serverStatus
</reference/server-status>` document:
@@ -270,10 +283,13 @@ irregularities. Check the following fields in the :doc:`serverStatus
- :status:`connections.available` the total number of unused
  connections available for new clients.
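
As a rough illustration of how these fields might be read together, the sketch below computes the share of connections currently in use. The helper name ``connectionUtilization`` and the sample values are hypothetical; only the ``current`` and ``available`` field names come from the status document.

```javascript
// Hypothetical helper: fraction of the connection capacity in use, where
// capacity is taken to be current + available connections.
function connectionUtilization(connections) {
  const capacity = connections.current + connections.available;
  if (capacity === 0) {
    return 0; // avoid dividing by zero on an empty document
  }
  return connections.current / capacity;
}

// Made-up values standing in for the connections section of serverStatus.
console.log(connectionUtilization({ current: 200, available: 600 }));
```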

TODO: Mention the ulimit setting which can limit the max number of conns
TODO: Mention the max number of conns is 20k

If requests are high because there are many concurrent application
requests, the database may have trouble keeping up with demand. If
this is the case, then you will need increase the capacity of your
deployment. For read-heavy applications Increase the size of your
this is the case, then you will need to increase the capacity of your
deployment. For read-heavy applications, increase the size of your
:term:`replica set` and distribute read operations to
:term:`secondary` members. For write-heavy applications, deploy
:term:`sharding` and add one or more :term:`shards <shard>` to a
@@ -285,6 +301,8 @@ application or driver errors. Extremely high numbers of connections,
particularly without a corresponding workload, are often indicative of a
driver or other configuration error.

TODO: Mention connection pooling in the 10gen official drivers..

.. _database-profiling:

Database Profiling
@@ -299,8 +317,13 @@ inefficient queries and operations. Enable the profiler by setting the

db.setProfilingLevel(1)

TODO: Add examples of setting profile level with ms:w
TODO: mention ability to getProfilingLevel() and that it doesn't return the ms...

The following profiling levels are available:

TODO: Can you make this chart render with one line per row?

========= ==================================
**Level** **Setting**
--------- ----------------------------------
@@ -309,6 +332,8 @@ The following profiling levels are available:
2 On. Includes all operations.
========= ==================================

TODO: make this note have a line break b/w chart and note

.. note::

Because the database profiler can have an impact on the
@@ -319,6 +344,8 @@ The following profiling levels are available:
setting will not propagate across a :term:`replica set` or
:term:`shard cluster`.

TODO: mention that profiling does not persist b/w restarts for any mongod

See the output of the profiler in the ``system.profile`` collection of
your database. You can specify the :setting:`slowms` to set a
threshold above which the profiler considers operations "slow" and
@@ -343,6 +370,8 @@ Ensure that the value specified here (i.e. ``100``) is above the

.. _replica-set-monitoring:

TODO: Is there more info on profiling?

Replication and Monitoring
--------------------------

@@ -359,9 +388,10 @@ however, as replication lag grows two significant problems emerge:
integrity of your data set.

- Second, if the replication lag exceeds the length of the operation
log (":term:`oplog`") then secondary will have to resync from the
master. In normal circumstances this is uncommon given the typical
size of the oplog, but presents a major problem.
log (":term:`oplog`") then the secondary will have to resync all data
from the :term:`primary` and rebuild all indexes. In normal
circumstances this is uncommon given the typical size of the oplog,
but presents a major problem.

Replication issues are most often the result of network connectivity
issues between members or a :term:`primary` instance that does not
@@ -378,10 +408,12 @@ depth overview view of this output. In general watch the value of
:status:`optimeDate`. Pay particular attention to the difference in
time between the :term:`primary` and the :term:`secondary` members.
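
The comparison described above can be sketched as a subtraction of :status:`optimeDate` values. The example is a minimal illustration under assumptions: ``replicationLagSeconds`` is a hypothetical helper, the host names and timestamps are made up, and the input is assumed to resemble the ``members`` array of replica set status output, with ``name``, ``stateStr``, and ``optimeDate`` fields.

```javascript
// Hypothetical helper: for each secondary, report how many seconds its
// optimeDate trails the primary's optimeDate.
function replicationLagSeconds(members) {
  const primary = members.find(function (m) {
    return m.stateStr === "PRIMARY";
  });
  if (!primary) {
    return null; // without a primary there is nothing to compare against
  }
  const lags = {};
  members.forEach(function (m) {
    if (m.stateStr === "SECONDARY") {
      lags[m.name] =
        (primary.optimeDate.getTime() - m.optimeDate.getTime()) / 1000;
    }
  });
  return lags;
}

// Made-up members standing in for replica set status output.
const exampleMembers = [
  { name: "db0.example.net:27017", stateStr: "PRIMARY",
    optimeDate: new Date("2012-03-09T12:00:30Z") },
  { name: "db1.example.net:27017", stateStr: "SECONDARY",
    optimeDate: new Date("2012-03-09T12:00:25Z") },
  { name: "db2.example.net:27017", stateStr: "SECONDARY",
    optimeDate: new Date("2012-03-09T12:00:00Z") }
];

console.log(replicationLagSeconds(exampleMembers));
```

A lag that keeps growing between polls, rather than any single reading, is the signal worth acting on.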

TODO: This needs to be reworked. It's not configurable at runtime after
the first run: http://www.mongodb.org/display/DOCS/Replication+Oplog+Length
The size of the operation log is configurable at runtime using the
:option:`--oplogSize <mongod --oplogSize>` argument to the
:program:`mongod` command, or preferably the :setting:`oplogSize` in
the MongoDB configuration file. The default size, is typically 5% of
the MongoDB configuration file. The default size is 5% of total available
disk space on 64-bit systems.

.. seealso:: ":doc:`/tutorial/change-oplog-size`"
@@ -417,10 +449,10 @@ Balancing and Chunk Distribution
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The most effective :term:`shard clusters <shard cluster>` require that
:term:`chunks <chunk>` migrate between the shards. MongoDB has a background
:term:`balancer` process that distributes data such that chunks are
always optimally distributed among the :term:`shards <shard>`. Issue
the :func:`db.printShardingStatus()` or :func:`sh.status()`
:term:`chunks <chunk>` are evenly balanced between the shards. MongoDB
has a background :term:`balancer` process that distributes data such that
chunks are always optimally distributed among the :term:`shards <shard>`.
Issue the :func:`db.printShardingStatus()` or :func:`sh.status()`
command to the :program:`mongos` by way of the :program:`mongo`
shell. This returns an overview of the shard cluster including the
database name, and a list of the chunks.
@@ -433,16 +465,15 @@ released when they become stale. However, because any long lasting
lock can block future balancing, it's important to ensure that all
locks are legitimate. To check the lock status of the database,
connect to a :program:`mongos` instance using the :program:`mongo`
shell connected to one of the configuration server. Issue the
following command sequence to switch to the ``config`` database and
display all outstanding locks on the shard database:
shell. Issue the following command sequence to switch to the
``config`` database and display all outstanding locks on the shard database:

.. code-block:: javascript

use config
db.locks.find()

For active deployments, the above query might return an useful result
For active deployments, the above query might return a useful result
set. The balancing process, which originates on a randomly selected
:program:`mongos`, takes a special "balancer" lock that prevents other
balancing activity from transpiring. Use the following command, also
15 changes: 8 additions & 7 deletions source/reference/database-statistics.rst
@@ -8,17 +8,17 @@ Database Statistics Reference
Synopsis
--------

MongoDB can report data reflecting the current state of the current
database. In this context "database," refers to a single MongoDB
database. To run :dbcommand:`dbStats` issue a command in the shell that
resembles the following:
TODO: not so sure about my working change below...
MongoDB can report data reflecting the current state of the currently
"active" database. In this context "database," refers to a single MongoDB
database. To run :dbcommand:`dbStats` issue this command in the shell:

.. code-block:: javascript

db.runCommand( { dbStats: 1 } )

The :program:`mongo` shell provides the :func:`db.stats()` as a
helper. Use the following form:
The :program:`mongo` shell provides the helper function :func:`db.stats()`.
Use the following form:

.. code-block:: javascript

@@ -66,7 +66,7 @@ Fields

.. stats:: avgObjSize

The average size of each object. The scale factor affects this
The average size of each object. The scaling factor affects this
value.

.. stats:: dataSize
@@ -104,6 +104,7 @@ Fields

.. stats:: nsSizeMB

TODO: this definition isn't right
The total size of the data database files (i.e. that end with
``.ns``). This includes preallocated space and the :term:`padding
factor`.