diff --git a/source/core/sharded-clusters.txt b/source/core/sharded-clusters.txt
index ddbe22ffefa..2a0b9e93b29 100644
--- a/source/core/sharded-clusters.txt
+++ b/source/core/sharded-clusters.txt
@@ -125,7 +125,8 @@ balancing process from impacting production traffic.
To disable the balancer, see :ref:`sharding-balancing-disable-temporarily`.
-.. seealso:: :doc:`/tutorial/manage-sharded-cluster-balancer`.
+.. seealso:: :doc:`/tutorial/manage-sharded-cluster-balancer` and
+ :ref:`sharding-internals-balancing`.
.. note::
diff --git a/source/reference/glossary.txt b/source/reference/glossary.txt
index 03927b9a59a..eff75415112 100644
--- a/source/reference/glossary.txt
+++ b/source/reference/glossary.txt
@@ -2,184 +2,81 @@
Glossary
========
+.. NOTE: Several TODOs in this document.
+
.. default-domain:: mongodb
.. glossary::
:sorted:
- BSON
- A serialization format used to store documents and make remote
- procedure calls in MongoDB. "BSON" is a portmanteau of the words
- "binary" and "JSON". Think of BSON as a binary representation
- of JSON (JavaScript Object Notation) documents. For a detailed spec,
- see `bsonspec.org `_.
-
- .. seealso:: The :ref:`bson-json-type-conversion-fidelity`
- section.
-
- database command
- Any MongoDB operation other than an insert, update, remove,
- or query. MongoDB exposes commands as queries
- against the special :term:`$cmd` collection. For
- example, the implementation of :dbcommand:`count` for MongoDB is
- a command.
-
- .. seealso:: :doc:`/reference/command` for a full list of
- database commands in MongoDB
-
- operator
- A keyword beginning with a ``$`` used to express a complex
- query, update, or data transformation. For example, ``$gt``
- is the query language's "greater than" operator.
- See the :doc:`/reference/operator` for more
- information about the available operators.
-
- MongoDB
- The document-based database server described in this manual.
-
- document
- A record in a MongoDB collection, and the basic unit of data
- in MongoDB. Documents are analogous to JSON objects, but exist in
- the database in a more type-rich format known as :term:`BSON`.
-
- field
- A name-value pair in a :term:`document `. Documents have zero
- or more fields. Fields are analogous to columns in relational
- databases.
-
- database
- A physical container for :term:`collections `.
- Each database gets its own set of files on the file
- system. A single MongoDB server typically servers multiple
- databases.
-
- collection
- Collections are groupings of :term:`BSON` :term:`documents
- `. Collections do not enforce a schema, but they are
- otherwise mostly analogous to :term:`RDBMS` tables.
-
- The documents within a collection may not need the exact same
- set of fields, but typically all documents in a collection have
- a similar or related purpose for an application.
-
- All collections exist within a single :term:`database`. The
- namespace within a database for collections are flat.
-
- See :ref:`faq-dev-namespace` and :doc:`/core/document` for more
- information.
-
$cmd
- A virtual :term:`collection` that exposes :term:`MongoDB`'s
- :term:`database commands `.
-
- JSON
- JavaScript Object Notation. A human-readable, plain text format
- for expressing structured data with support in many programming
- languages.
-
- JSON document
- A :term:`JSON` document is a collection of fields and values in a
- structured format. The following is a sample :term:`JSON
- document` with two fields:
+ A special virtual :term:`collection` that exposes MongoDB's
+ :term:`database commands <database command>`. Commands in MongoDB
+ are implemented as queries performed on the ``$cmd`` collection.
+ To use database commands, see :ref:`issue-commands`.
- .. code-block:: javascript
+ _id
+ A field required in every MongoDB :term:`document`. The ``_id``
+ field must have a value unique within its collection. You can think
+ of the ``_id`` field as the document's :term:`primary key`. The
+ value of ``_id`` can be any :term:`BSON` data type. If you create
+ a new document without an ``_id`` field, MongoDB automatically
+ creates the field and assigns it the value of a BSON
+ :term:`ObjectId`.
+
+ .. TODO When the CRUD docs are migrated, add to the above entry a link
+ to the "The _id Field" section in /source/core/document.txt.
+ Also, add the link anchor to "The _id Field."
- { name: "MongoDB",
- type: "database" }
+ accumulator
+ An :term:`expression` in the :term:`aggregation framework` that
+ maintains state between documents in the aggregation
+ :term:`pipeline`. For a list of accumulator operations, see
+ :pipeline:`$group`.
admin database
- A privileged database named ``admin``. Users must have access
- to this database to run certain administrative commands.
- See :ref:`administrative commands `
- for more information
- and :ref:`admin-commands` for a list of these commands.
-
- replica set
- A cluster of MongoDB servers that implements master-slave
- replication and automated failover. MongoDB's recommended
- replication strategy.
-
- .. seealso:: :doc:`/replication`.
-
- replication
- A feature allowing multiple database servers to share the same
- data, thereby ensuring redundancy and facilitating load balancing.
- MongoDB supports two flavors of replication: master-slave replication
- and replica sets.
-
- .. seealso:: :term:`replica set`, :term:`sharding`,
- :doc:`/replication`.
-
- shard
- A single :program:`mongod` instance or a :term:`replica set`
- that stores some portion of a sharded cluster's
- total data set. In production, all shards should be replica sets.
- See :term:`sharding`.
-
- .. seealso:: The documents in the :doc:`/sharding` section of this manual.
-
- sharding
- A database architecture that enable horizontal scaling by splitting
- data into key ranges among two or more replica sets. This architecture
- is also known as "range-based partitioning." See :term:`shard`.
-
- .. seealso:: The documents in the :doc:`/sharding` section of this manual.
-
- sharded cluster
- The set of nodes comprising a :term:`sharded ` MongoDB deployment. A sharded cluster
- consists of three config processes, one or more replica sets, and one or more
- :program:`mongos` routing processes.
-
- .. seealso:: The documents in the :doc:`/sharding` section of this manual.
-
- partition
- A distributed system architecture that splits data into ranges.
- :term:`Sharding` is a kind of partitioning.
-
- split
- The division between :term:`chunks ` in a :term:`sharded
- cluster`.
-
- mongod
- The program implementing the MongoDB database server. This server
- typically runs as a :term:`daemon`.
-
- .. seealso:: :doc:`/reference/program/mongod`.
-
- mongos
- The routing and load balancing process that
- acts an interface between an application and
- a MongoDB :term:`sharded cluster`.
+ A privileged database that gives access to certain commands. Users
+ must have access to the ``admin`` database to run certain
+ administrative commands. For a list of administrative commands,
+ see :ref:`admin-commands`.
- .. seealso:: :doc:`/reference/program/mongos`.
+ aggregation
+ A function that reduces and summarizes large sets of data. SQL's
+ ``GROUP BY`` and MongoDB's map-reduce are two examples of aggregation
+ functions. For more information, see :doc:`/core/aggregation`.
- mongo
- The MongoDB Shell. ``mongo`` connects to :program:`mongod`
- and :program:`mongos` instances, allowing administration,
- management, and testing. :program:`mongo` has a JavaScript
- interface.
+ aggregation framework
+ The set of MongoDB operators that let you calculate aggregate
+ values without having to use :term:`map-reduce`. For a list of
+ operators, see :doc:`/reference/aggregation`.
- .. seealso:: :doc:`/reference/program/mongo` and :doc:`/reference/method`.
+ arbiter
+ A member of a :term:`replica set` that exists solely to vote in
+ :term:`elections <election>`. Arbiters do not replicate data. See
+ :ref:`replica-set-arbiter-configuration`.
- cluster
- A set of :program:`mongod` instances running in
- conjunction to increase database availability and
- performance. See :term:`sharding` and :term:`replication` for
- more information on the two different approaches to clustering with
- MongoDB.
+ balancer
+ An internal MongoDB process that runs in the context of a
+ :term:`sharded cluster` and manages the migration of :term:`chunks
+ <chunk>`. Administrators must disable the balancer for all
+ maintenance operations on a sharded cluster. See
+ :ref:`sharding-balancing`.
- capped collection
- A fixed-sized :term:`collection `. Once they reach
- their fixed size, capped collections automatically overwrite
- their oldest entries. MongoDB's :term:`oplog` replication mechanism depends on
- capped collections. Developers may also use capped collections in their
- applications.
+ BSON
+ A serialization format used to store documents and make remote
+ procedure calls in MongoDB. "BSON" is a portmanteau of the words
+ "binary" and "JSON". Think of BSON as a binary representation
+ of JSON (JavaScript Object Notation) documents.
- .. seealso:: The :doc:`/core/capped-collections` page.
+ For a detailed spec, see `bsonspec.org <http://bsonspec.org/>`_.
+ See also :ref:`bson-json-type-conversion-fidelity`.
BSON types
The set of types supported by the :term:`BSON` serialization
- format. The following types are available:
+ format.
+
+ .. TODO When the CRUD docs are migrated, remove the table below and
+ provide a link to /reference/bson-types.
======================= ==========
**Type** **Number**
@@ -204,239 +101,381 @@ Glossary
Max key 127
======================= ==========
- master
- In conventional master/:term:`slave` replication, the master
- database receives all writes. The :term:`slave` instances
- replicate from the master instance in real time.
+ B-tree
+ A data structure commonly used by database management systems to
+ store indexes. MongoDB uses B-trees for its indexes.
- slave
- In conventional :term:`master`/slave replication, slaves
- are read-only instances that replicate operations from the
- :term:`master` database. Data read from slave instances may
- not be completely consistent with the master. Therefore,
- applications requiring consistent reads must read from the
- master database instance.
+ capped collection
+ A fixed-size :term:`collection` that automatically
+ overwrites its oldest entries when it reaches its maximum size.
+ The MongoDB :term:`oplog` that is used in :term:`replication` is a
+ capped collection. See :doc:`/core/capped-collections`.
- primary
- In a :term:`replica set`, the primary member is the current
- :term:`master` instance, which receives all write operations.
+ checksum
+ A calculated value used to ensure data integrity.
+ The :term:`md5` algorithm is sometimes used as a checksum.
- secondary
- In a :term:`replica set`, the ``secondary`` members are the current
- :term:`slave` instances that replicate the contents of the
- master database. Secondary members may handle read requests, but only the
- :term:`primary` members can handle write operations.
+ chunk
+ A partition within a :term:`shard`. A chunk is defined by a
+ contiguous range of :term:`shard key` values that are a subset of
+ the shard's range. Chunk ranges are inclusive of the lower
+ boundary and exclusive of the upper boundary. MongoDB splits
+ chunks when they grow beyond the configured chunk size, which by
+ default is 64 megabytes. MongoDB migrates chunks when a shard
+ contains too many chunks relative to other shards. See
+ :doc:`/core/sharding-chunk-splitting` and
+ :ref:`sharding-chunk-migration`.
- GridFS
- A convention for storing large files in a MongoDB database. All
- of the official MongoDB drivers support this convention, as
- does the ``mongofiles`` program.
+ client
+ The application layer that uses a database for data persistence
+ and storage. :term:`Drivers <driver>` provide the interface
+ level between the application layer and the database server.
- .. seealso:: :doc:`/reference/program/mongofiles` and :doc:`/core/gridfs`.
+ cluster
+ See :term:`sharded cluster`.
- md5
- ``md5`` is a hashing algorithm used to efficiently provide
- reproducible unique strings to identify and :term:`checksum`
- data. MongoDB uses md5 to identify chunks of data for
- :term:`GridFS`.
+ collection
+ A grouping of MongoDB :term:`documents <document>`. A collection
+ is the equivalent of an :term:`RDBMS` table. Collections do not
+ enforce a schema. Documents within a collection can have different
+ fields. Typically, all documents in a collection have a similar or
+ related purpose.
- shell helper
- A number of :doc:`database commands ` have "helper"
- methods in the ``mongo`` shell that provide a more concise
- syntax and improve the general interactive experience.
+ A collection exists within a single :term:`database`. The
+ namespace for a collection is flat. See :ref:`faq-dev-namespace`.
- .. seealso:: :doc:`/reference/program/mongo` and
- :doc:`/reference/method`.
+ compound index
+ An :term:`index` consisting of two or more keys. See
+ :ref:`index-type-compound`.
- write-lock
- A lock on the database for a given writer. When a process
- writes to the database, it takes an exclusive write-lock to
- prevent other processes from writing or reading.
+ config database
+ A :program:`mongod` instance that stores all the metadata
+ associated with a :term:`sharded cluster`. A production sharded
+ cluster requires three config servers, each on a separate machine.
+ See :ref:`sharding-config-server`.
- index
- A data structure that optimizes queries. See
- :doc:`/core/indexes` for more information.
+ control script
+ A simple shell script, typically located in the ``/etc/rc.d`` or
+ ``/etc/init.d`` directory, and used by the system's initialization
+ process to start, restart or stop a :term:`daemon` process.
- secondary index
- A database :term:`index` that improves query performance by
- minimizing the amount of work that the query engine must perform
- to fulfill a query.
+ CRUD
+ An acronym for the fundamental operations of a database: Create,
+ Read, Update, and Delete. To perform these operations, see
+ :doc:`/crud`.
- compound index
- An :term:`index` consisting of two or more keys. See
- :doc:`/core/indexes` for more information.
+ CSV
+ A text-based data format consisting of comma-separated values.
+ This format is commonly used to exchange data between relational
+ databases as the format is well-suited to tabular data. You can
+ import CSV files using :program:`mongoimport`.
- btree
- A data structure used by most database management systems
- to store indexes. MongoDB uses b-trees for its indexes.
+ cursor
+ A pointer to the result set of a :term:`query`. Clients can
+ iterate through a cursor to retrieve results. By default, cursors
+ time out after 10 minutes of inactivity. See
+ :ref:`read-operations-cursors`.
- ISODate
- The international date format used by :program:`mongo`
- to display dates. E.g. ``YYYY-MM-DD HH:MM.SS.milis``.
+ daemon
+ The conventional name for a background, non-interactive
+ process.
- journal
- A sequential, binary transaction log used to bring the database into
- a consistent state in the event of a hard shutdown. MongoDB
- enables journaling by default for 64-bit builds of MongoDB
- version 2.0 and newer. Journal files are pre-allocated and will
- exist as three 1GB files in the data directory. To make journal
- files smaller, use :setting:`smallfiles`.
+ data-center awareness
+ A property that allows clients to address members in a system
+ based on their locations. :term:`Replica sets <replica set>`
+ implement data-center awareness using :term:`tagging <tag>`. See
+ :doc:`/data-center-awareness`.
- When enabled, MongoDB writes data first to the journal and then
- to the core data files. MongoDB commits to the journal within
- 100ms, which is configurable using the
- :setting:`journalCommitInterval` runtime option.
+ database
+ A physical container for :term:`collections <collection>`.
+ Each database gets its own set of files on the file
+ system. A single MongoDB server typically serves multiple
+ databases.
- .. include:: /includes/fact-journal-commit-interval-with-gle.rst
+ database command
+ A MongoDB operation, other than an insert, update, remove, or
+ query. MongoDB exposes commands as queries against the special
+ :term:`$cmd` collection. For a list of database commands, see
+ :doc:`/reference/command`.
- .. seealso:: The :doc:`/core/journaling/` page.
+ database profiler
+ A tool that, when enabled, keeps a record of all long-running
+ operations in a database's ``system.profile`` collection. The
+ profiler is most often used to diagnose slow queries. See
+ :ref:`database-profiling`.
- pcap
- A packet capture format used by :program:`mongosniff` to record
- packets captured from network interfaces and display them as
- human-readable MongoDB operations.
+ datum
+ A set of values used to define measurements on the earth. MongoDB
+ uses the :term:`WGS84` datum in certain :term:`geospatial`
+ calculations. See :doc:`/applications/geospatial-indexes`.
- upsert
- A kind of update that either updates the first document matched
- in the provided query selector or, if no document matches,
- inserts a new document having the fields implied by the
- query selector and the update operation.
+ dbpath
+ The location of MongoDB's data file storage. The
+ default :setting:`dbpath` is ``/data/db``. Other common data
+ paths include ``/srv/mongodb`` and ``/var/lib/mongodb``.
+ See :setting:`dbpath`.
- CSV
- A text-based data format consisting of comma-separated values.
- This format is commonly used to exchange database between relational
- databases, since the format is well-suited to tabular data. You can
- import CSV files using :program:`mongoimport`.
+ delayed member
+ A :term:`replica set` member that cannot become primary and
+ applies operations at a specified delay. The delay is useful for
+ protecting data from human error (e.g. unintentionally deleted
+ databases) or updates that have unforeseen effects on the
+ production database. See :ref:`replica-set-delayed-members`.
- TSV
- A text-based data format consisting of tab-separated values.
- This format is commonly used to exchange database between relational
- databases, since the format is well-suited to tabular data. You can
- import TSV files using :program:`mongoimport`.
+ diagnostic log
+ A verbose log of operations stored in the :term:`dbpath` and named
+ :file:`diaglog.