From 2eac86c12b1c53852bd40fc451e2429802ebdf33 Mon Sep 17 00:00:00 2001 From: Sam Kleinman Date: Wed, 18 Jul 2012 20:32:35 -0400 Subject: [PATCH 1/3] DOCS-330: editing and resolving todos raised by technical review --- draft/administration/indexes.txt | 214 +++++++----------- draft/applications/indexes.txt | 139 +++++++----- draft/core/geospatial-indexes.txt | 1 + draft/core/indexes.txt | 75 +++--- .../note-build-indexes-on-replica-sets.rst | 4 + 5 files changed, 216 insertions(+), 217 deletions(-) create mode 100644 source/includes/note-build-indexes-on-replica-sets.rst diff --git a/draft/administration/indexes.txt b/draft/administration/indexes.txt index dc49536795a..8cdb8c32df4 100644 --- a/draft/administration/indexes.txt +++ b/draft/administration/indexes.txt @@ -38,9 +38,7 @@ of the ``people`` collection: .. code-block:: javascript - db.people.ensureIndex( { phone-number: 1 } ) - -TODO: you need ""s around phone-number, otherwise it's invalid JS (phone minus number). + db.people.ensureIndex( { "phone-number": 1 } ) To create a :ref:`compound index `, use an operation that resembles the following prototype: @@ -57,21 +55,18 @@ collection: db.products.ensureIndex( { item: 1, category: 1, price: 1 } ) -.. note:: - - To build indexes for a :term:`replica set`, before version 2.2, - see :ref:`index-building-replica-sets`. +Some drivers may specify indexes using ``NumberLong(1)`` rather than +``1`` as the specification. This does not have any effect on the +resulting index. -TODO: I don't think anything changed about replica set index builds for 2.2... +.. include:: /includes/note-build-indexes-on-replica-sets.rst .. [#ensure] As the name suggests, :func:`ensureIndex() ` only creates an index if an index of the same specification does not already exist. -Sparse -`````` - -TODO: Sparse? Maybe "Types of Indexes->Sparse"? 
+Sparse Indexes +`````````````` To create a :ref:`sparse index ` on a field, use an operation that resembles the following prototype: @@ -91,16 +86,13 @@ without the ``twitter_name`` field. .. note:: - MongoDB cannot create sparse compound indexes. + Sparse indexes can affect the results returned by the query, + particularly with respect to sorts on fields *not* included in the + index. See the :ref:`sparse index ` section for + more information. -TODO: is this true? I thought that it could. - -TODO: Is there more doc on spare indexes somewhere? Seems like this is missing -some info like getting different results back when the index is used, null -counts as existing, etc. - -Unique -`````` +Unique Indexes +`````````````` To create a :ref:`unique index `, consider the following prototype: @@ -109,21 +101,17 @@ following prototype: db.collection.ensureIndex( { a: 1 }, { unique: true } ) -For example, you may want to create a unique index on the ``tax-id:`` +For example, you may want to create a unique index on the ``"tax-id":`` of the ``accounts`` collection to prevent storing multiple account records for the same legal entity: .. code-block:: javascript - db.accounts.ensureIndex( { tax-id: 1 }, { unique: true } ) - -TODO: tax-id should be in ""s. + db.accounts.ensureIndex( { "tax-id": 1 }, { unique: true } ) The :ref:`_id index ` is a unique index. In some -situations you may want to use the ``_id`` field for these primary -data rather than using a unique index on another field. - -TODO: "for these primary data"? +situations you may consider using the ``_id`` field itself for this kind +of data rather than using a unique index on another field. In many situations you will want to combine the ``unique`` constraint with the ``sparse`` option. When MongoDB indexes a field, if a @@ -155,11 +143,9 @@ as in the following example: .. code-block:: javascript - db.accounts.dropIndex( { tax-id: 1 } ) + db.accounts.dropIndex( { "tax-id": 1 } ) -TODO: ""s! 
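The interaction between the ``unique`` constraint and the ``sparse`` option described above can be sketched as a toy model in plain JavaScript. This is purely illustrative: the helper name is invented, and it models only the described behavior, not MongoDB's internals or any driver API.

```javascript
// Toy model of a { unique: true, sparse: true } index: documents that
// omit the indexed field are never added to the index, so any number of
// them may coexist, while duplicate values in indexed documents are
// rejected. Illustrative only -- not MongoDB code.
function makeUniqueSparseIndex(field) {
  const seen = new Set();
  return function insert(doc) {
    if (!(field in doc)) {
      return true; // sparse: document is not indexed, so no conflict
    }
    if (seen.has(doc[field])) {
      return false; // unique: duplicate value rejected
    }
    seen.add(doc[field]);
    return true;
  };
}

const insert = makeUniqueSparseIndex("tax-id");
console.log(insert({ "tax-id": 100 }));     // true
console.log(insert({ "tax-id": 100 }));     // false (duplicate)
console.log(insert({ name: "no tax id" })); // true (field absent, skipped)
console.log(insert({ name: "also none" })); // true (field absent, skipped)
```

Without ``sparse``, a unique index treats a missing field as a single indexed value, so only one such document could exist; the sketch shows why combining the two options avoids that restriction.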
- -This will remove the index on the ``tax-id`` field in the ``accounts`` +This will remove the index on the ``"tax-id"`` field in the ``accounts`` collection. The shell provides the following document after completing the operation: @@ -216,16 +202,7 @@ This shell helper provides a wrapper around the :dbcommand:`reIndex` ` may have a different or additional interface for this operation. -.. note:: - - To rebuild indexes for a :term:`replica set`, before version 2.2, - see :ref:`index-rebuilding-replica-sets`. - -TODO: again, this probably isn't different in 2.2 - -TODO: one thing that I would appreciate you mentioning is that some drivers may -create indexes like {a : NumberLong(1)} _which is fine_ and doesn't break -anything so stop complaining about it. +.. include:: /includes/note-build-indexes-on-replica-sets.rst Special Creation Options ~~~~~~~~~~~~~~~~~~~~~~~~ @@ -235,7 +212,8 @@ Special Creation Options TTL collections use a special ``expire`` index option. See :doc:`/tutorial/expire-data` for more information. -TODO: Are 2d indexes getting a mention? +.. TODO: insert link here to the geospatial index documents when + they're published. Background `````````` @@ -248,26 +226,17 @@ prototype invocation of :func:`db.collection.ensureIndex()`: db.collection.ensureIndex( { a: 1 }, { background: true } ) -TODO: what does it mean to build an index in the background? You might want to -mention: -* performance implications -* that this type of index build can be killed -* that this blocks the connection you sent the ensureindex on, but ops from - other connections can proceed in -* that indexes are created on the foreground on secondaries in 2.0, - which blocks replication & slave reads. In 2.2, it does not block reads (but - still blocks repl). +Consider the section on :ref:`background index construction +` for more information about these indexes +and their implications. 
Drop Duplicates ``````````````` To force the creation of a :ref:`unique index ` -index - -TODO: " on a collection with duplicate values in the field to be indexed " - -you can use the ``dropDups`` option. This will force MongoDB to -create a *unique* index by deleting documents with duplicate values +on a collection with duplicate values in the field you are +indexing, you can use the ``dropDups`` option. This will force MongoDB +to create a *unique* index by deleting documents with duplicate values when building the index. Consider the following prototype invocation of :func:`db.collection.ensureIndex()`: @@ -280,82 +249,65 @@ See the full documentation of :ref:`duplicate dropping .. warning:: - Specifying ``{ dropDups: true }`` will delete data from your + Specifying ``{ dropDups: true }`` may delete data from your database. Use with extreme caution. -TODO: I'd say it "may" delete data from your DB, not like it's going to go all -Shermanesque on your data. - .. _index-building-replica-sets: Building Indexes on Replica Sets -------------------------------- -.. versionchanged:: 2.2 - Index rebuilding operations on :term:`secondary` members of - :term:`replica sets ` now run as normal background - index operations. Run :func:`ensureIndex() - ` normally with the ``{ background: - true }`` option for replica sets. Alternatively, you may always use - the following operation to isolate and control the impact of - indexing building operations on a set as a whole. -TODO: I think there needs to be a huge mention that this still blocks -replication, so the procedure below is recommended. .. admonition:: For Version 1.8 and 2.0 - - :ref:`Background index creation operations - ` became *foreground* indexing - operations on :term:`secondary` members of replica sets. These - foreground operations will block all replication on the - secondaries, -TODO: and don't allow any reads to go through. 
+:ref:`Background index creation operations +` became *foreground* indexing operations +on :term:`secondary` members of replica sets. These foreground +operations will block all replication on the secondaries, and don't +allow any reads. As a result in most cases use the following procedure +to build indexes on secondaries. - and can impact performance of the entire set. To build - indexes with minimal impact on a replica set, use the following - procedure for all non-trivial index builds: +Procedure +~~~~~~~~~ - #. Stop the :program:`mongod` process on one secondary. Restart the - :program:`mongod` process *without* the :option:`--replSet ` - option. This instance is now in "standalone" mode. +#. Stop the :program:`mongod` process on one secondary. Restart the + :program:`mongod` process *without* the :option:`--replSet ` + option and running on a different port. [#different-port]_ This + instance is now in "standalone" mode. -TODO: generally we recommend running it on a different port, too, so that apps -& other servers in the set don't try to contact it. +#. Create the new index or rebuild the index on this :program:`mongod` + instance. - #. Create the new index or rebuild the index on this :program:`mongod` - instance. +#. Restart the :program:`mongod` instance with the + :option:`--replSet ` option. Allow replication + to catch up on this member. - #. Restart the :program:`mongod` instance with the - :option:`--replSet ` option. Allow replication - to catch up on this member. +#. Replete this operation on all of the remaining secondaries. - #. Replete this operation on all of the remaining secondaries. +#. Run :func:`rs.stepDown()` on the :term:`primary` member of the + set, and then repeat this procedure on the former primary. - #. Run :func:`rs.stepDown()` on the :term:`primary` member of the - set, and then run this procedure on the former primary. - - .. warning:: +.. 
warning:: - Ensure that your :ref:`oplog` is large enough to permit the - indexing or re-indexing operation to complete without falling - too far behind to catch up. See the ":ref:`replica-set-oplog-sizing`" - documentation for additional information. + Ensure that your :ref:`oplog` is large enough to permit the + indexing or re-indexing operation to complete without falling + too far behind to catch up. See the ":ref:`replica-set-oplog-sizing`" + documentation for additional information. - .. note:: +.. note:: - This procedure *does* block indexing on one member of the - replica set at a time. However, the foreground indexing - operation is more efficient than the background index operation, - and will only affect one secondary at a time rather than *all* - secondaries at the same time. + This procedure *does* block indexing on one member of the + replica set at a time. However, the foreground indexing + operation is more efficient than the background index operation, + and will only affect one secondary at a time rather than *all* + secondaries at the same time. -For the best results, always create indexes *before* you begin -inserting data into a collection. +.. [#different-port] By running the :program:`mongod` on a different + port, you ensure that the other members of the replica set and all + clients will not contact the member while you are building the + index. -TODO: well, sort of. That'll build the indexes fast, but make the inserts -slower. Overall, it's faster to insert data, then build indexes. +.. _indexes-measuring-use: Measuring Index Use ------------------- @@ -374,12 +326,7 @@ following tools: - :func:`cursor.hint()` Append the :func:`hint() ` to any cursor (e.g. - query) with the name - -TODO: this isn't "the name of an index." I'd say just "with the index." The -name of an index is a string like "zipcode_1". 
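The distinction drawn above between an index *specification* (a document such as ``{ zipcode: 1 }``) and an index *name* (a string such as ``"zipcode_1"``) can be made concrete with a small sketch of how the default name is derived from the specification. The helper below is invented for illustration and is not part of the shell or any driver.

```javascript
// Sketch: MongoDB's default index name joins each field with its sort
// direction, e.g. { zipcode: 1 } -> "zipcode_1" and
// { a: 1, b: -1 } -> "a_1_b_-1". hint() accepts the specification
// document itself; this helper only shows where names like "zipcode_1"
// come from.
function defaultIndexName(specification) {
  return Object.keys(specification)
    .map(function (field) {
      return field + "_" + specification[field];
    })
    .join("_");
}

console.log(defaultIndexName({ zipcode: 1 }));           // "zipcode_1"
console.log(defaultIndexName({ item: 1, category: 1 })); // "item_1_category_1"
```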
- - of an index as the argument to *force* MongoDB + query) with the index as the argument to *force* MongoDB to use a specific index to fulfill the query. Consider the following example: @@ -387,12 +334,15 @@ name of an index is a string like "zipcode_1". db.people.find( { name: "John Doe", zipcode: { $gt: 63000 } } ).hint( { zipcode: 1 } ) - You can use :func:`hint() ` and :func:`explain() ` in conjunction with each other to compare the - effectiveness of a specific index. + effectiveness of a specific index. Specify the ``$natural`` operator + to the :func:`hint() ` method to prevent MongoDB from + using *any* index: + + .. code-block:: javascript -TODO: mention $natural to force no index usage? + db.people.find( { name: "John Doe", zipcode: { $gt: 63000 } } ).hint( { $natural: 1 } ) - :status:`indexCounters` @@ -400,5 +350,17 @@ TODO: mention $natural to force no index usage? :dbcommand:`serverStatus` for insight into database-wide index utilization. -TODO: I'd like to see this also cover how to track how far an index build has -gotten and how to kill an index build. +Monitoring and Controlling Index Building +----------------------------------------- + +.. TODO insert links to the values in the inprog array following the + completion of DOCS-162 + +To see the status of the indexing processes, you can use the +:func:`db.currentOP()` method in the :program:`mongo` shell. The value +of the ``query`` field and the ``msg`` field will indicate if the +operation is an index build. The ``msg`` field also indicates the +percent of the build that is complete. + +If you need to terminate an ongoing index build, you can use the +:func:`db.killOp()` method in the :program:`mongo` shell. diff --git a/draft/applications/indexes.txt b/draft/applications/indexes.txt index f7ca3c60165..108660782c0 100644 --- a/draft/applications/indexes.txt +++ b/draft/applications/indexes.txt @@ -21,6 +21,9 @@ applications with MongoDB. Strategies ---------- +.. _covered-queries: +.. 
_indexes-covered-queries: + Use Covered Queries ~~~~~~~~~~~~~~~~~~~ @@ -30,16 +33,15 @@ database. To use a covered index you must: - ensure that the index includes all of the fields in the result. + This means that the :term:`projection`, must explicitly exclude the + ``_id`` field from the result set, unless the index includes + ``_id``. + - if any of the indexed fields in any of the documents in the collection includes an array, then the index becomes a :ref:`multi-key index ` index, and cannot support a covered query. -- in the :term:`projection`, explicitly exclude the ``_id`` field from - the result set, unless the index includes ``_id``. - -TODO: the third point seems like part of the first point. - Use the :func:`explain() ` to test the query. If MongoDB was able to use a covered index, then the value of the ``indexOnly`` field will be ``true``. @@ -51,14 +53,8 @@ disk, and indexes are smaller than the documents they catalog. Sort Using Indexes ~~~~~~~~~~~~~~~~~~ -While the :dbcommand:`sort` database command - -TODO: sort database command? Is "database command" being used in a different -sense here? - - and the :func:`sort() -` helper support in-memory sort operations without the -use of an index, these operations are: +While the :func:`sort() ` method supports in-memory +sort operations without the use of an index, these operations are: #. Significantly slower than sort operations that use an index. @@ -84,8 +80,10 @@ results. For example: When using compound indexes to support sort operations, the sorted field must be the *last* field in the index. -TODO: not true! In 2.2, you can use, say, the index above for a query on -username, sort by status, too. +.. TODO: not true! In 2.2, you can use, say, the index above for a query on + username, sort by status, too. + +.. is this not true in other version? what changed? Store Indexes in Memory ~~~~~~~~~~~~~~~~~~~~~~~ @@ -132,9 +130,9 @@ deep understanding of: - which indexes the most common queries use. 
-MongoDB can only use *one* index to support any given operation. - -TODO: trickily put. I hope you menion $or elsewhere? +MongoDB can only use *one* index to support any given +operation. However, each clause of an :operator:`$or` query can use +its own index. Selectivity ~~~~~~~~~~~ @@ -142,48 +140,76 @@ Selectivity Selectivity describes the ability of a query to narrow the result set using the index. Effective indexes are more selective and allow MongoDB to use the index for a larger portion of the work associated -with fulfilling the query. +with fulfilling the query. There are two aspects of selectivity: -.. example:: +#. Data need to have a high distribution of the values for the indexed + key. - First, consider an index on a field that has three values evenly - distributed across the collection. If MongoDB uses this index for a - query, MongoDB will still need to scan a third of the - :term:`documents ` in the collection to fulfill the rest - of the query. +#. Queries need to limit the number of possible documents using the + indexed field. - Then, consider an index on a field that has many values evenly - distributed across the collection. If your query selects one of - these values using the index, MongoDB will only need to scan a very - small number of documents to fulfill the rest of the query. +.. example:: -TODO: It'd be clearer to use "real" numbers in the second example, too, but I -think you'd have to re-jigger the example to do so. + First, consider an index, ``{ a : 1 }``, on a collection where + ``a`` has three values evenly distributed across the collection: -To ensure optimal performance, use indexes that are maximally -selective relative to your queries. + .. code-block:: javascript -TODO: the example makes selectivity sound like the uniqueness of the index, -which isn't the whole story. Having something like {x:{$gt:3}} that matches 60% -of the collection isn't very selective, even if x has a unique index on it. 
+ { _id: ObjectId(), a: 1, b: "ab" } + { _id: ObjectId(), a: 1, b: "cd" } + { _id: ObjectId(), a: 1, b: "ef" } + { _id: ObjectId(), a: 2, b: "jk" } + { _id: ObjectId(), a: 2, b: "lm" } + { _id: ObjectId(), a: 2, b: "no" } + { _id: ObjectId(), a: 3, b: "pq" } + { _id: ObjectId(), a: 3, b: "rs" } + { _id: ObjectId(), a: 3, b: "tv" } -I think it's important to emphasize that selectivity is whittling down possible -results to as small a % as possible. + If you do a query for ``{ a: 2, b: "no" }`` MongoDB will still need + to scan 3 documents of the :term:`documents ` in the + collection to fulfill the query. Similarly, a query for ``{ a: { + $gt: 1}, b: "tv" }``, would need to scan through 6 documents, + although both queries would return the same result. -TODO: Also, might be worth mentioning that, if you cannot get selectivity low -enough, indexes will actually be slower than table scans. + Then, consider an index on a field that has many values evenly + distributed across the collection: + + .. code-block:: javascript + + { _id: ObjectId(), a: 1, b: "ab" } + { _id: ObjectId(), a: 2, b: "cd" } + { _id: ObjectId(), a: 3, b: "ef" } + { _id: ObjectId(), a: 4, b: "jk" } + { _id: ObjectId(), a: 5, b: "lm" } + { _id: ObjectId(), a: 6, b: "no" } + { _id: ObjectId(), a: 7, b: "pq" } + { _id: ObjectId(), a: 8, b: "rs" } + { _id: ObjectId(), a: 9, b: "tv" } + + Although the index on ``a`` is more selective, in the sense that + queries can use the index more effectively, a query such as ``{ a: + { $gt: 5 }, b: "tv" }`` would still need to scan 4 documents. By + contrast, given a query like ``{ a: 2, b: "cd" }``, MongoDB would + only need to scan one document to fulfill the rest of the + query. The index and query are more selective because the values of + ``a`` are evenly distributed *and* the query can selects a specific + document using the index. + +To ensure optimal performance, use indexes that are maximally +selective relative to your queries. 
At the same time queries need to +be appropriately selective relative to your indexed data. If overall +selectivity is low enough, and MongoDB must read a number of documents +to return results, then some queries may perform faster without +indexes. See the :ref:`indexes-measuring-use` section for more +information on measuring index use. Insert Throughput ~~~~~~~~~~~~~~~~~ .. TODO insert link to /source/core/write-operations when that page is complete. -.. TODO fact check - -MongoDB must update all indexes associated with a collection following -every insert or update operation. - -TODO: or delete, too +MongoDB must update all indexes associated with a collection after +every insert, update, or delete operation. Every index on a collection adds some amount of overhead to these operations. In almost every case, the @@ -191,9 +217,7 @@ performance gains that indexes realize for read operations are worth the insertion penalty; however: - in some cases, an index to support an infrequent query may incur - more insert-related costs, than saved read-time. - -TODO: rm comma: "insert-related costs than saved read-time" + more insert-related costs than saved read-time. - in some situations, if you have many indexes on a collection with a high insert throughput and a number of very similar indexes, you may @@ -201,7 +225,9 @@ TODO: rm comma: "insert-related costs than saved read-time" on some queries if it means consolidating the total number of indexes. -TODO: do you cover what indexes overlap? +.. TODO: do you cover what indexes overlap? + +.. no. I'm not sure the case to which you're referring. Index Size ~~~~~~~~~~ your queries only match a subset of the documents and can use the index to locate those documents, MongoDB can maintain a much smaller :term:`working set`. Ensure that: -- all of your indexes use less space than the documents in the - collection. - -TODO: individually or all together? 
- -- the indexes and a reasonable working set can fit RAM at the same - time. +- the indexes and the working set can fit RAM at the same time. -TODO: a reasonable working set? +- all of your indexes use less space than all of the documents in the + collection. This may not be an issue if all of your queries use + :ref:`covered queries ` or if indexes do not need to + fit into RAM, as in the following situation: .. _indexing-right-handed: diff --git a/draft/core/geospatial-indexes.txt b/draft/core/geospatial-indexes.txt index 824e97ce37f..cfcbf6cbfc9 100644 --- a/draft/core/geospatial-indexes.txt +++ b/draft/core/geospatial-indexes.txt @@ -189,6 +189,7 @@ or latitude. Create this index using following command: db.places.ensureIndex({ loc: "geoHaystack", type: 1} , { bucketSize: 2 } ) + .. TODO clarify what the type argument does or if it's just always required. diff --git a/draft/core/indexes.txt b/draft/core/indexes.txt index 46d017282ea..f8ad7d1d520 100644 --- a/draft/core/indexes.txt +++ b/draft/core/indexes.txt @@ -270,6 +270,7 @@ index to locate the document: db.feedback.find( { "comments.text": "Please expand the olive selection." } ) +.. include:: /includes/note-build-indexes-on-replica-sets.rst .. warning:: @@ -394,7 +395,8 @@ construction: operations can run while creating the index. However, the :program:`mongo` shell session or connection where you are creating the index will block until the index build is complete. Open another - connection or :program:`mongo` instance to continue using the database. + connection or :program:`mongo` instance to continue issuing commands + to the database. - The background index operation use an incremental approach that is slower than the normal "foreground" index builds. If the index is @@ -403,21 +405,25 @@ construction: .. admonition:: Building Indexes on Secondaries - .. versionchanged:: 2.1.0 - Before 2.1.0, :term:`replica sets ` cannot build - indexes in the background on :term:`secondaries `. 
+ Background index operations on a :term:`replica set` + :term:`primary`, become foreground indexing operations on secondary + members of the set. All indexing operations on secondaries block + replication. - To rebuild large indexes on secondaries before version 2.1.0, - typically the best approach is to restart each secondary in - "standalone" mode and build the index. When the index is rebuilt, - restart as a member of the replica set, allow it to catch up with - the other members of the set, and then rebuild the index on the - next secondary. When all the secondaries have the new index, step - down the primary and build the index on the former primary. + To rebuild large indexes on secondaries the best approach is to + restart one secondary at a time in "standalone" mode and build the + index. When the index is rebuilt, restart as a member of the + replica set, allow it to catch up with the other members of the + set, and then rebuild the index on the next secondary. When all the + secondaries have the new index, step down the primary, restart it + as a standalone, and build the index on the former primary. Remember, the amount of time required to build the index on a secondary node must be within the window of the :term:`oplog`, so - that the secondary can catch up. + that the secondary can catch up with the primary. + + See :ref:`index-building-replica-sets` for more information on + this process. Indexes on secondary members in "recovering" mode are always built in the foreground to allow them to catch up as soon as possible. @@ -503,6 +509,8 @@ indexes to fulfill arbitrary queries. .. see:: :doc:`/tutorial/expire-data` +.. _index-feature-geospatial: + Geospatial Indexes ~~~~~~~~~~~~~~~~~~ @@ -515,31 +523,13 @@ are "near" a given coordinate pair. To create a geospatial index, your :term:`documents ` must have a coordinate pair. 
For maximum compatibility, these coordinate pairs should be in the form of a two element array, such as ``[ x , y -]``, but other representations are acceptable, including: -.. code-block:: javascript - - { loc : [ 50 , 30 ] } - { loc : { x : 50 , y : 30 } } - { loc : { foo : 50 , y : 30 } } - { loc : { lon : 40.739037, lat: 73.992964 } } - -Given the field of ``loc`` in the collection ``places``, you would -create a geospatial index as follows: +]``. Given the field ``loc``, which holds a coordinate pair, in the +collection ``places``, you would create a geospatial index as follows: .. code-block:: javascript db.places.ensureIndex( { loc : "2d" } ) -By default, ``2d`` indexes assume that the coordinates are -latitude/longitude systems, and assume that minimum and maximum bounds -are ``[ -180, 180 ]``. You can specify a different minimum and maximum -values, as follows: -.. code-block:: javascript - - db.places.ensureIndex( { loc : "2d" }, { min: -250 , max: 250 } ) - MongoDB will reject documents that have values in the ``loc`` field beyond the minimum and maximum values. @@ -556,7 +546,26 @@ data. .. TODO insert link to special /core/geospatial.txt documentation on this topic. once that document exists. -.. TODO short mention of geoHaystack indexes here? +Geohaystack Indexes +~~~~~~~~~~~~~~~~~~~ + +.. TODO update links in the following section as needed: + +In addition to conventional :ref:`geospatial indexes +`, MongoDB also provides a bucket-based +geospatial index, called "geospatial haystack indexes." These indexes +support high performance queries for locations within a small area, +when the query must filter along another dimension. + +.. example:: + + If you need to return all documents that have coordinates within 25 + miles of a given point *and* have a type field value of "museum," a + haystack index would provide the best support for these queries. 
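The bucketing idea behind haystack indexes can be sketched in plain JavaScript. This is a deliberate simplification for illustration, not MongoDB's storage format or hashing scheme; the helper name is invented.

```javascript
// Toy sketch of the haystack idea: coordinates are grouped into
// fixed-size buckets, so that a "near this point, with this type" query
// only scans the handful of buckets around the point instead of the
// whole collection. bucketSize mirrors the bucketSize option shown
// elsewhere in the document; the key format here is made up.
function bucketKey(x, y, bucketSize) {
  return Math.floor(x / bucketSize) + ":" + Math.floor(y / bucketSize);
}

console.log(bucketKey(50.2, 30.7, 2)); // "25:15"
console.log(bucketKey(51.9, 31.1, 2)); // "25:15" -- same bucket, scanned together
console.log(bucketKey(60.0, 30.7, 2)); // "30:15" -- different bucket, skipped
```

Tuning the bucket size to the data's distribution is what lets a haystack query confine its work to a small region of 2d space.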
+Haystack indexes allow you to tune your bucket size to the +distribution of your data, so that in general you search only very +small regions of 2d space for a particular kind of document. Index Limitations ----------------- diff --git a/source/includes/note-build-indexes-on-replica-sets.rst b/source/includes/note-build-indexes-on-replica-sets.rst new file mode 100644 index 00000000000..9dc493807df --- /dev/null +++ b/source/includes/note-build-indexes-on-replica-sets.rst @@ -0,0 +1,4 @@ +.. note:: + + To rebuild indexes for a :term:`replica set` see + :ref:`index-rebuilding-replica-sets`. From 2e6a1b8368686074ff5377756239be3c6c35159e Mon Sep 17 00:00:00 2001 From: Sam Kleinman Date: Fri, 20 Jul 2012 14:59:23 -0400 Subject: [PATCH 2/3] DOCS-330 adding examples based on feedback from astaple --- draft/applications/indexes.txt | 109 ++++++++++++++++++++++++++++++--- 1 file changed, 99 insertions(+), 10 deletions(-) diff --git a/draft/applications/indexes.txt b/draft/applications/indexes.txt index 108660782c0..e2f9da43521 100644 --- a/draft/applications/indexes.txt +++ b/draft/applications/indexes.txt @@ -50,6 +50,9 @@ Covered queries are much faster than other queries, for two reasons: indexes are typically stored in RAM *or* located sequentially on disk, and indexes are smaller than the documents they catalog. +.. _index-sort: +.. _sorting-with-indexes: + Sort Using Indexes ~~~~~~~~~~~~~~~~~~ @@ -73,17 +76,55 @@ results. For example: on ":ref:`Ascending and Descending Index Order `." -- MongoDB can use a compound index ``{ status: 1, username: 1 }`` to - return a query on the ``status`` field sorted by the ``username`` - field. +- In general, MongoDB can use a compound index to return sorted + results *if*: + + - the first sorted field is the first field in the index. + + - the last field in the index before the first sorted field is an + equality match in the query. + + Consider the example presented below for an illustration of this + concept. +.. 
example:: + + Given the following index: + + .. code-block:: javascript + + { a: 1, b: 1, c: 1, d: 1 } + + The following query and sort operations will be able to use the + index: + + .. code-block:: javascript + + db.collection.find().sort( { a:1 } ) + db.collection.find().sort( { a:1, b:1 } ) + + db.collection.find( { a:4 } ).sort( { a:1, b:1 } ) + db.collection.find( { b:5 } ).sort( { a:1, b:1 } ) + + db.collection.find( { a:{ $gt:4 } } ).sort( { a:1, b:1 } ) + db.collection.find( { b:{ $gt:5 } } ).sort( { a:1, b:1 } ) + + db.collection.find( { a:5 } ).sort( { a:1, b:1 } ) + db.collection.find( { a:5 } ).sort( { b:1, c:1 } ) + + db.collection.find( { a:5, c:4, b:3 } ).sort( { d:1 } ) -When using compound indexes to support sort operations, the sorted -field must be the *last* field in the index. + db.collection.find( { a:5, b:3, d:{ $gt:4 } } ).sort( { c:1 } ) + db.collection.find( { a:5, b:3, c:{ $lt:2 }, d:{ $gt:4 } } ).sort( { c:1 } ) -.. TODO: not true! In 2.2, you can use, say, the index above for a query on - username, sort by status, too. + However, the following query operations would not be able to sort + data using the index: -.. is this not true in other version? what changed? + .. code-block:: javascript + + db.collection.find().sort( { b:1 } ) + db.collection.find( { b:5 } ).sort( { b:1 } ) + db.collection.find( { b:{ $gt:5 } } ).sort( { a:1, b:1 } ) Store Indexes in Memory ~~~~~~~~~~~~~~~~~~~~~~~ @@ -134,6 +175,8 @@ MongoDB can only use *one* index to support any given operation. However, each clause of an :operator:`$or` query can use its own index. +.. _index-selectivity: + Selectivity ~~~~~~~~~~~ @@ -225,9 +268,55 @@ the insertion penalty; however: on some queries if it means consolidating the total number of indexes. -.. TODO: do you cover what indexes overlap? +- If your indexes and queries are not very selective, the speed + improvements for query operations may not offset the costs of + maintaining an index. 
See the section on :ref:`index selectivity + ` for more information. + +- In some cases a single compound index on two or more fields may + support all of the queries that an index on a single field, or a + smaller compound index, supports. In general, MongoDB can use a compound index + to support the same queries as any of its prefixes. Consider the + following example: + + .. example:: + + The following index on a collection: + + .. code-block:: javascript + + { x: 1, y: 1, z: 1 } + + can support a number of queries, as well as most of the queries + that the following indexes support: + + .. code-block:: javascript + + { x: 1 } + { x: 1, y: 1 } + + There are some situations where the prefix indexes may offer + better query performance, as is the case if ``z`` is a large + array. Also, consider the following index on the same collection: + + .. code-block:: javascript + + { x: 1, z: 1 } + + The ``{ x: 1, y: 1, z: 1 }`` index can support many of the same + queries as the above index; however, ``{ x: 1, z: 1 }`` has an + additional use. Given the following query: + + .. code-block:: javascript + + db.collection.find( { x: 5 } ).sort( { z: 1 } ) + + The ``{ x: 1, z: 1 }`` index will support both the query and the sort + operation, while the ``{ x: 1, y: 1, z: 1 }`` index can only + support the query. See the :ref:`sorting-with-indexes` section for more + information. 
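The prefix rule discussed above can be sketched as a small predicate in plain JavaScript. It checks only the "queried fields form a prefix of the index" rule; MongoDB's real query planner considers far more, and the helper name and field ordering are illustrative assumptions.

```javascript
// A compound index can support an equality query whose fields form a
// prefix of the index's field list. For simplicity this toy predicate
// assumes the query's fields are given in index order; in MongoDB the
// order of fields within a query document does not matter.
function queryUsesIndexPrefix(indexFields, queryFields) {
  return queryFields.every(function (field, i) {
    return indexFields[i] === field;
  });
}

var index = ["x", "y", "z"];
console.log(queryUsesIndexPrefix(index, ["x"]));           // true
console.log(queryUsesIndexPrefix(index, ["x", "y"]));      // true
console.log(queryUsesIndexPrefix(index, ["x", "y", "z"])); // true
console.log(queryUsesIndexPrefix(index, ["y", "z"]));      // false -- not a prefix
```

This is why ``{ x: 1, y: 1, z: 1 }`` can stand in for ``{ x: 1 }`` and ``{ x: 1, y: 1 }``, but not for ``{ x: 1, z: 1 }`` when a sort on ``z`` follows an equality match on ``x`` alone.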
Index Size
~~~~~~~~~~

From 1dd4f9a4b68ece6a2026e063a86db831e386cd8f Mon Sep 17 00:00:00 2001
From: Sam Kleinman
Date: Fri, 20 Jul 2012 16:03:06 -0400
Subject: [PATCH 3/3] DOCS-330 final round of comments on the indexing documents

---
 draft/administration/indexes.txt                   | 14 ++++++--------
 draft/applications/indexes.txt                     |  2 +-
 draft/core/indexes.txt                             | 13 ++++++++-----
 .../note-build-indexes-on-replica-sets.rst         |  2 +-
 4 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/draft/administration/indexes.txt b/draft/administration/indexes.txt
index 8cdb8c32df4..cfa9164733a 100644
--- a/draft/administration/indexes.txt
+++ b/draft/administration/indexes.txt
@@ -261,7 +261,7 @@ Consideration
 ~~~~~~~~~~~~~

 :ref:`Background index creation operations
-` became *foreground* indexing operations
+` become *foreground* indexing operations
 on :term:`secondary` members of replica sets. These foreground
 operations will block all replication on the secondaries, and don't
 allow any reads. As a result, in most cases, use the following procedure
@@ -282,7 +282,7 @@ Procedure

    :option:`--replSet ` option. Allow replication to catch up
    on this member.

-#. Replete this operation on all of the remaining secondaries.
+#. Repeat this operation on all of the remaining secondaries.

 #. Run :func:`rs.stepDown()` on the :term:`primary` member of the
    set, and then repeat this procedure on the former primary.

@@ -296,11 +296,9 @@ Procedure

 .. note::

-   This procedure *does* block indexing on one member of the
-   replica set at a time. However, the foreground indexing
-   operation is more efficient than the background index operation,
-   and will only affect one secondary at a time rather than *all*
-   secondaries at the same time.
+   This procedure *does* take one member out of the replica set at a
+   time. However, it affects only one member of the set at a time
+   rather than *all* secondaries at the same time.

..
[#different-port] By running the :program:`mongod` on a different port, you ensure that the other members of the replica set and all @@ -357,7 +355,7 @@ Monitoring and Controlling Index Building completion of DOCS-162 To see the status of the indexing processes, you can use the -:func:`db.currentOP()` method in the :program:`mongo` shell. The value +:func:`db.currentOp()` method in the :program:`mongo` shell. The value of the ``query`` field and the ``msg`` field will indicate if the operation is an index build. The ``msg`` field also indicates the percent of the build that is complete. diff --git a/draft/applications/indexes.txt b/draft/applications/indexes.txt index e2f9da43521..a999343a505 100644 --- a/draft/applications/indexes.txt +++ b/draft/applications/indexes.txt @@ -118,7 +118,7 @@ results. For example: db.collection.find( { a:5, b:3, c:{ $lt:2 }, d:{ $gt:4 } } ).sort( { c:1 } ) However, the following query operations would not be able to sort - data using the index: + the results using the index: .. code-block:: javascript diff --git a/draft/core/indexes.txt b/draft/core/indexes.txt index f8ad7d1d520..a7269892ab1 100644 --- a/draft/core/indexes.txt +++ b/draft/core/indexes.txt @@ -410,11 +410,11 @@ construction: members of the set. All indexing operations on secondaries block replication. - To rebuild large indexes on secondaries the best approach is to + To build large indexes on secondaries the best approach is to restart one secondary at a time in "standalone" mode and build the - index. When the index is rebuilt, restart as a member of the + index. After building the index, restart as a member of the replica set, allow it to catch up with the other members of the - set, and then rebuild the index on the next secondary. When all the + set, and then build the index on the next secondary. When all the secondaries have the new index, step down the primary, restart it as a standalone, and build the index on the former primary. 
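The ``db.currentOp()`` monitoring described above can be sketched concretely. The ``inprog`` document below is hand-written sample data shaped like shell output, not real server output; in the :program:`mongo` shell the same filter would be applied to ``db.currentOp().inprog`` directly.

```javascript
// Hand-written sample shaped like db.currentOp() output; NOT real
// server output. Index builds report their progress in "msg".
const sample = {
  inprog: [
    { opid: 11, op: "query",  ns: "test.records", msg: "" },
    { opid: 12, op: "insert", ns: "test.system.indexes",
      msg: "index: (2/3) btree bottom up 521288/1024321 50%" },
  ],
};

// Keep only the operations whose msg field marks an index build.
const indexBuilds = sample.inprog.filter(op => /index/.test(op.msg || ""));
console.log(indexBuilds.map(op => op.opid)); // [ 12 ]
```

The ``msg`` field carries the percent-complete figure mentioned above, so the same filter is a natural starting point for scripted monitoring.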
@@ -563,9 +563,12 @@ when the query must filter along another dimension.

 miles of a given point *and* have a type field value of "museum," a
 haystack index would provide the best support for these queries.

-Haystack indices allow you to tune your bucket size to the
+Haystack indexes allow you to tune your bucket size to the
 distribution of your data, so that in general you search only very
-small regions of 2d space for a particular kind of document.
+small regions of 2d space for a particular kind of document. These
+indexes are not suited for finding the closest documents to a
+particular location when the closest documents are far away compared
+to the bucket size.

 Index Limitations
 -----------------

diff --git a/source/includes/note-build-indexes-on-replica-sets.rst b/source/includes/note-build-indexes-on-replica-sets.rst
index 9dc493807df..b7bc5f8431c 100644
--- a/source/includes/note-build-indexes-on-replica-sets.rst
+++ b/source/includes/note-build-indexes-on-replica-sets.rst
@@ -1,4 +1,4 @@
 .. note::

-   To rebuild indexes for a :term:`replica set` see
+   To build or rebuild indexes for a :term:`replica set` see
    :ref:`index-rebuilding-replica-sets`.