Commit e6329c8

Updated index optimization (#2024)
1 parent 5e29f51 commit e6329c8

1 file changed: +66 -9 lines changed

source/core/aggregation-pipeline-optimization.txt

Lines changed: 66 additions & 9 deletions
@@ -384,17 +384,74 @@ option, the ``explain`` output shows the coalesced stage:
384384
}
385385

386386
Indexes
387-
-------
387+
~~~~~~~
388+
389+
An aggregation pipeline can use :ref:`indexes <indexes>` from the input
390+
collection to improve performance. Using an index limits the number of
391+
documents a stage processes. Ideally, an index can :ref:`cover
392+
<read-operations-covered-query>` the stage query. A covered query has
393+
especially high performance, since the index returns all matching
394+
documents.
395+
396+
For example, a pipeline that consists of :pipeline:`$match`,
397+
:pipeline:`$sort`, and :pipeline:`$group` can benefit from indexes at
398+
every stage:
399+
400+
- An index on the :pipeline:`$match` query field can efficiently
401+
identify the relevant data
388402

389-
Starting in MongoDB 4.2, in some cases, an aggregation pipeline can use
390-
a ``DISTINCT_SCAN`` index plan that returns one document per index key
391-
value.
403+
- An index on the sorting field can return data in sorted order for the
404+
:pipeline:`$sort` stage
405+
406+
- An index on the grouping field that matches the :pipeline:`$sort`
407+
order can return all of the field values needed to execute the
408+
:pipeline:`$group` stage (a covered query)
409+
410+
To determine whether a pipeline uses indexes, review the query plan and
411+
look for ``IXSCAN`` or ``DISTINCT_SCAN`` plans.
392412

393413
.. note::
394-
``DISTINCT_SCAN`` executes faster than ``IXSCAN`` if multiple
395-
documents per index value exist. However, index scan parameters
396-
might affect the time comparison of ``DISTINCT_SCAN`` and
397-
``IXSCAN``.
414+
In some cases, the query planner uses a ``DISTINCT_SCAN`` index plan
415+
that returns one document per index key value. ``DISTINCT_SCAN``
416+
executes faster than ``IXSCAN`` if there are multiple documents per
417+
key value. However, index scan parameters might affect the time
418+
comparison of ``DISTINCT_SCAN`` and ``IXSCAN``.
419+
420+
For early stages in your aggregation pipeline, consider indexing the
421+
query fields. Stages that can benefit from indexes are:
422+
423+
``$match`` stage
424+
:pipeline:`$match` can use an index to filter documents if it is the
425+
first stage in the pipeline, after any optimizations from the
426+
:ref:`query planner <query-plans-query-optimization>`.
427+
428+
``$sort`` stage
429+
:pipeline:`$sort` can benefit from an index as long as it is not
430+
preceded by a :pipeline:`$project`, :pipeline:`$unwind`, or
431+
:pipeline:`$group` stage.
432+
433+
``$group`` stage
434+
:pipeline:`$group` can use an index to find the first document in
435+
each group if it meets all of the following conditions:
436+
437+
- a :pipeline:`$sort` stage sorts the grouping field before
438+
:pipeline:`$group`
439+
440+
- an index exists that matches the sort order on the grouped field
441+
442+
- :group:`$first` is the only accumulator in the :pipeline:`$group`
443+
stage
444+
445+
See :ref:`$group Performance Optimizations <group-pipeline-optimization>`
446+
for an example.
447+
448+
``$geoNear`` stage
449+
:pipeline:`$geoNear` always uses an index, since it must be the first
450+
stage in a pipeline and requires a :ref:`geospatial index <index-feature-geospatial>`.
451+
452+
Additionally, stages later in the pipeline that retrieve data from
453+
other, unmodified collections can use indexes on those collections
454+
for optimization. These stages include:
398455

399456
Indexes can :ref:`cover <read-operations-covered-query>` queries in an
400457
aggregation pipeline. A covered query uses an index to return all of the
@@ -438,4 +495,4 @@ MongoDB increases the :pipeline:`$limit` amount with the reordering.
438495
.. seealso::
439496

440497
:method:`explain <db.collection.aggregate()>` option in the
441-
:method:`db.collection.aggregate()`
498+
:method:`db.collection.aggregate()`
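
The added text advises reviewing the query plan for ``IXSCAN`` or ``DISTINCT_SCAN`` to tell whether a pipeline uses an index. A minimal sketch of that check in plain JavaScript might look like the following. The helper names ``collectStages`` and ``usesIndex`` are hypothetical, and ``samplePlan`` is a hand-written, heavily simplified stand-in for real explain output, whose exact shape varies by server version and topology.

```javascript
// Hypothetical helper (not part of any MongoDB API): walk an explain
// result and collect every string found under a "stage" key.
// Subtrees under "rejectedPlans" are skipped so that only plans the
// server actually chose are considered.
function collectStages(node, found = new Set()) {
  if (Array.isArray(node)) {
    for (const item of node) collectStages(item, found);
  } else if (node !== null && typeof node === "object") {
    if (typeof node.stage === "string") found.add(node.stage);
    for (const [key, value] of Object.entries(node)) {
      if (key !== "rejectedPlans") collectStages(value, found);
    }
  }
  return found;
}

// True when the chosen plan contains an index-backed scan.
function usesIndex(explainOutput) {
  const stages = collectStages(explainOutput);
  return stages.has("IXSCAN") || stages.has("DISTINCT_SCAN");
}

// Assumed, simplified explain output for illustration only.
const samplePlan = {
  stages: [
    {
      $cursor: {
        queryPlanner: {
          winningPlan: {
            stage: "FETCH",
            inputStage: { stage: "IXSCAN", indexName: "status_1" }
          },
          rejectedPlans: []
        }
      }
    },
    { $group: { _id: "$status", total: { $sum: "$amount" } } }
  ]
};

console.log(usesIndex(samplePlan));
```

In the shell, the object passed to ``usesIndex`` would come from running the pipeline with the ``explain`` option, for example ``db.orders.aggregate(pipeline, { explain: true })``, where ``orders`` and ``pipeline`` are placeholder names. Because the walk matches on key names alone, treat the result as a quick hint and read the full plan before drawing conclusions.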
