@@ -384,17 +384,74 @@ option, the ``explain`` output shows the coalesced stage:
384
384
}
385
385
386
386
Indexes
387
- -------
387
+ ~~~~~~~
388
+
389
+ An aggregation pipeline can use :ref:`indexes <indexes>` from the input
390
+ collection to improve performance. Using an index limits the number of
391
+ documents a stage processes. Ideally, an index can :ref:`cover
392
+ <read-operations-covered-query>` the stage query. A covered query has
393
+ especially high performance, since the index alone returns all of the
394
+ matching data without fetching documents.
395
+
396
+ For example, a pipeline that consists of :pipeline:`$match`,
397
+ :pipeline:`$sort`, and :pipeline:`$group` can benefit from indexes at
398
+ every stage:
399
+
400
+ - An index on the :pipeline:`$match` query field can efficiently
401
+ identify the relevant data
388
402
389
- Starting in MongoDB 4.2, in some cases, an aggregation pipeline can use
390
- a ``DISTINCT_SCAN`` index plan that returns one document per index key
391
- value.
403
+ - An index on the sorting field can return data in sorted order for the
404
+ :pipeline:`$sort` stage
405
+
406
+ - An index on the grouping field that matches the :pipeline:`$sort`
407
+ order can return all of the field values needed to execute the
408
+ :pipeline:`$group` stage (a covered query)
409
+
410
+ To determine whether a pipeline uses indexes, review the query plan and
411
+ look for ``IXSCAN`` or ``DISTINCT_SCAN`` plans.
392
412
393
413
.. note::
394
- ``DISTINCT_SCAN`` executes faster than ``IXSCAN`` if multiple
395
- documents per index value exist. However, index scan parameters
396
- might affect the time comparison of ``DISTINCT_SCAN`` and
397
- ``IXSCAN``.
414
+ In some cases, the query planner uses a ``DISTINCT_SCAN`` index plan
415
+ that returns one document per index key value. ``DISTINCT_SCAN``
416
+ executes faster than ``IXSCAN`` if there are multiple documents per
417
+ key value. However, index scan parameters might affect the time
418
+ comparison of ``DISTINCT_SCAN`` and ``IXSCAN``.
419
+
420
+ For early stages in your aggregation pipeline, consider indexing the
421
+ query fields. Stages that can benefit from indexes are:
422
+
423
+ ``$match`` stage
424
+ :pipeline:`$match` can use an index to filter documents if it is the
425
+ first stage in the pipeline, after any optimizations from the
426
+ :ref:`query planner <query-plans-query-optimization>`.
427
+
428
+ ``$sort`` stage
429
+ :pipeline:`$sort` can benefit from an index as long as it is not
430
+ preceded by a :pipeline:`$project`, :pipeline:`$unwind`, or
431
+ :pipeline:`$group` stage.
432
+
433
+ ``$group`` stage
434
+ :pipeline:`$group` can use an index to find the first document in
435
+ each group if the pipeline meets all of the following conditions:
436
+
437
+ - a :pipeline:`$sort` stage sorts the grouping field before
438
+ :pipeline:`$group`
439
+
440
+ - an index exists that matches the sort order on the grouped field
441
+
442
+ - :group:`$first` is the only accumulator in the :pipeline:`$group`
443
+ stage
444
+
445
+ See :ref:`$group Performance Optimizations <group-pipeline-optimization>`
446
+ for an example.
447
+
448
+ ``$geoNear`` stage
449
+ :pipeline:`$geoNear` always uses an index, since it must be the first
450
+ stage in a pipeline and requires a :ref:`geospatial index <index-feature-geospatial>`.
451
+
452
+ Additionally, stages later in the pipeline that retrieve data from
453
+ other, unmodified collections can use indexes on those collections
454
+ for optimization. These stages include:
398
455
399
456
Indexes can :ref:`cover <read-operations-covered-query>` queries in an
400
457
aggregation pipeline. A covered query uses an index to return all of the
@@ -438,4 +495,4 @@ MongoDB increases the :pipeline:`$limit` amount with the reordering.
438
495
.. seealso::
439
496
440
497
:method:`explain <db.collection.aggregate()>` option in the
441
- :method:`db.collection.aggregate()`
498
+ :method:`db.collection.aggregate()`
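
The added text advises reviewing the query plan for ``IXSCAN`` or ``DISTINCT_SCAN``. As a rough illustration of that check, the sketch below walks an explain document and reports whether an index-backed stage appears. The helper names (``collectPlanStages``, ``usesIndex``) and both sample plans are hypothetical and hand-written for this note; they are not real server output, and the exact explain layout varies by server version.

```javascript
// Collect every `stage` name found in an explain() result.
// The walk is deliberately generic: plans nest under keys such as
// `winningPlan`, `inputStage`, and `inputStages`, and the layout differs
// between server versions. Plans under `rejectedPlans` are skipped so that
// only the plan actually chosen is considered.
function collectPlanStages(node, found = []) {
  if (Array.isArray(node)) {
    for (const item of node) collectPlanStages(item, found);
  } else if (node !== null && typeof node === "object") {
    if (typeof node.stage === "string") found.push(node.stage);
    for (const [key, value] of Object.entries(node)) {
      if (key === "rejectedPlans") continue;
      collectPlanStages(value, found);
    }
  }
  return found;
}

// True when the chosen plan contains an index-backed stage.
function usesIndex(explainOutput) {
  const stages = collectPlanStages(explainOutput);
  return stages.includes("IXSCAN") || stages.includes("DISTINCT_SCAN");
}

// Hypothetical explain fragments, written by hand for illustration only.
const indexedPlan = {
  queryPlanner: {
    winningPlan: {
      stage: "FETCH",
      inputStage: { stage: "IXSCAN", indexName: "status_1" },
    },
    rejectedPlans: [],
  },
};
const unindexedPlan = {
  queryPlanner: {
    winningPlan: { stage: "COLLSCAN" },
    // A rejected plan that used an index must not count as index use.
    rejectedPlans: [{ stage: "IXSCAN", indexName: "status_1" }],
  },
};

console.log(usesIndex(indexedPlan));   // true
console.log(usesIndex(unindexedPlan)); // false
```

In ``mongosh``, the same helper could presumably be given the document returned by running an aggregation with the ``explain`` option, for example on a hypothetical ``orders`` collection. Treat the result as a quick hint rather than a substitute for reading the full plan.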