From 8a41773d7d909aae3e897c9e04d9832618e02d69 Mon Sep 17 00:00:00 2001
From: kay
Date: Thu, 13 Sep 2012 15:57:44 -0400
Subject: [PATCH 1/3] DOCS-489 aggregation operators and indexes

---
 source/applications/aggregation.txt   | 30 ++++++++++++++++++++++++---
 source/reference/aggregation/sort.txt | 10 +++++++++
 2 files changed, 37 insertions(+), 3 deletions(-)

diff --git a/source/applications/aggregation.txt b/source/applications/aggregation.txt
index 72ee335a088..455aadbbcf8 100644
--- a/source/applications/aggregation.txt
+++ b/source/applications/aggregation.txt
@@ -174,20 +174,44 @@ Document size ` limit, which is currently 16 megabytes
 Optimizing Performance
 ----------------------
 
-Early Filtering
-~~~~~~~~~~~~~~~
-
 Because you will always call :method:`aggregate` on a
 :term:`collection` object, which logically inserts the *entire* collection into
 the aggregation pipeline, you may want to optimize the operation
 by avoiding scanning the entire collection whenever possible.
 
+Aggregation Operators and Indexes
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Depending on the order in which they appear in the pipeline,
+aggregation operators can take advantage of indexes.
+
+The following aggregation operators:
+
+- :agg:pipeline:`$match`
+- :agg:pipeline:`$sort`
+- :agg:pipeline:`$limit`
+- :agg:pipeline:`$skip`
+
+can take advantage of an index when placed at the **beginning** of the pipleline or
+placed **before** the following aggregation operators:
+
+- :agg:pipeline:`$project`
+- :agg:pipeline:`$unwind`
+- :agg:pipeline:`$group`.
+
+Early Filtering
+~~~~~~~~~~~~~~~
+
 If your aggregation operation requires only a subset of the data in a
 collection, use the :agg:pipeline:`$match` operator to restrict which
 items go in to the top of the pipeline, as in a query. When placed
 early in a pipeline, these :agg:pipeline:`$match` operations use
 suitable indexes to scan only the matching documents in a collection.
+Placing :agg:pipeline:`$match` followed by :agg:pipeline:`$sort` at the
+very start of the pipeline is logically equivalent to a single query
+with a sort, and can use an index.
+
 .. OMMITED: this feature is pending SERVER-4506. Other optimizations
 .. are pending SERVER-4507 SERVER-4644 SERVER-4656 SERVER-4816
 ..
diff --git a/source/reference/aggregation/sort.txt b/source/reference/aggregation/sort.txt
index 703a2c84a3c..8e2c051b695 100644
--- a/source/reference/aggregation/sort.txt
+++ b/source/reference/aggregation/sort.txt
@@ -41,6 +41,16 @@ $sort
 
    .. TODO mention the importance of order preserving objects
 
+   - :agg:pipeline:`$skip`
+
+   The :pipeline:`$sort` operator can take advantage of an index when
+   placed at the **beginning** of the pipeline or placed **before**
+   the following aggregation operators:
+
+   - :agg:pipeline:`$project`
+   - :agg:pipeline:`$unwind`
+   - :agg:pipeline:`$group`.
+
    .. warning:: Unless the :pipeline:`$sort` operator can use an
       index, in the current release, the sort must fit within memory.
       This may cause problems when sorting large numbers of documents.

From 197b16be1d9c40ba7617e83a20c0d0acc6887405 Mon Sep 17 00:00:00 2001
From: kay
Date: Thu, 13 Sep 2012 16:23:20 -0400
Subject: [PATCH 2/3] DOCS-489 aggregation operators and indexes

---
 source/applications/aggregation.txt | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/source/applications/aggregation.txt b/source/applications/aggregation.txt
index 455aadbbcf8..7c3088b8bb6 100644
--- a/source/applications/aggregation.txt
+++ b/source/applications/aggregation.txt
@@ -179,21 +179,22 @@ Because you will always call :method:`aggregate` on a
 the aggregation pipeline, you may want to optimize the operation
 by avoiding scanning the entire collection whenever possible.
 
-Aggregation Operators and Indexes
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Pipeline Operators and Indexes
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 Depending on the order in which they appear in the pipeline,
 aggregation operators can take advantage of indexes.
 
-The following aggregation operators:
+The following pipeline operators take advantage of an index when they
+occur at the beginning of the pipeline:
 
 - :agg:pipeline:`$match`
 - :agg:pipeline:`$sort`
 - :agg:pipeline:`$limit`
 - :agg:pipeline:`$skip`
 
-can take advantage of an index when placed at the **beginning** of the pipleline or
-placed **before** the following aggregation operators:
+The above operators can also use an index when placed **before** the
+following aggregation operators:
 
 - :agg:pipeline:`$project`
 - :agg:pipeline:`$unwind`

From 89a51a63d9e34bee950ef5a2c23114de6148da03 Mon Sep 17 00:00:00 2001
From: kay
Date: Thu, 13 Sep 2012 17:34:17 -0400
Subject: [PATCH 3/3] DOCS-489 aggregation operators and indexes

---
 source/applications/aggregation.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/source/applications/aggregation.txt b/source/applications/aggregation.txt
index 7c3088b8bb6..5defde0ac91 100644
--- a/source/applications/aggregation.txt
+++ b/source/applications/aggregation.txt
@@ -190,8 +190,8 @@ occur at the beginning of the pipeline:
 
 - :agg:pipeline:`$match`
 - :agg:pipeline:`$sort`
-- :agg:pipeline:`$limit`
-- :agg:pipeline:`$skip`
+- :agg:pipeline:`$limit`
+- :agg:pipeline:`$skip`.
 
 The above operators can also use an index when placed **before** the
 following aggregation operators: