diff --git a/snooty.toml b/snooty.toml index 5eeb0aa56..16c05bfd4 100644 --- a/snooty.toml +++ b/snooty.toml @@ -13,6 +13,7 @@ toc_landing_pages = [ "/security/authentication", "/data-formats", "/connect/connection-options", + "/aggregation", "/crud", "/crud/update", "/crud/transactions", diff --git a/source/aggregation.txt b/source/aggregation.txt index b4b930edd..5e645a406 100644 --- a/source/aggregation.txt +++ b/source/aggregation.txt @@ -1,9 +1,9 @@ .. _node-aggregation: .. _nodejs-aggregation: -=========== -Aggregation -=========== +====================== +Aggregation Operations +====================== .. meta:: :description: Learn to use aggregation operations in the MongoDB Node.js Driver to create pipelines for data transformation and summarization. @@ -15,18 +15,27 @@ Aggregation :depth: 2 :class: singlecol +.. toctree:: + :titlesonly: + :maxdepth: 1 + + Pipeline Stages + .. _nodejs-aggregation-overview: Overview -------- -In this guide, you can learn how to use **aggregation operations** in -the MongoDB Node.js driver. +In this guide, you can learn how to use the {+driver-long+} to perform +**aggregation operations**. + +Aggregation operations process data in your MongoDB collections and return +computed results. The MongoDB Aggregation framework is modeled on the concept of +data processing pipelines. Documents enter a pipeline comprised of one or more +stages, and this pipeline transforms the documents into an aggregated result. -Aggregation operations are expressions you can use to produce reduced -and summarized results in MongoDB. MongoDB's aggregation framework -allows you to create a pipeline that consists of one or more stages, -each of which performs a specific operation on your data. +To learn more about the aggregation stages supported by the {+driver-short+}, +see :ref:`node-aggregation-pipeline-stages`. .. _node-aggregation-tutorials: @@ -39,114 +48,67 @@ each of which performs a specific operation on your data. 
Analogy ~~~~~~~ -You can think of the aggregation pipeline as similar to an automobile factory. -Automobile manufacturing requires the use of assembly stations organized -into assembly lines. Each station has specialized tools, such as -drills and welders. The factory transforms and -assembles the initial parts and materials into finished products. - -The **aggregation pipeline** is the assembly line, **aggregation -stages** are the assembly stations, and **expression operators** are the -specialized tools. - -Comparing Aggregation and Query Operations -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Using query operations, such as the ``find()`` method, you can perform the following actions: - -- Select *which documents* to return -- Select *which fields* to return -- Sort the results - -Using aggregation operations, you can perform the following actions: - -- Perform all query operations -- Rename fields -- Calculate fields -- Summarize data -- Group values - -Aggregation operations have some :manual:`limitations `: - -- Returned documents must not violate the :manual:`BSON-document size limit ` - of 16 megabytes. - -- Pipeline stages have a memory limit of 100 megabytes by default. You can exceed this - limit by setting the ``allowDiskUse`` property of ``AggregateOptions`` to ``true``. See - the `AggregateOptions API documentation <{+api+}/interfaces/AggregateOptions.html>`__ - for more details. - -.. important:: $graphLookup exception - - The :manual:`$graphLookup - ` stage has a strict - memory limit of 100 megabytes and will ignore ``allowDiskUse``. - -References -~~~~~~~~~~ - -To view a full list of expression operators, see :manual:`Aggregation -Operators ` in the Server manual. - -To learn about assembling an aggregation pipeline and view examples, see -:manual:`Aggregation Pipeline ` in the -Server manual. - -To learn more about creating pipeline stages, see :manual:`Aggregation -Stages ` in the Server manual. 
- -Runnable Examples ------------------ - -The example uses sample data about restaurants. The following code -inserts data into the ``restaurants`` collection of the ``aggregation`` -database: - -.. literalinclude:: /code-snippets/aggregation/agg.js - :start-after: begin data insertion - :end-before: end data insertion - :language: javascript - :dedent: - -.. tip:: - - For more information on connecting to your MongoDB deployment, see the :doc:`Connection Guide `. - -Aggregation Example ~~~~~~~~~~~~~~~~~~~ - -To perform an aggregation, pass a list of aggregation stages to the -``collection.aggregate()`` method. - -In the example, the aggregation pipeline uses the following aggregation stages: - -- A :manual:`$match ` stage to filter for documents whose - ``categories`` array field contains the element ``Bakery``. - -- A :manual:`$group ` stage to group the matching documents by the ``stars`` - field, accumulating a count of documents for each distinct value of ``stars``. - -.. literalinclude:: /code-snippets/aggregation/agg.js - :start-after: begin aggregation - :end-before: end aggregation - :language: javascript - :dedent: - -This example produces the following output: - -.. code-block:: json - :copyable: false - - { _id: 4, count: 2 } - { _id: 3, count: 1 } - { _id: 5, count: 1 } - -For more information, see the `aggregate() API documentation <{+api+}/classes/Collection.html#aggregate>`__. - -Additional Examples ~~~~~~~~~~~~~~~~~~~ - -You can find another aggregation pipeline example in the `Aggregation -Framework with Node.js Tutorial -`_ -blog post on the MongoDB website. +The aggregation pipeline is similar to an automobile factory assembly line. An +assembly line has stations with specialized tools that are used to perform +specific tasks. For example, when building a car, the assembly line begins with +a frame. As the car frame moves through the assembly line, each station assembles +a separate part.
The result is a transformed final product, the finished car. + +The *aggregation pipeline* is the assembly line, the *aggregation stages* are +the assembly stations, the *expression operators* are the specialized tools, and +the *aggregated result* is the finished product. + +Compare Aggregation and Find Operations +--------------------------------------- + +The following table lists the different tasks you can perform with find +operations compared to what you can achieve with aggregation operations. The +aggregation framework provides expanded functionality that allows you to +transform and manipulate your data. + +.. list-table:: + :header-rows: 1 + :widths: 50 50 + + * - Find Operations + - Aggregation Operations + + * - | Select *certain* documents to return + | Select *which* fields to return + | Sort the results + | Limit the results + | Count the results + - | Select *certain* documents to return + | Select *which* fields to return + | Sort the results + | Limit the results + | Count the results + | Group the results + | Rename fields + | Compute new fields + | Summarize data + | Connect and merge data sets + +Server Limitations +------------------ + +Consider the following :manual:`limitations +` when performing aggregation operations: + +- Returned documents must not violate the :manual:`BSON document size limit + ` of 16 megabytes. +- Pipeline stages have a memory limit of 100 megabytes by default. If required, + you can exceed this limit by setting the `allowDiskUse + `__ + property of the ``AggregateOptions`` object that you pass to the + ``aggregate()`` method to ``true``. + +Additional Information +---------------------- + +To view a full list of expression operators, see :manual:`Aggregation Operators +` in the {+mdb-server+} manual. + +To learn about explaining MongoDB aggregation operations, see :manual:`Explain +Results ` and :manual:`Query Plans +` in the {+mdb-server+} manual.
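As an illustration of the ``allowDiskUse`` option described in the Server Limitations section, the following sketch wraps an aggregation call in a helper function. The ``countByYear`` name, the collection shape, and the ``title`` and ``year`` fields are hypothetical assumptions; only the ``aggregate(pipeline, options)`` call and the ``allowDiskUse`` option come from the driver API.

```javascript
// Hypothetical helper: runs a pipeline with allowDiskUse enabled so that
// memory-hungry stages may spill to temporary files on the server instead
// of failing at the 100 megabyte in-memory limit.
async function countByYear(collection) {
  const pipeline = [
    // Sorting a large collection is a common way to exceed the memory limit
    { $sort: { title: 1 } },
    { $group: { _id: "$year", count: { $sum: 1 } } },
  ];
  // The second argument to aggregate() is the AggregateOptions object
  return collection.aggregate(pipeline, { allowDiskUse: true }).toArray();
}
```

In application code, you would pass a collection obtained from a connected ``MongoClient``, for example ``await countByYear(client.db("mydb").collection("movies"))``.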
diff --git a/source/aggregation/pipeline-stages.txt b/source/aggregation/pipeline-stages.txt new file mode 100644 index 000000000..eb60fde27 --- /dev/null +++ b/source/aggregation/pipeline-stages.txt @@ -0,0 +1,319 @@ +.. _node-aggregation-pipeline-stages: + +=========================== +Aggregation Pipeline Stages +=========================== + +.. contents:: On this page + :local: + :backlinks: none + :depth: 2 + :class: singlecol + +.. facet:: + :name: genre + :values: reference + +.. meta:: + :keywords: node.js, code example, transform, pipeline + :description: Learn the different possible stages of the aggregation pipeline in the Node.js Driver. + +Overview +------------ + +In this guide, you can learn how to create an aggregation pipeline and pipeline +stages by using methods in the {+driver-long+}. + +Build an Aggregation Pipeline +----------------------------- + +You can use the {+driver-short+} to build an aggregation pipeline by creating a +pipeline variable or passing aggregation stages directly into the aggregation +method. See the following examples to learn more about each of these approaches. + +.. tabs:: + + .. tab:: Create a Pipeline + :tabid: pipeline-definition + + .. code-block:: javascript + + // Defines the aggregation pipeline + const pipeline = [ + { $match: { ... } }, + { $group: { ... } } + ]; + + // Executes the aggregation pipeline + const results = await collection.aggregate(pipeline); + + .. tab:: Direct Aggregation + :tabid: pipeline-direct + + .. code-block:: javascript + + // Defines and executes the aggregation pipeline + const results = await collection.aggregate([ + { $match: { ... } }, + { $group: { ... } } + ]); + +Aggregation Stage Methods +------------------------- + +The following table lists the stages in the aggregation pipeline. To learn more +about an aggregation stage and see a code example in a {+environment+} application, +follow the link from the stage name to its reference page in the {+mdb-server+} +manual. + +.. 
list-table:: + :header-rows: 1 + :widths: 30 70 + + * - Stage + - Description + + * - :manual:`$addFields ` + - Adds new fields to documents. Outputs documents that contain both the + existing fields from the input documents and the newly added fields. + + ``$set`` is an alias for ``$addFields``. + + * - :manual:`$bucket ` + - Categorizes incoming documents into groups, called buckets, + based on a specified expression and bucket boundaries. + + * - :manual:`$bucketAuto ` + - Categorizes incoming documents into a specific number of + groups, called buckets, based on a specified expression. + Bucket boundaries are automatically determined in an attempt + to evenly distribute the documents into the specified number + of buckets. + + * - :manual:`$changeStream ` + - Returns a change stream cursor for the collection. Must be the first stage + in the pipeline. + + ``$changeStream`` returns an ``AggregationCursor`` when passed to the + ``aggregate()`` method and a ``ChangeStreamCursor`` when passed to the + ``watch()`` method. + + * - :manual:`$changeStreamSplitLargeEvent + ` + - Splits large change stream events that exceed 16 MB into smaller fragments returned + in a change stream cursor. Must be the last stage in the pipeline. + + ``$changeStreamSplitLargeEvent`` returns an ``AggregationCursor`` when + passed to the ``aggregate()`` method and a ``ChangeStreamCursor`` when + passed to the ``watch()`` method. + + * - :manual:`$collStats ` + - Returns statistics regarding a collection or view. + + * - :manual:`$count ` + - Returns a count of the number of documents at this stage of + the aggregation pipeline. + + * - :manual:`$currentOp ` + - Returns a stream of documents containing information on active and + dormant operations and any inactive sessions that are holding locks as + part of a transaction. + + * - :manual:`$densify ` + - Creates new documents in a sequence of documents where certain values in a + field are missing. 
+ + * - :manual:`$documents ` + - Returns literal documents from input expressions. + + * - :manual:`$facet ` + - Processes multiple aggregation pipelines + within a single stage on the same set + of input documents. Enables the creation of multi-faceted + aggregations capable of characterizing data across multiple + dimensions, or facets, in a single stage. + + * - :manual:`$geoNear ` + - Returns documents in order of nearest to farthest from a + specified point. This method adds a field to output documents + that contains the distance from the specified point. + + * - :manual:`$graphLookup ` + - Performs a recursive search on a collection. This method adds + a new array field to each output document that contains the traversal + results of the recursive search for that document. + + * - :manual:`$group ` + - Groups input documents by a specified identifier expression and applies + the accumulator expressions, if specified, to each group. Consumes all + input documents and outputs one document per each distinct group. The + output documents contain only the identifier field and, if specified, + accumulated fields. + + * - :manual:`$indexStats ` + - Returns statistics regarding the use of each index for the collection. + + * - :manual:`$limit ` + - Passes the first *n* documents unmodified to the pipeline, where *n* is + the specified limit. For each input document, outputs either one document + (for the first *n* documents) or zero documents (after the first *n* + documents). + + * - :manual:`$listSampledQueries ` + - Lists sampled queries for all collections or a specific collection. Only + available for collections with :manual:`Queryable Encryption + ` enabled. + + * - :manual:`$listSearchIndexes ` + - Returns information about existing :ref:`Atlas Search indexes + ` on a specified collection. 
+ + * - :manual:`$lookup ` + - Performs a left outer join to another collection in the + *same* database to filter in documents from the "joined" + collection for processing. + + * - :manual:`$match ` + - Filters the document stream to allow only matching documents + to pass unmodified into the next pipeline stage. + For each input document, outputs either one document (a match) or zero + documents (no match). + + * - :manual:`$merge ` + - Writes the resulting documents of the aggregation pipeline to + a collection. The stage can incorporate (insert new + documents, merge documents, replace documents, keep existing + documents, fail the operation, process documents with a + custom update pipeline) the results into an output + collection. To use this stage, it must be + the last stage in the pipeline. + + * - :manual:`$out ` + - Writes the resulting documents of the aggregation pipeline to + a collection. To use this stage, it must be + the last stage in the pipeline. + + * - :manual:`$project ` + - Reshapes each document in the stream, such as by adding new + fields or removing existing fields. For each input document, + outputs one document. + + * - :manual:`$redact ` + - Reshapes each document in the stream by restricting the content for each + document based on information stored in the documents themselves. + Incorporates the functionality of ``$project`` and ``$match``. Can be used + to implement field level redaction. For each input document, outputs + either one or zero documents. + + * - :manual:`$replaceRoot ` + - Replaces a document with the specified embedded document. The + operation replaces all existing fields in the input document, + including the ``_id`` field. Specify a document embedded in + the input document to promote the embedded document to the + top level. + + The ``$replaceWith`` stage is an alias for the ``$replaceRoot`` stage. + + * - :manual:`$replaceWith ` + - Replaces a document with the specified embedded document. 
+ The operation replaces all existing fields in the input document, including + the ``_id`` field. Specify a document embedded in the input document to promote + the embedded document to the top level. + + The ``$replaceWith`` stage is an alias for the ``$replaceRoot`` stage. + + * - :manual:`$sample ` + - Randomly selects the specified number of documents from its + input. + + * - :manual:`$search ` + - Performs a full-text search of the field or fields in an + :atlas:`Atlas ` + collection. + + This stage is available only for MongoDB Atlas clusters, and is not + available for self-managed deployments. To learn more, see + :atlas:`Atlas Search Aggregation Pipeline Stages + ` in the Atlas documentation. + + * - :manual:`$searchMeta ` + - Returns different types of metadata result documents for the + :atlas:`Atlas Search ` query against an + :atlas:`Atlas ` + collection. + + This stage is available only for MongoDB Atlas clusters, + and is not available for self-managed deployments. To learn + more, see :atlas:`Atlas Search Aggregation Pipeline Stages + ` in the Atlas documentation. + + * - :manual:`$set ` + - Adds new fields to documents. Like the ``$project`` stage, + this stage reshapes each + document in the stream by adding new fields to + output documents that contain both the existing fields + from the input documents and the newly added fields. + + * - :manual:`$setWindowFields ` + - Groups documents into windows and applies one or more + operators to the documents in each window. + + * - :manual:`$skip ` + - Skips the first *n* documents, where *n* is the specified skip + number, and passes the remaining documents unmodified to the + pipeline. For each input document, outputs either zero + documents (for the first *n* documents) or one document (if + after the first *n* documents). + + * - :manual:`$sort ` + - Reorders the document stream by a specified sort key. The documents remain unmodified. + For each input document, outputs one document.
+ + * - :manual:`$sortByCount ` + - Groups incoming documents based on the value of a specified + expression, then computes the count of documents in each + distinct group. + + * - :manual:`$unionWith ` + - Combines pipeline results from two collections into a single + result set. + + * - :manual:`$unset ` + - Removes/excludes fields from documents. + + ``$unset`` is an alias for ``$project`` that removes fields. + + * - :manual:`$unwind ` + - Deconstructs an array field from the input documents to + output a document for *each* element. Each output document + replaces the array with an element value. For each input + document, outputs *n* documents, where *n* is the number of + array elements. *n* can be zero for an empty array. + + * - :manual:`$vectorSearch ` + - Performs an :abbr:`ANN (Approximate Nearest Neighbor)` or + :abbr:`ENN (Exact Nearest Neighbor)` search on a + vector in the specified field of an + :atlas:`Atlas ` collection. + + This stage is available only for MongoDB Atlas clusters, and is not + available for self-managed deployments. To learn more, see + :ref:`Atlas Vector Search `. + +Additional Information +---------------------- + +To learn more about assembling an aggregation pipeline, see :manual:`Aggregation +Pipeline ` in the {+mdb-server+} manual. + +To learn more about creating pipeline stages, see :manual:`Aggregation Stages +` in the {+mdb-server+} manual. + +API Documentation +~~~~~~~~~~~~~~~~~ + +For more information about the methods and classes used on this page, see the +following API documentation: + +- `Collection <{+api+}/classes/Collection.html>`__ +- `aggregate() <{+api+}/classes/Collection.html#aggregate>`__ +- `watch() <{+api+}/classes/Collection.html#watch>`__ +- `AggregateOptions <{+api+}/interfaces/AggregateOptions.html>`__ +
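To make the behavior of a few common stages from the table concrete, here is a plain JavaScript sketch that mimics what ``$match``, ``$group``, and ``$sort`` compute over an in-memory array. The sample documents are hypothetical, and the real stages execute on the server; this is only a model of each stage's output, not how the driver or server implements them.

```javascript
// Hypothetical sample documents, standing in for a collection
const docs = [
  { name: "A", categories: ["Bakery"], stars: 4 },
  { name: "B", categories: ["Cafe"], stars: 3 },
  { name: "C", categories: ["Bakery"], stars: 4 },
  { name: "D", categories: ["Bakery"], stars: 5 },
];

// $match: keep only documents whose categories array contains "Bakery"
const matched = docs.filter((d) => d.categories.includes("Bakery"));

// $group: one output document per distinct stars value,
// with a { $sum: 1 } accumulator producing a count per group
const groups = new Map();
for (const d of matched) {
  groups.set(d.stars, (groups.get(d.stars) ?? 0) + 1);
}
const grouped = [...groups].map(([_id, count]) => ({ _id, count }));

// $sort: reorder the grouped results by the _id sort key
grouped.sort((a, b) => a._id - b._id);

console.log(grouped); // [ { _id: 4, count: 2 }, { _id: 5, count: 1 } ]
```

On the server, the equivalent pipeline would be ``[{ $match: ... }, { $group: ... }, { $sort: ... }]`` passed to the ``aggregate()`` method.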