DOCSP-13968 update $rand with new examples #4908

Merged
merged 1 commit into from
Feb 8, 2021
161 changes: 79 additions & 82 deletions source/reference/operator/aggregation/rand.txt
@@ -36,134 +36,131 @@ dropped so the actual number of digits may vary.
Examples
--------

This code initializes a ``randomSamples`` collection with 100 documents
that is used in the following examples.
Generate Random Data Points
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: javascript

N = 100
bulk = db.randomSamples.initializeUnorderedBulkOp()
for ( i = 0; i < N; i++) { bulk.insert( {_id: i, random: 0 } ) }
bulk.execute()


Usage with Update Queries
~~~~~~~~~~~~~~~~~~~~~~~~~

The ``$rand`` operator can be used with update query operations. In
this example :method:`~db.collection.updateMany()` uses the ``$rand``
operator to insert a different random number into each document
in the ``randomSamples`` collection.
This example models charitable donations. The collection starts with a
list of donors.

.. code-block:: javascript

db.randomSamples.updateMany(
{},
db.donors.insertMany(
[
{ $set: { "random": { $rand: {} } } }
{ donorId: 1000, amount: 0, frequency: 1 },
{ donorId: 1001, amount: 0, frequency: 2 },
{ donorId: 1002, amount: 0, frequency: 1 },
{ donorId: 1003, amount: 0, frequency: 2 },
{ donorId: 1004, amount: 0, frequency: 1 }
]
)

We can use :pipeline:`$project` to see the output. The
:pipeline:`$limit` stage halts the pipeline after the third document.
We use an aggregation pipeline to update each document with a random
donation amount.

.. code-block:: javascript

db.randomSamples.aggregate(
db.donors.aggregate(
[
{ $project: {_id: 0, random: 1 } },
{ $limit: 3 }
]
{ $set: { amount: { $multiply: [ { $rand: {} }, 100 ] } } },
{ $set: { amount: { $floor: "$amount" } } },
{ $merge: "donors" }
]
)

The output shows the random values.
The first :pipeline:`$set` stage updates the ``amount`` field. ``$rand``
generates an initial value between 0 and 1, then
:expression:`$multiply` scales it by a factor of 100.

.. code-block:: javascript
:copyable: false

{ "random" : 0.8751284485870464 }
{ "random" : 0.515147067802108 }
{ "random" : 0.3750004525681561 }
The :expression:`$floor` operator in the second ``$set`` stage removes
the decimal portion from the ``amount`` to leave an integer value.

Rounding to Control the Number of Output Digits
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Finally, :pipeline:`$merge` writes the random value created in the
previous steps to the ``amount`` field, updating it for each document
in the ``donors`` collection.
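
The arithmetic performed by the two ``$set`` stages can be sketched
outside the database. The following plain JavaScript is only an
illustration. It assumes ``Math.random()`` as a stand-in for ``$rand``
and assumes the generated value is always less than 1, as the 0 to 99
range in this example implies. ``randomAmount`` is a hypothetical helper
name, not part of MongoDB.

```javascript
// Hypothetical helper mirroring
// { $floor: { $multiply: [ { $rand: {} }, 100 ] } }.
// `rand` is injectable so the behavior can be checked with fixed inputs.
// Math.random() standing in for $rand is an assumption of this sketch.
function randomAmount(rand = Math.random) {
  const scaled = rand() * 100;   // $multiply: scale the value by 100
  return Math.floor(scaled);     // $floor: drop the decimal portion
}

console.log(randomAmount(() => 0.2734));  // 27
console.log(randomAmount(() => 0.999));   // 99
console.log(randomAmount());              // some integer from 0 to 99
```

Passing the random source as a parameter is a design choice for this
sketch only: it makes the scaling and flooring steps easy to verify with
known inputs.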

If you want a shorter random value, consider using :expression:`$round`.
Note that the :pipeline:`$set` stage updates the document, if ``$rand``
is called in a :pipeline:`$project` stage the underlying document is
not modified.
You can view the results with a projection stage:

.. code-block:: javascript

db.randomSamples.aggregate(
db.donors.aggregate(
[
{ $match: {} },
{ $set: { rounded: { $round: [ "$random", 4 ] } } },
{ $out: "randomSamples" }
{ $project: {_id: 0, donorId: 1, amount: 1 } }
]
)

The :pipeline:`$project` stage displays the original and rounded value
for each document.

.. code-block:: javascript

db.randomSamples.aggregate(
[
{ $project: {_id:0, random:1, rounded: 1 } },
{ $limit: 3 }
]
)

The update documents look like this:
The projection shows the scaled amounts are now random values in the
range from 0 to 99.

.. code-block:: javascript
:copyable: false

{ "random" : 0.8751284485870464, "rounded" : 0.8751 }
{ "random" : 0.515147067802108, "rounded" : 0.5151 }
{ "random" : 0.3750004525681561, "rounded" : 0.375 }
{ "donorId" : 1000, "amount" : 27 }
{ "donorId" : 1001, "amount" : 10 }
{ "donorId" : 1002, "amount" : 88 }
{ "donorId" : 1003, "amount" : 73 }
{ "donorId" : 1004, "amount" : 5 }

.. note::
Select Random Items From a Collection
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can use ``$rand`` in an aggregation pipeline to select random
documents from a collection. Consider a collection of voter records:

Like ``$rand``, the value returned by the ``$round`` operator does
not include any trailing 0s so the number of digits returned may
vary.
.. code-block:: javascript

Selecting Random Items From a Collection
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
db.voters.insertMany(
[
{ name: "Archibald", voterId: 4321, district: 3, registered: true },
{ name: "Beckham", voterId: 4331, district: 3, registered: true },
{ name: "Carolin", voterId: 5321, district: 4, registered: true },
{ name: "Debarge", voterId: 4343, district: 3, registered: false },
{ name: "Eckhard", voterId: 4161, district: 3, registered: false },
{ name: "Faberge", voterId: 4300, district: 1, registered: true },
{ name: "Grimwald", voterId: 4111, district: 3, registered: true },
{ name: "Humphrey", voterId: 2021, district: 3, registered: true },
{ name: "Idelfon", voterId: 1021, district: 4, registered: true },
{ name: "Justo", voterId: 9891, district: 3, registered: false }
]
)

The ``$rand`` operator can be used in an aggregation pipeline to select
random documents from a collection. In this example we use ``$rand`` to
select about half the documents in the ``randomSamples`` collection.
Imagine you want to select about half of the voters in District 3 to do
some polling.

.. code-block:: javascript

db.randomSamples.aggregate(
db.voters.aggregate(
[
{ $match: { district: 3 } },
{ $match: { $expr: { $lt: [0.5, {$rand: {} } ] } } },
{ $count: "numMatches" }
{ $project: { _id: 0, name: 1, registered: 1 } }
]
)

There are 100 documents in ``randomSamples``. Running the sample code 5
times produces the following output which approaches the expected value
of 50 matches in a collection this size.

The first pipeline stage matches all documents where the voter is from
district 3.

The second :pipeline:`$match` stage uses ``$rand`` in a match
expression to further refine the selection. For each document,
``$rand`` generates a value between 0 and 1. The threshold of ``0.5``
in the less than :expression:`($lt)<$lt>` comparison means that
:query:`$expr` will be true for about half the documents.

In the :pipeline:`$project` stage, the selected documents are reduced
to the ``name`` and ``registered`` fields. There are 7 voters in
District 3; running the code selects about half of them.

.. code-block:: javascript
:copyable: false

{ "numMatches" : 49 }
{ "numMatches" : 52 }
{ "numMatches" : 54 }
{ "numMatches" : 48 }
{ "numMatches" : 59 }
{ "name" : "Archibald", "registered" : true }
{ "name" : "Debarge", "registered" : false }
{ "name" : "Humphrey", "registered" : true }

.. note::

This example shows that the number of documents selected is
different each time. If you need to select an exact number of
documents, consider using :pipeline:`$sample` instead of ``$rand``.
The number of documents selected is different each time. If you need
to select an exact number of documents, consider using
:pipeline:`$sample` instead of ``$rand``.
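
The difference between the two approaches can be sketched in plain
JavaScript. This is an illustration under stated assumptions, not
MongoDB code: ``Math.random()`` stands in for ``$rand``, the function
names are hypothetical, and the partial shuffle only imitates the effect
of a fixed-size sample. It does not describe how ``$sample`` is
implemented.

```javascript
// Names of the District 3 voters from the example collection.
const district3 = ["Archibald", "Beckham", "Debarge", "Eckhard",
                   "Grimwald", "Humphrey", "Justo"];

// Per-document test, like { $expr: { $lt: [ 0.5, { $rand: {} } ] } }:
// each voter is kept independently, so the result size varies per run.
function selectAboutHalf(items, rand = Math.random) {
  return items.filter(() => 0.5 < rand());
}

// Fixed-size selection, similar in effect to { $sample: { size: n } }.
// A partial Fisher-Yates shuffle is used purely for illustration.
function selectExactly(items, n, rand = Math.random) {
  const pool = [...items];
  const count = Math.min(n, pool.length);
  for (let i = 0; i < count; i++) {
    const j = i + Math.floor(rand() * (pool.length - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, count);
}

console.log(selectAboutHalf(district3).length);   // anywhere from 0 to 7
console.log(selectExactly(district3, 3).length);  // always 3
```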

.. seealso::

115 changes: 97 additions & 18 deletions source/reference/operator/query/rand.txt
@@ -26,36 +26,115 @@ Definition
Examples
--------

This code creates a small collection of 100 documents. We will
use ``$rand`` to select random documents from the collection.
Generate Random Data Points
~~~~~~~~~~~~~~~~~~~~~~~~~~~

This example models charitable donations. The collection starts with a
list of donors.

.. code-block:: javascript

db.donors.insertMany(
[
{ donorId: 1000, amount: 0, frequency: 1 },
{ donorId: 1001, amount: 0, frequency: 2 },
{ donorId: 1002, amount: 0, frequency: 1 },
{ donorId: 1003, amount: 0, frequency: 2 },
{ donorId: 1004, amount: 0, frequency: 1 }
]
)

Then we construct an operation to update each document with a random
donation amount:

.. code-block:: javascript

N = 100
bulk = db.samples.initializeUnorderedBulkOp()
for (i = 0; i < N; i++) { bulk.insert({_id: i, r: 0}) }
bulk.execute()
db.donors.updateMany(
{},
[
{ $set:
{ amount:
{ $floor:
{ $multiply: [ { $rand: {} }, 100 ] }
}
}
}
]
)

In this example we use ``$rand`` to select about half the documents.
The empty update filter matches every document in the collection.

For each document we generate a value between 0 and 1 using ``$rand``
then scale the value with :expression:`$multiply`.

The :expression:`$floor` operator removes the decimal portion so the
updated ``amount`` is an integer value.

After updating the collection, the documents look like this:

.. code-block:: javascript
:copyable: false

{ "donorId" : 1000, "amount" : 2, "frequency" : 1 }
{ "donorId" : 1001, "amount" : 58, "frequency" : 2 }
{ "donorId" : 1002, "amount" : 27, "frequency" : 1 }
{ "donorId" : 1003, "amount" : 26, "frequency" : 2 }
{ "donorId" : 1004, "amount" : 42, "frequency" : 1 }

Select Random Items From a Collection
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``$rand`` operator can be used to select random documents from a
collection. Given a collection of voter records:

.. code-block:: javascript

db.voters.insertMany(
[
{ name: "Archibald", voterId: 4321, district: 3, registered: true },
{ name: "Beckham", voterId: 4331, district: 3, registered: true },
{ name: "Carolin", voterId: 5321, district: 4, registered: true },
{ name: "Debarge", voterId: 4343, district: 3, registered: false },
{ name: "Eckhard", voterId: 4161, district: 3, registered: false },
{ name: "Faberge", voterId: 4300, district: 1, registered: true },
{ name: "Grimwald", voterId: 4111, district: 3, registered: true },
{ name: "Humphrey", voterId: 2021, district: 3, registered: true },
{ name: "Idelfon", voterId: 1021, district: 4, registered: true },
{ name: "Justo", voterId: 9891, district: 3, registered: false }
]
)

Imagine you want to select about half of the voters in District 3 to do
some polling.

.. code-block:: javascript

db.voters.find(
{ district: 3,
$expr: { $lt: [0.5, {$rand: {} } ] }
},
{ _id: 0, name: 1, registered: 1 }
)

The initial match on the ``district`` field selects documents where the
voter is from district 3.

db.samples.find(
{ $expr: { $lt: [0.5, {$rand: {} } ] } }
).count()
The :query:`$expr` operator uses ``$rand`` to further refine the
:dbcommand:`find` operation. For each document, ``$rand`` generates a
value between 0 and 1. The threshold of ``0.5`` means the less than
:expression:`($lt)<$lt>` comparison will be true for about half the
documents in the set.

Running this :dbcommand:`find` operation five times returns five random
values that approach the number 50, which is the expected value for a
collection of this size. For example:
There are 7 voters in District 3; running the code selects about half
of them.

.. code-block:: javascript
:copyable: false

51
53
49
45
47
{ "name" : "Beckham", "registered" : true }
{ "name" : "Eckhard", "registered" : false }
{ "name" : "Grimwald", "registered" : true }
{ "name" : "Humphrey", "registered" : true }
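
Why "about half": each document passes the ``$lt`` comparison
independently, so the number selected fluctuates from run to run and
only averages out to half of the candidates. A rough simulation in plain
JavaScript shows this. It assumes ``Math.random()`` as a stand-in for
``$rand``, and the function names are hypothetical.

```javascript
// Count how many of `n` documents pass the test 0.5 < rand().
// Math.random() stands in for $rand; this is a sketch, not MongoDB code.
function countSelected(n, rand = Math.random) {
  let selected = 0;
  for (let i = 0; i < n; i++) {
    if (0.5 < rand()) selected++;
  }
  return selected;
}

// Average the selected fraction over many simulated runs of the query.
function averageFraction(n, runs) {
  let total = 0;
  for (let r = 0; r < runs; r++) total += countSelected(n);
  return total / (n * runs);
}

console.log(countSelected(7));           // varies from run to run, 0 to 7
console.log(averageFraction(7, 10000));  // close to 0.5
```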

.. seealso::
