Commit 8460f2c

DOCSP-20767 $linearFill operator (#636)
* wip
* initial setup for linearFill
* WIP
* add replacement
* WIP
* add example
* add missing word
* word tweak
* bullet format
* fix formatting
* add copyable false
* remove reference to linear
* updates per review
* word tweak
* move placement of example link
* clarify field behavior and add new example
* fix build warnings
* examples fix
* tweak
* clarification for fill
* clarification for fill
* tweaks
* updates per review
* address review feedback
* adjust
* tweak definition
* tweak definition
* word tweak
1 parent 557a1e1 commit 8460f2c

7 files changed

+360
-2
lines changed

source/includes/extracts-agg-operators.yaml

Lines changed: 8 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -1380,6 +1380,14 @@ content: |
13801380
13811381
Available in :pipeline:`$setWindowFields` stage.
13821382
1383+
* - :group:`$linearFill`
1384+
1385+
- .. include:: /includes/fact-linear-fill-description.rst
1386+
1387+
Available in :pipeline:`$setWindowFields` stage.
1388+
1389+
.. versionadded:: 5.3
1390+
13831391
* - :group:`$locf`
13841392
13851393
- .. include:: /includes/fact-locf-description.rst
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
1+
Fills ``null`` and missing fields in a :ref:`window
2+
<setWindowFields-window>` using :wikipedia:`linear interpolation
3+
<Linear_interpolation>` based on surrounding field values.

source/includes/setWindowFields-operators.rst

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ These operators can be used with the :pipeline:`$setWindowFields` stage:
1212

1313
.. _setWindowFields-gap-filling-operators:
1414

15-
- Gap filling operators: :group:`$locf`.
15+
- Gap filling operators: :group:`$linearFill` and :group:`$locf`.
1616

1717
.. _setWindowFields-order-operators:
1818

source/reference/operator/aggregation.txt

Lines changed: 7 additions & 1 deletion
@@ -689,7 +689,12 @@ Alphabetical Listing of Expression Operators
689689
returns the result of the subexpression. Accepts named parameters.
690690

691691
Accepts any number of argument expressions.
692-
692+
693+
* - :group:`$linearFill`
694+
695+
- .. include:: /includes/fact-linear-fill-description.rst
696+
697+
.. versionadded:: 5.3
693698

694699
* - :expression:`$literal`
695700

@@ -1353,6 +1358,7 @@ Alphabetical Listing of Expression Operators
13531358
/reference/operator/aggregation/last-array-element
13541359
/reference/operator/aggregation/lastN-array-element
13551360
/reference/operator/aggregation/let
1361+
/reference/operator/aggregation/linearFill
13561362
/reference/operator/aggregation/literal
13571363
/reference/operator/aggregation/ln
13581364
/reference/operator/aggregation/locf
Lines changed: 324 additions & 0 deletions
@@ -0,0 +1,324 @@
1+
=========================
2+
$linearFill (aggregation)
3+
=========================
4+
5+
.. default-domain:: mongodb
6+
7+
.. contents:: On this page
8+
:local:
9+
:backlinks: none
10+
:depth: 1
11+
:class: singlecol
12+
13+
.. |linear-interpolation| replace:: :wikipedia:`linear interpolation <Linear_interpolation>`
14+
15+
Definition
16+
----------
17+
18+
.. group:: $linearFill
19+
20+
.. versionadded:: 5.3
21+
22+
.. include:: /includes/fact-linear-fill-description.rst
23+
24+
:group:`$linearFill` is only available in the
25+
:pipeline:`$setWindowFields` stage.
26+
27+
Syntax
28+
------
29+
30+
The :group:`$linearFill` expression has this syntax:
31+
32+
.. code-block:: none
33+
34+
{ $linearFill: <expression> }
35+
36+
For more information on expressions, see
37+
:ref:`aggregation-expressions`.
38+
39+
Behavior
40+
--------
41+
42+
:group:`$linearFill` fills ``null`` and missing fields using
43+
|linear-interpolation| based on surrounding non-``null`` field values.
44+
The surrounding field values are determined by the sort order specified
45+
in :pipeline:`$setWindowFields`.
46+
47+
- :group:`$linearFill` fills ``null`` and missing values proportionally
48+
spanning the value range between surrounding non-``null`` values. To
49+
determine the values for missing fields, :group:`$linearFill` uses:
50+
51+
- The difference of surrounding non-``null`` values.
52+
53+
- The number of ``null`` fields to fill between the surrounding
54+
values.
55+
56+
- :group:`$linearFill` can fill multiple consecutive ``null`` values if
57+
those values are preceded and followed by non-``null`` values
58+
according to the sort order specified in :pipeline:`$setWindowFields`.
59+
60+
.. example::
61+
62+
If a collection contains these documents:
63+
64+
.. code-block:: javascript
65+
66+
{ index: 0, value: 0 },
67+
{ index: 1, value: null },
68+
{ index: 2, value: null },
69+
{ index: 3, value: null },
70+
{ index: 4, value: 10 }
71+
72+
After using :group:`$linearFill` to fill the ``null`` values, the
73+
documents become:
74+
75+
.. code-block:: javascript
76+
:copyable: false
77+
78+
{ index: 0, value: 0 },
79+
{ index: 1, value: 2.5 },
80+
{ index: 2, value: 5 },
81+
{ index: 3, value: 7.5 },
82+
{ index: 4, value: 10 }
83+
84+
For a complete example, see :ref:`linearFill-example`.
85+
86+
- ``null`` values that are not preceded and followed by non-``null``
87+
values remain ``null``.
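
The fill rules above can be sketched in plain JavaScript, outside of MongoDB. This is only an illustration, not the server's implementation: ``linearFillSketch`` is a hypothetical helper, and it assumes the input array is already in sort order and that the documents are evenly spaced, so each gap is split into equal steps.

```javascript
// Hypothetical helper: illustrates the fill rules, not MongoDB's implementation.
// Assumes `values` is already in sort order and evenly spaced.
function linearFillSketch(values) {
  const result = values.slice();
  let left = -1; // index of the most recent non-null value
  for (let i = 0; i < values.length; i++) {
    if (values[i] === null || values[i] === undefined) continue;
    if (left !== -1 && i - left > 1) {
      // Spread the difference between the surrounding values evenly
      // across the null entries in the gap.
      const step = (values[i] - values[left]) / (i - left);
      for (let j = left + 1; j < i; j++) {
        result[j] = values[left] + step * (j - left);
      }
    }
    left = i;
  }
  return result;
}

console.log(linearFillSketch([0, null, null, null, 10]));
// [ 0, 2.5, 5, 7.5, 10 ]
console.log(linearFillSketch([null, 4, null, 8, null]));
// [ null, 4, 6, 8, null ]
```

The second call shows that leading and trailing ``null`` values are left untouched, because they are not both preceded and followed by a non-``null`` value.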
88+
89+
Comparison of :pipeline:`$fill` and :group:`$linearFill`
90+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
91+
92+
The :pipeline:`$fill` stage with ``{ method: "linear" }`` and the
93+
:group:`$linearFill` operator both fill missing values using
94+
:wikipedia:`linear interpolation <Linear_interpolation>`.
95+
96+
- When you use the :pipeline:`$fill` stage, the field you fill must be
97+
the same as the field you fill from.
98+
99+
- When you use the :group:`$linearFill` operator inside of a
100+
:pipeline:`$setWindowFields` stage, you can set values for a
101+
different field than the field used as the source data. For an
102+
example, see :ref:`linearFill-example-multiple-methods`.
103+
104+
.. _linearFill-example:
105+
106+
Examples
107+
--------
108+
109+
The examples on this page use a ``stock`` collection that
110+
tracks a single company's stock price at hourly intervals:
111+
112+
.. code-block:: javascript
113+
114+
db.stock.insertMany( [
115+
{
116+
time: ISODate("2021-03-08T09:00:00.000Z"),
117+
price: 500
118+
},
119+
{
120+
time: ISODate("2021-03-08T10:00:00.000Z")
121+
},
122+
{
123+
time: ISODate("2021-03-08T11:00:00.000Z"),
124+
price: 515
125+
},
126+
{
127+
time: ISODate("2021-03-08T12:00:00.000Z")
128+
},
129+
{
130+
time: ISODate("2021-03-08T13:00:00.000Z")
131+
},
132+
{
133+
time: ISODate("2021-03-08T14:00:00.000Z"),
134+
price: 485
135+
}
136+
] )
137+
138+
The ``price`` field is missing for some of the documents in the
139+
collection.
140+
141+
Fill Missing Values with Linear Interpolation
142+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
143+
144+
To populate the missing ``price`` values using |linear-interpolation|,
145+
use :group:`$linearFill` inside of a :pipeline:`$setWindowFields` stage:
146+
147+
.. code-block:: javascript
148+
149+
db.stock.aggregate( [
150+
{
151+
$setWindowFields:
152+
{
153+
sortBy: { time: 1 },
154+
output:
155+
{
156+
price: { $linearFill: "$price" }
157+
}
158+
}
159+
}
160+
] )
161+
162+
In the example:
163+
164+
- ``sortBy: { time: 1 }`` sorts the documents by the ``time`` field in
165+
ascending order, from earliest to latest.
166+
167+
- :ref:`output <setWindowFields-output>` specifies:
168+
169+
- ``price`` as the field for which to fill in missing values.
170+
171+
- ``{ $linearFill: "$price" }`` as the value for the missing field.
172+
:group:`$linearFill` fills missing ``price`` values using
173+
|linear-interpolation| based on the surrounding ``price`` values in
174+
the sequence.
175+
176+
Example output:
177+
178+
.. code-block:: javascript
179+
:copyable: false
180+
:emphasize-lines: 10,20,25
181+
182+
[
183+
{
184+
_id: ObjectId("620ad555394d47411658b5ef"),
185+
time: ISODate("2021-03-08T09:00:00.000Z"),
186+
price: 500
187+
},
188+
{
189+
_id: ObjectId("620ad555394d47411658b5f0"),
190+
time: ISODate("2021-03-08T10:00:00.000Z"),
191+
price: 507.5
192+
},
193+
{
194+
_id: ObjectId("620ad555394d47411658b5f1"),
195+
time: ISODate("2021-03-08T11:00:00.000Z"),
196+
price: 515
197+
},
198+
{
199+
_id: ObjectId("620ad555394d47411658b5f2"),
200+
time: ISODate("2021-03-08T12:00:00.000Z"),
201+
price: 505
202+
},
203+
{
204+
_id: ObjectId("620ad555394d47411658b5f3"),
205+
time: ISODate("2021-03-08T13:00:00.000Z"),
206+
price: 495
207+
},
208+
{
209+
_id: ObjectId("620ad555394d47411658b5f4"),
210+
time: ISODate("2021-03-08T14:00:00.000Z"),
211+
price: 485
212+
}
213+
]
214+
215+
.. _linearFill-example-multiple-methods:
216+
217+
Use Multiple Fill Methods in a Single Stage
218+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
219+
220+
When you use the :pipeline:`$setWindowFields` stage to fill missing
221+
values, you can set values for a different field than the field you
222+
fill from. As a result, you can use multiple fill methods in a single
223+
:pipeline:`$setWindowFields` stage and output the results in distinct
224+
fields.
225+
226+
The following pipeline populates missing ``price`` fields using
227+
|linear-interpolation| and the last-observation-carried-forward method:
228+
229+
.. code-block:: javascript
230+
231+
db.stock.aggregate( [
232+
{
233+
$setWindowFields:
234+
{
235+
sortBy: { time: 1 },
236+
output:
237+
{
238+
linearFillPrice: { $linearFill: "$price" },
239+
locfPrice: { $locf: "$price" }
240+
}
241+
}
242+
}
243+
] )
244+
245+
In the example:
246+
247+
- ``sortBy: { time: 1 }`` sorts the documents by the ``time`` field in
248+
ascending order, from earliest to latest.
249+
250+
- :ref:`output <setWindowFields-output>` specifies:
251+
252+
- ``linearFillPrice`` as a target field to be filled.
253+
254+
- ``{ $linearFill: "$price" }`` as the value for the
255+
``linearFillPrice`` field. :group:`$linearFill` fills missing
256+
``price`` values using |linear-interpolation| based on the
257+
surrounding ``price`` values in the sequence.
258+
259+
- ``locfPrice`` as a target field to be filled.
260+
261+
- ``{ $locf: "$price" }`` as the value for the ``locfPrice`` field.
262+
``locf`` stands for last observation carried forward.
263+
:group:`$locf` fills missing ``price`` values with the last
264+
non-``null`` ``price`` value from the preceding documents in the sequence.
265+
266+
Example output:
267+
268+
.. code-block:: javascript
269+
:copyable: false
270+
:emphasize-lines: 12,13,25,26,31,32
271+
272+
[
273+
{
274+
_id: ObjectId("620ad555394d47411658b5ef"),
275+
time: ISODate("2021-03-08T09:00:00.000Z"),
276+
price: 500,
277+
linearFillPrice: 500,
278+
locfPrice: 500
279+
},
280+
{
281+
_id: ObjectId("620ad555394d47411658b5f0"),
282+
time: ISODate("2021-03-08T10:00:00.000Z"),
283+
linearFillPrice: 507.5,
284+
locfPrice: 500
285+
},
286+
{
287+
_id: ObjectId("620ad555394d47411658b5f1"),
288+
time: ISODate("2021-03-08T11:00:00.000Z"),
289+
price: 515,
290+
linearFillPrice: 515,
291+
locfPrice: 515
292+
},
293+
{
294+
_id: ObjectId("620ad555394d47411658b5f2"),
295+
time: ISODate("2021-03-08T12:00:00.000Z"),
296+
linearFillPrice: 505,
297+
locfPrice: 515
298+
},
299+
{
300+
_id: ObjectId("620ad555394d47411658b5f3"),
301+
time: ISODate("2021-03-08T13:00:00.000Z"),
302+
linearFillPrice: 495,
303+
locfPrice: 515
304+
},
305+
{
306+
_id: ObjectId("620ad555394d47411658b5f4"),
307+
time: ISODate("2021-03-08T14:00:00.000Z"),
308+
price: 485,
309+
linearFillPrice: 485,
310+
locfPrice: 485
311+
}
312+
]
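
The ``locfPrice`` column in the output above can be reproduced with a short plain-JavaScript sketch. ``locfSketch`` is a hypothetical helper written for illustration, not MongoDB's implementation, and it assumes the values are already in sort order. In this sketch, a missing value with no earlier observation stays ``null``.

```javascript
// Hypothetical helper: carries the last non-null value forward.
// Assumes `values` is already in sort order.
function locfSketch(values) {
  let last = null;
  return values.map((v) => {
    if (v !== null && v !== undefined) last = v;
    return last;
  });
}

// price values from the stock collection, sorted by time
// (undefined stands in for a missing field)
const prices = [500, undefined, 515, undefined, undefined, 485];
console.log(locfSketch(prices));
// [ 500, 500, 515, 515, 515, 485 ]
```

Unlike the ``linearFillPrice`` values, the carried-forward values do not move toward the next observed price; they repeat the last observed one until a new observation appears.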
313+
314+
315+
Restrictions
316+
------------
317+
318+
- To use :group:`$linearFill`, you must use the :ref:`sortBy
319+
<setWindowFields-sortBy>` field to sort your data.
320+
321+
- When using the :group:`$linearFill` window function,
322+
:pipeline:`$setWindowFields` returns an error if there are any
323+
repeated values in the :ref:`sortBy <setWindowFields-sortBy>` field
324+
in a single :ref:`partition <setWindowFields-partitionBy>`.
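
Because repeated ``sortBy`` values within a partition cause an error, it can be useful to check the data before running the pipeline. The following is a minimal client-side sketch in plain JavaScript. ``findRepeatedSortValues`` and the ``symbol`` partition field are hypothetical names used only for illustration, and the sketch assumes the partition is a single top-level field.

```javascript
// Hypothetical pre-check, not part of MongoDB: returns the documents whose
// sort value already occurred earlier within the same partition.
function findRepeatedSortValues(docs, sortField, partitionField) {
  const seen = new Set();
  const repeated = [];
  for (const doc of docs) {
    const sortValue = doc[sortField];
    // Compare dates by timestamp; other values by their JSON form.
    const sortKey = sortValue instanceof Date ? sortValue.getTime() : sortValue;
    const partition = partitionField ? doc[partitionField] : null;
    const key = JSON.stringify([partition, sortKey]);
    if (seen.has(key)) repeated.push(doc);
    seen.add(key);
  }
  return repeated;
}

const docs = [
  { symbol: "A", time: new Date("2021-03-08T09:00:00Z"), price: 500 },
  { symbol: "A", time: new Date("2021-03-08T09:00:00Z") },
  { symbol: "B", time: new Date("2021-03-08T09:00:00Z"), price: 20 },
];
console.log(findRepeatedSortValues(docs, "time", "symbol").length);
// 1
```

The third document shares its ``time`` value with the first two but belongs to a different partition, so the sketch does not report it.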

source/reference/operator/aggregation/setWindowFields.txt

Lines changed: 2 additions & 0 deletions
@@ -259,6 +259,8 @@ Restrictions for the :pipeline:`$setWindowFields` stage:
259259
<setWindowFields-documents>` window or a :ref:`range
260260
<setWindowFields-range>` window).
261261

262+
- :group:`$linearFill` operator.
263+
262264
- :ref:`Range <setWindowFields-range>` windows require all :ref:`sortBy
263265
<setWindowFields-sortBy>` values to be numbers.
264266
