@@ -6,6 +6,12 @@ Supported Partition Attribute Types
 
 .. default-domain:: mongodb
 
+.. contents:: On this page
+   :local:
+   :backlinks: none
+   :depth: 2
+   :class: singlecol
+
 The following table lists the supported data types for partition attributes and
 an example :datalakeconf:`~databases.[n].collections.[n].dataSources.[n].path`
 for each data type:
@@ -35,6 +41,10 @@ for each data type:
        In the above ``path`` examples, ``phone`` is interpreted
        as a string.
 
+       .. seealso::
+
+          :ref:`parse-null-values`
+
    * - ``int``
      - Parses the filename as an integer.
      - filename: ``/zipcodes/90210.json``
@@ -44,6 +54,10 @@ for each data type:
        In the above example, ``zipcode`` is interpreted
        as an integer.
 
+       .. seealso::
+
+          :ref:`parse-padded-numeric-values`
+
    * - ``isodate``
      - Parses the filename in `RFC 3339 <https://tools.ietf.org/html/rfc3339>`_
        format as an ISO-8601 format date.
@@ -89,6 +103,10 @@ for each data type:
        In the above example, ``startTimestamp`` is interpreted
        as a Unix timestamp in seconds.
 
+       .. seealso::
+
+          :ref:`parse-padded-numeric-values`
+
    * - ``epoch_millis``
      - Parses the filename as a Unix timestamp in milliseconds.
      - filename: ``/metrics/1549046112000.json``
@@ -98,6 +116,10 @@ for each data type:
        In the above example, ``startTimestamp`` is interpreted
        as a Unix timestamp in milliseconds.
 
+       .. seealso::
+
+          :ref:`parse-padded-numeric-values`
+
    * - ``objectid``
      - Parses the filename as an
        :manual:`ObjectId </reference/method/ObjectId/>`.
@@ -123,7 +145,9 @@ for each data type:
        {+adl+} supports the `Package Syntax
        <https://golang.org/pkg/regexp/syntax/>`__ for regular expressions
        in the path to the filename.
-
+
+.. _parse-null-values:
+
 Parsing Null Values from Filenames
 ----------------------------------
 
@@ -142,3 +166,31 @@ attribute types except ``string``. For example, consider the following |s3|
 For the path ``/records/{month string}/*``, {+dl+} does not add any
 computed fields for the ``month`` attribute to documents generated
 from the third record in the above store.
+
+.. _parse-padded-numeric-values:
+
+Parsing Padded Numbers from Filenames
+-------------------------------------
+
+For attribute types like ``int``, ``epoch_millis``, and ``epoch_secs``,
+if you want {+dl+} to correctly parse numeric values that are padded
+with leading zeros in the path to the file, specify the number
+of digits in the padded value using regular expressions. For example,
+consider a |s3| store with the following files:
+
+.. code-block:: text
+   :copyable: false
+
+   |--users
+      |--001.json
+      |--002.json
+      ...
+
+The following ``path`` syntax uses a regular expression with the
+``int`` attribute type to specify the number of digits in the
+filename:
+
+.. code-block:: sh
+   :copyable: false
+
+   /users/{user_id int:\\d{3}}
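
The padded-number rule added above can be illustrated with a small sketch. It is not how the service parses paths: it uses Python's ``re`` module in place of the Go regular expression package, the helper name ``parse_padded_int`` is hypothetical, and the ``.json`` suffix in the pattern is an assumption taken from the example filenames. It only shows that a fixed-width ``\d{3}`` segment captures a zero-padded value that then converts to an integer.

```python
import re

# Hypothetical stand-in for the path template /users/{user_id int:\d{3}}.
# The ".json" suffix is an assumption based on the example filenames.
PADDED_ID = re.compile(r"^/users/(?P<user_id>\d{3})\.json$")


def parse_padded_int(path):
    """Return the user_id segment as an int, or None if the path does not match."""
    match = PADDED_ID.match(path)
    if match is None:
        return None
    # int() accepts leading zeros in a decimal string, so "001" becomes 1.
    return int(match.group("user_id"))


print(parse_padded_int("/users/001.json"))
print(parse_padded_int("/users/002.json"))
print(parse_padded_int("/users/12.json"))
```

In this sketch the first two calls yield the integers 1 and 2, while the third yields ``None`` because ``12`` has only two digits and the pattern requires exactly three.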