Commit cb605f1

DOCSP-18193 Add a Wildcard Configuration section (#191)
* DOCSP-18193 Add a Wildcard Configuration section
* Apply suggestions from code review

  Co-authored-by: Melissa Mahoney <[email protected]>
* DOCSP-18193 updates for review feedback
* DOCSP-18193 minor correction

Co-authored-by: Melissa Mahoney <[email protected]>
1 parent a66fd0d commit cb605f1

1 file changed: +221 -26 lines changed

source/config/config-data-lake.txt

Lines changed: 221 additions & 26 deletions
@@ -23,17 +23,8 @@ format, see :ref:`datalake-configuration-file`.
2323

2424
You can retrieve and update the {+data-lake-short+} configuration by
2525
:ref:`connecting <gst-connect-adl>` a :binary:`~bin.mongo` shell to the
26-
{+dl+}. You can also update your {+dl+} from the |service| UI:
27-
28-
1. From the |service| UI, select :guilabel:`Data Lake` from the
29-
left-hand navigation.
30-
31-
#. Click :guilabel:`Configuration` for the {+data-lake-short+} that you
32-
want to update.
33-
34-
#. Make necessary changes to the :ref:`storage configuration
35-
<datalake-configuration-file>` and click :guilabel:`Save` for the
36-
changes to take effect.
26+
{+dl+}. You can also update your {+dl+} from the |service| UI. See
27+
:ref:`datalake-setstorageconfig` for more information.
3728

3829
.. note::
3930

@@ -64,30 +55,37 @@ Set or Update {+data-lake-short+} Configuration
6455
-----------------------------------------------
6556

6657
Once connected to the {+data-lake-short+}, you can use the following
67-
database commands to set or update the {+data-lake-short+} configuration:
58+
database commands to set or update the {+data-lake-short+}
59+
configuration:
6860

6961
.. code-block:: javascript
7062

7163
use admin
7264
db.runCommand( { "storageSetConfig" : <config> } )
7365

74-
Replace ``<config>`` with the {+data-lake-short+} configuration. For complete
75-
documentation on the configuration fields and format, see
76-
:ref:`datalake-configuration-format`. You can validate your :ref:`configuration
77-
<datalake-configuration-file>` before setting or updating the
78-
{+data-lake-short+} configuration by running the :ref:`storageValidateConfig
79-
<datalake-validatestorageconfig>` command.
66+
Replace ``<config>`` with the {+data-lake-short+} configuration. For
67+
complete documentation on the configuration fields and format, see
68+
:ref:`datalake-configuration-format`. You can validate your
69+
:ref:`configuration <datalake-configuration-file>` before setting or
70+
updating the {+dl+} configuration by running the
71+
:ref:`storageValidateConfig <datalake-validatestorageconfig>` command.
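The validate-then-set flow described above can be sketched as a small helper. This is a minimal sketch, not part of the documented commands: ``validateThenSetConfig`` is a hypothetical name, ``adminDb`` stands for any object that exposes ``runCommand()`` (such as the shell's ``db`` after ``use admin``), and the check assumes that a successful response carries an ``ok`` field equal to ``1``.

```javascript
// Hypothetical helper: validate a storage configuration, then apply it.
// `adminDb` is any object exposing runCommand(), for example the shell's
// `db` object after running `use admin`.
function validateThenSetConfig(adminDb, config) {
  const validation = adminDb.runCommand({ storageValidateConfig: config });

  // Assumption: a valid configuration yields a response with ok equal to 1.
  if (!validation || validation.ok !== 1) {
    return { applied: false, response: validation };
  }

  // Only send storageSetConfig once validation has succeeded.
  const result = adminDb.runCommand({ storageSetConfig: config });
  return { applied: Boolean(result && result.ok === 1), response: result };
}
```

If validation fails, the helper returns the validation response without sending ``storageSetConfig``, so the existing configuration is left untouched.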
8072

8173
To set or update the storage configuration through the |service| UI:
8274

83-
1. Click :guilabel:`Configuration` for your {+dl+} to view the
84-
{+dl+} storage configuration.
75+
1. From the |service| UI, select :guilabel:`Data Lake` from the
76+
left-hand navigation.
77+
78+
#. Click :guilabel:`Configuration` for the {+data-lake-short+} that you
79+
want to update.
8580

8681
.. figure:: /images/set-update-config-ui.png
8782
:figwidth: 600px
8883
:alt: Image highlighting the Configuration button.
8984

90-
2. Make changes to your storage configuration and click :guilabel:`Save`.
85+
#. Make any necessary changes to the :ref:`storage configuration
86+
<datalake-configuration-file>`.
87+
88+
#. Click :guilabel:`Save` for the changes to take effect.
9189

9290
.. _datalake-validatestorageconfig:
9391

@@ -102,8 +100,8 @@ You can run the following command to validate your {+data-lake-short+}
102100
use admin
103101
db.runCommand( { "storageValidateConfig" : <config> } )
104102

105-
Replace ``<config>`` with the {+data-lake-short+} configuration. For complete
106-
documentation on the configuration fields and format, see
103+
Replace ``<config>`` with the {+data-lake-short+} configuration. For
104+
complete documentation on the configuration fields and format, see
107105
:ref:`datalake-configuration-format`.
108106

109107
The command returns the following if your {+dl+} configuration is valid:
@@ -159,15 +157,212 @@ configuration.
159157
</security-add-mongodb-users/#atlasAdmin>` role has the
160158
``storageSetConfig`` privilege by default.
161159

162-
To generate a {+data-lake-short+} configuration, connect to the {+data-lake-short+}
163-
and run the following database commands:
160+
To generate a {+data-lake-short+} configuration, connect to the
161+
{+dl+} and run the following database commands:
164162

165163
.. code-block:: javascript
166164

167165
use admin
168166
db.runCommand( { "storageGenerateConfig" : 1 } )
169167

170-
For complete documentation on the configuration fields and format, see :ref:`datalake-configuration-format`.
168+
For complete documentation on the configuration fields and format, see
169+
:ref:`datalake-configuration-format`.
170+
171+
.. _generate-wildcard-collections:
172+
173+
Generate Wildcard Collections
174+
-----------------------------
175+
176+
You can dynamically generate collection names that map to data in your
177+
|s3| bucket or |service| cluster. To dynamically generate collection
178+
names, specify the wildcard, ``*``, as the value for the collection
179+
name setting in your {+dl+} storage configuration. You can't
180+
dynamically generate collection names in your {+dl+} storage
181+
configuration that map to data in your |http| or |https| data store.
182+
183+
You can use the :ref:`storageSetConfig <datalake-setstorageconfig>`
184+
command to configure the settings for generating wildcard (``*``)
185+
collections.
186+
187+
To learn more about the configuration settings for generating wildcard
188+
collections, click the tab for your data store:
189+
190+
.. tabs::
191+
192+
.. tab:: S3
193+
:tabid: s3
194+
195+
To generate wildcard collections in your {+dl+} storage
196+
configuration that map to data in your |s3| bucket, configure the
197+
following settings in your {+dl+} storage configuration:
198+
199+
- Specify ``*`` as the value for the
200+
:datalakeconf:`databases.[n].collections.name` setting.
201+
202+
- Specify the ``collectionName()`` function as the value for
203+
the :datalakeconf:`databases.[n].collections.[n].dataSources.[n].path`
204+
setting.
205+
206+
- *Optional*. Specify the maximum number of collections to
207+
include in the database in the
208+
:datalakeconf:`databases.[n].maxWildcardCollections` setting.
209+
By default, {+adl+} generates up to ``100`` wildcard
210+
collections in the database.
211+
212+
.. example::
213+
214+
.. code-block:: json
215+
:copyable: false
216+
:emphasize-lines: 6
217+
218+
"databases" : [
219+
{
220+
"name" : "<db-name>",
221+
"collections" : [
222+
{
223+
"name" : "*",
224+
"dataSources" : [
225+
{
226+
"storeName" : "<s3-store-name>",
227+
"path" : "{collectionName()}"
228+
}
229+
]
230+
}
231+
],
232+
"maxWildcardCollections" : <integer>
233+
}
234+
]
235+
236+
You can also use the :ref:`dl-create-collection-views-cmd`
237+
administration command and the {+adl+} User Interface |json|
238+
Editor to configure the settings for generating wildcard
239+
collections. You can't use the {+adl+} User Interface Visual
240+
Editor to configure the settings for generating wildcard
241+
collections.
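As an illustration of the |s3| settings above, the following sketch builds a single ``databases`` entry in the shell before you pass the full configuration to ``storageSetConfig``. The function name ``s3WildcardDatabase`` and the sample values ``sales``, ``s3store``, and ``50`` are placeholders chosen for this example, not names from the product.

```javascript
// Hypothetical helper: build one "databases" entry that generates wildcard
// collections from an S3 store. Field names follow the example above.
function s3WildcardDatabase(dbName, storeName, maxWildcardCollections) {
  const database = {
    name: dbName,
    collections: [
      {
        name: "*", // wildcard collection name
        dataSources: [{ storeName: storeName, path: "{collectionName()}" }],
      },
    ],
  };

  // maxWildcardCollections is optional; leave it out to keep the default.
  if (maxWildcardCollections !== undefined) {
    database.maxWildcardCollections = maxWildcardCollections;
  }
  return database;
}

// Placeholder values for illustration only.
const entry = s3WildcardDatabase("sales", "s3store", 50);
console.log(JSON.stringify(entry, null, 2));
```

Building the entry as an object and serializing it with ``JSON.stringify`` avoids hand-editing mistakes such as stray commas.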
242+
243+
.. tab:: Atlas
244+
:tabid: atlas
245+
246+
For the |service| data store, you can generate the following
247+
wildcard collections and databases in your {+dl+} storage configuration:
248+
249+
- Wildcard collections for a specific database
250+
251+
- Wildcard databases with one wildcard collection
252+
253+
You can also dynamically generate collection names that match
254+
a regex pattern.
255+
256+
.. tabs::
257+
258+
.. tab:: Wildcard Collections
259+
:tabid: wildcardColls
260+
261+
To generate wildcard collections in your {+dl+} storage
262+
configuration that map to data in your |service| cluster,
263+
configure the following settings in your {+dl+} storage
264+
configuration:
265+
266+
- Specify ``*`` as the value for the
267+
:datalakeconf:`databases.[n].collections.name` setting.
268+
269+
- Omit the :datalakeconf:`databases.[n].collections.[n].dataSources.[n].collection`
270+
setting.
271+
272+
- *Optional*. Use the :datalakeconf:`databases.[n].collections.[n].dataSources.[n].collectionRegex`
273+
setting to generate wildcard collection names that match
274+
a regex pattern.
275+
276+
.. example::
277+
278+
.. code-block:: json
279+
:copyable: false
280+
:emphasize-lines: 6
281+
282+
"databases" : [
283+
{
284+
"name" : "<db-name>",
285+
"collections" : [
286+
{
287+
"name" : "*",
288+
"dataSources" : [
289+
{
290+
"storeName" : "<atlas-store-name>",
291+
"database" : "<atlas-db-name>",
292+
"collectionRegex" : "<regex-pattern>"
293+
}
294+
]
295+
}
296+
]
297+
}
298+
]
299+
300+
You can also use the :ref:`dl-create-collection-views-cmd`
301+
administration command and the {+adl+} User Interface to
302+
configure the settings for generating wildcard collections.
303+
304+
.. tab:: Wildcard Databases
305+
:tabid: wildcardDbs
306+
307+
To dynamically generate databases with one wildcard
308+
collection in your {+dl+} storage configuration, configure
309+
the following settings in your {+dl+} storage configuration:
310+
311+
- Specify ``*`` as the value for the
312+
:datalakeconf:`databases.[n].name` setting.
313+
314+
- Specify ``*`` as the value for the
315+
:datalakeconf:`databases.[n].collections.name` setting.
316+
317+
- Omit the :datalakeconf:`databases.[n].collections.[n].dataSources.[n].database`
318+
and :datalakeconf:`databases.[n].collections.[n].dataSources.[n].collection` settings.
319+
320+
- *Optional*. Use the :datalakeconf:`databases.[n].collections.[n].dataSources.[n].collectionRegex`
321+
setting to generate wildcard collection names that match
322+
a regex pattern.
323+
324+
.. example::
325+
326+
.. code-block:: json
327+
:copyable: false
328+
:emphasize-lines: 3,6
329+
330+
"databases" : [
331+
{
332+
"name" : "*",
333+
"collections" : [
334+
{
335+
"name" : "*",
336+
"dataSources" : [
337+
{
338+
"storeName" : "<atlas-store-name>",
339+
"collectionRegex" : "<regex-pattern>"
340+
}
341+
]
342+
}
343+
]
344+
}
345+
]
346+
347+
You can also use the :ref:`dl-create-collection-views-cmd`
348+
administration command to configure the settings for
349+
generating wildcard collections for wildcard databases. You
350+
can't use the {+adl+} User Interface to configure the
351+
settings for generating wildcard collections for wildcard
352+
databases.
353+
354+
Dynamically generated databases:
355+
356+
- Can exist alongside explicitly defined databases.
357+
However, {+adl+} won't include dynamically generated
358+
databases with names that conflict with databases that
359+
are explicitly defined in the storage configuration.
360+
- Can only be from a single |service| cluster. {+adl+}
361+
won't dynamically generate databases from multiple
362+
|service| clusters or other data stores.
363+
364+
To learn more about the configuration settings, see
365+
:ref:`datalake-configuration-file`.
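The two |service| variants described above differ only in which fields you omit. The following sketch makes that explicit. It is an illustration under stated assumptions: ``atlasWildcardDatabase`` and its option names are hypothetical, and the function only assembles the ``databases`` entry without contacting any cluster.

```javascript
// Hypothetical helper: build one "databases" entry for an Atlas data store.
// - With `database` set: wildcard collections for that Atlas database.
// - Without `database`: a wildcard database with one wildcard collection.
function atlasWildcardDatabase(options) {
  const { storeName, name, database, collectionRegex } = options;
  const dataSource = { storeName: storeName };

  if (database !== undefined) {
    if (name === undefined) {
      throw new Error("name is required when database is set");
    }
    dataSource.database = database;
  }
  // collectionRegex is optional in both variants.
  if (collectionRegex !== undefined) {
    dataSource.collectionRegex = collectionRegex;
  }

  return {
    // A wildcard database uses "*" as its name.
    name: database !== undefined ? name : "*",
    // The dataSources "collection" field is omitted in both variants.
    collections: [{ name: "*", dataSources: [dataSource] }],
  };
}
```

For example, ``atlasWildcardDatabase({ storeName: "atlasStore" })`` produces the wildcard database form, where ``atlasStore`` is a placeholder store name.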
171366

172367
.. toctree::
173368
:titlesonly:
