1
1
.. uses bulk.rst
2
2
3
+ .. _pymongo-bulk-write:
4
+
5
+ =====================
3
6
Bulk Write Operations
4
7
=====================
5
8
6
- .. code-block:: python
9
+ .. contents:: On this page
10
+ :local:
11
+ :backlinks: none
12
+ :depth: 2
13
+ :class: singlecol
7
14
8
- from pymongo import MongoClient
15
+ .. facet::
16
+ :name: genre
17
+ :values: reference
9
18
10
- client = MongoClient()
11
- client.drop_database("bulk_example")
19
+ .. meta::
20
+ :keywords: insert, update, replace, code example
12
21
13
- This tutorial explains how to take advantage of PyMongo's bulk
22
+ This guide explains how to take advantage of {+driver-short+}'s bulk
14
23
write operation features. Executing write operations in batches
15
24
reduces the number of network round trips, increasing write
16
25
throughput.
@@ -20,165 +29,167 @@ Bulk Insert
20
29
21
30
.. versionadded:: 2.6
22
31
23
- A batch of documents can be inserted by passing a list to the
24
- the ``~pymongo.collection.Collection.insert_many`` method method. PyMongo
25
- will automatically split the batch into smaller sub-batches based on
26
- the maximum message size accepted by MongoDB, supporting very large
27
- bulk insert operations.
32
+ You can insert a batch of documents by passing a list to the
33
+ ``~pymongo.collection.Collection.insert_many`` method. {+driver-short+}
34
+ supports large bulk insert operations by splitting the batch into smaller
35
+ sub-batches based on the maximum message size accepted by MongoDB.
36
+
37
+ The following example bulk inserts 10000 documents into a collection:
28
38
29
39
.. code-block:: python
30
40
31
- >>> import pymongo
32
- >>> db = pymongo.MongoClient().bulk_example
33
- >>> db.test.insert_many([{"i": i} for i in range(10000)]).inserted_ids
34
- [...]
35
- >>> db.test.count_documents({})
36
- 10000
41
+ >>> import pymongo
42
+ >>> db = pymongo.MongoClient().bulk_example
43
+ >>> db.test.insert_many([{"i": i} for i in range(10000)]).inserted_ids
44
+ [...]
45
+ >>> db.test.count_documents({})
46
+ 10000
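
The sub-batch splitting described above can be pictured with a minimal, driver-independent sketch. It is not the driver's actual implementation: the helper name ``split_into_batches``, the 16 KiB limit, and the use of JSON length as a stand-in for encoded document size are all assumptions made for illustration.

```python
import json


def split_into_batches(documents, max_batch_bytes):
    """Yield lists of documents whose combined encoded size fits the limit.

    Hypothetical helper: JSON length stands in for the encoded size that a
    real driver would measure before sending a message.
    """
    batch, batch_bytes = [], 0
    for doc in documents:
        doc_bytes = len(json.dumps(doc).encode("utf-8"))
        if doc_bytes > max_batch_bytes:
            raise ValueError("a single document exceeds the batch limit")
        # Start a new sub-batch when adding this document would overflow.
        if batch and batch_bytes + doc_bytes > max_batch_bytes:
            yield batch
            batch, batch_bytes = [], 0
        batch.append(doc)
        batch_bytes += doc_bytes
    if batch:
        yield batch


docs = [{"i": i} for i in range(10000)]
batches = list(split_into_batches(docs, max_batch_bytes=16 * 1024))
print(len(batches), sum(len(b) for b in batches))
```

Every document lands in exactly one sub-batch and the original order is preserved, so the caller still sees a single logical insert.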
37
47
38
48
Mixed Bulk Write Operations
39
49
---------------------------
40
50
41
51
.. versionadded:: 2.7
42
52
43
- PyMongo also supports executing mixed bulk write operations. A batch
44
- of insert, update, and remove operations can be executed together using
45
- the bulk write operations API.
53
+ {+driver-short+} supports executing mixed bulk write operations. You can run
54
+ a batch of insert, update, and remove operations together by using the bulk
55
+ write operations API.
46
56
47
57
.. _ordered_bulk:
48
58
49
59
Ordered Bulk Write Operations
50
- .............................
60
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
51
61
52
- Ordered bulk write operations are batched and sent to the server in the
62
+ {+driver-short+} batches and sends ordered bulk write operations to the server in the
53
63
order provided for serial execution. The return value is an instance of
54
- ``~pymongo.results.BulkWriteResult`` describing the type and count
64
+ ``~pymongo.results.BulkWriteResult``, which describes the type and count
55
65
of operations performed.
56
66
57
67
.. code-block:: python
58
68
59
- >>> from pprint import pprint
60
- >>> from pymongo import InsertOne, DeleteMany, ReplaceOne, UpdateOne
61
- >>> result = db.test.bulk_write(
62
- ... [
63
- ... DeleteMany({}), # Remove all documents from the previous example.
64
- ... InsertOne({"_id": 1}),
65
- ... InsertOne({"_id": 2}),
66
- ... InsertOne({"_id": 3}),
67
- ... UpdateOne({"_id": 1}, {"$set": {"foo": "bar"}}),
68
- ... UpdateOne({"_id": 4}, {"$inc": {"j": 1}}, upsert=True),
69
- ... ReplaceOne({"j": 1}, {"j": 2}),
70
- ... ]
71
- ... )
72
- >>> pprint(result.bulk_api_result)
73
- {'nInserted': 3,
74
- 'nMatched': 2,
75
- 'nModified': 2,
76
- 'nRemoved': 10000,
77
- 'nUpserted': 1,
78
- 'upserted': [{'_id': 4, 'index': 5}],
79
- 'writeConcernErrors': [],
80
- 'writeErrors': []}
81
-
82
- The first write failure that occurs (e.g. duplicate key error) aborts the
83
- remaining operations, and PyMongo raises
84
- ``~pymongo.errors.BulkWriteError``. The ``details`` attribute of
69
+ >>> from pprint import pprint
70
+ >>> from pymongo import InsertOne, DeleteMany, ReplaceOne, UpdateOne
71
+ >>> result = db.test.bulk_write(
72
+ ... [
73
+ ... DeleteMany({}), # Remove all documents from the previous example.
74
+ ... InsertOne({"_id": 1}),
75
+ ... InsertOne({"_id": 2}),
76
+ ... InsertOne({"_id": 3}),
77
+ ... UpdateOne({"_id": 1}, {"$set": {"foo": "bar"}}),
78
+ ... UpdateOne({"_id": 4}, {"$inc": {"j": 1}}, upsert=True),
79
+ ... ReplaceOne({"j": 1}, {"j": 2}),
80
+ ... ]
81
+ ... )
82
+ >>> pprint(result.bulk_api_result)
83
+ {'nInserted': 3,
84
+ 'nMatched': 2,
85
+ 'nModified': 2,
86
+ 'nRemoved': 10000,
87
+ 'nUpserted': 1,
88
+ 'upserted': [{'_id': 4, 'index': 5}],
89
+ 'writeConcernErrors': [],
90
+ 'writeErrors': []}
91
+
92
+ The first write failure that occurs, such as a duplicate key error, aborts the
93
+ remaining operations and raises a ``~pymongo.errors.BulkWriteError``. The ``details`` attribute of
85
94
the exception instance provides the execution results up until the failure
86
- occurred and details about the failure - including the operation that caused
95
+ occurred, and details about the failure, including the operation that caused
87
96
the failure.
88
97
98
+ The following example shows a bulk write operation that raises a duplicate key error:
99
+
89
100
.. code-block:: python
90
101
91
- >>> from pymongo import InsertOne, DeleteOne, ReplaceOne
92
- >>> from pymongo.errors import BulkWriteError
93
- >>> requests = [
94
- ... ReplaceOne({"j": 2}, {"i": 5}),
95
- ... InsertOne({"_id": 4}), # Violates the unique key constraint on _id.
96
- ... DeleteOne({"i": 5}),
97
- ... ]
98
- >>> try:
99
- ... db.test.bulk_write(requests)
100
- ... except BulkWriteError as bwe:
101
- ... pprint(bwe.details)
102
- ...
103
- {'nInserted': 0,
104
- 'nMatched': 1,
105
- 'nModified': 1,
106
- 'nRemoved': 0,
107
- 'nUpserted': 0,
108
- 'upserted': [],
109
- 'writeConcernErrors': [],
110
- 'writeErrors': [{'code': 11000,
111
- 'errmsg': '...E11000...duplicate key error...',
112
- 'index': 1,...
113
- 'op': {'_id': 4}}]}
102
+ >>> from pymongo import InsertOne, DeleteOne, ReplaceOne
103
+ >>> from pymongo.errors import BulkWriteError
104
+ >>> requests = [
105
+ ... ReplaceOne({"j": 2}, {"i": 5}),
106
+ ... InsertOne({"_id": 4}), # Violates the unique key constraint on _id.
107
+ ... DeleteOne({"i": 5}),
108
+ ... ]
109
+ >>> try:
110
+ ... db.test.bulk_write(requests)
111
+ ... except BulkWriteError as bwe:
112
+ ... pprint(bwe.details)
113
+ ...
114
+ {'nInserted': 0,
115
+ 'nMatched': 1,
116
+ 'nModified': 1,
117
+ 'nRemoved': 0,
118
+ 'nUpserted': 0,
119
+ 'upserted': [],
120
+ 'writeConcernErrors': [],
121
+ 'writeErrors': [{'code': 11000,
122
+ 'errmsg': '...E11000...duplicate key error...',
123
+ 'index': 1,...
124
+ 'op': {'_id': 4}}]}
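
Because ``details`` is a plain dictionary, the failing requests can be pulled out with ordinary Python. The following sketch assumes only the keys visible in the output above. The helper name ``summarize_write_errors`` is hypothetical, and the sample dictionary is hand-written to mirror that output instead of being captured from a live server.

```python
def summarize_write_errors(details):
    """Return (index, code, op) for each entry in details['writeErrors']."""
    return [
        (error["index"], error["code"], error["op"])
        for error in details.get("writeErrors", [])
    ]


# Hand-written sample shaped like the ``bwe.details`` output shown above.
# Real output may carry additional keys.
sample_details = {
    "nInserted": 0,
    "nMatched": 1,
    "nModified": 1,
    "nRemoved": 0,
    "nUpserted": 0,
    "upserted": [],
    "writeConcernErrors": [],
    "writeErrors": [
        {
            "code": 11000,
            "errmsg": "...E11000...duplicate key error...",
            "index": 1,
            "op": {"_id": 4},
        }
    ],
}

for index, code, op in summarize_write_errors(sample_details):
    print(f"request {index} failed with code {code}: {op}")
```

In an ``except BulkWriteError as bwe`` block, the same helper could be called with ``bwe.details``.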
114
125
115
126
.. _unordered_bulk:
116
127
117
128
Unordered Bulk Write Operations
118
- ...............................
129
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
119
130
120
- Unordered bulk write operations are batched and sent to the server in
121
- **arbitrary order** where they may be executed in parallel. Any errors
122
- that occur are reported after all operations are attempted.
131
+ {+driver-short+} batches and sends unordered bulk write operations to the server in
132
+ arbitrary order, which means they might run in parallel. The driver reports
133
+ any errors that occur after attempting all operations.
123
134
124
- In the next example the first and third operations fail due to the unique
125
- constraint on _id. Since we are doing unordered execution the second
135
+ In the following example, the first and third operations raise an error because of the unique
136
+ constraint on ``_id``. Because the operation is unordered, only the second
126
137
and fourth operations succeed.
127
138
128
139
.. code-block:: python
129
140
130
- >>> requests = [
131
- ... InsertOne({"_id": 1}),
132
- ... DeleteOne({"_id": 2}),
133
- ... InsertOne({"_id": 3}),
134
- ... ReplaceOne({"_id": 4}, {"i": 1}),
135
- ... ]
136
- >>> try:
137
- ... db.test.bulk_write(requests, ordered=False)
138
- ... except BulkWriteError as bwe:
139
- ... pprint(bwe.details)
140
- ...
141
- {'nInserted': 0,
142
- 'nMatched': 1,
143
- 'nModified': 1,
144
- 'nRemoved': 1,
145
- 'nUpserted': 0,
146
- 'upserted': [],
147
- 'writeConcernErrors': [],
148
- 'writeErrors': [{'code': 11000,
149
- 'errmsg': '...E11000...duplicate key error...',
150
- 'index': 0,...
151
- 'op': {'_id': 1}},
152
- {'code': 11000,
153
- 'errmsg': '...',
154
- 'index': 2,...
155
- 'op': {'_id': 3}}]}
141
+ >>> requests = [
142
+ ... InsertOne({"_id": 1}),
143
+ ... DeleteOne({"_id": 2}),
144
+ ... InsertOne({"_id": 3}),
145
+ ... ReplaceOne({"_id": 4}, {"i": 1}),
146
+ ... ]
147
+ >>> try:
148
+ ... db.test.bulk_write(requests, ordered=False)
149
+ ... except BulkWriteError as bwe:
150
+ ... pprint(bwe.details)
151
+ ...
152
+ {'nInserted': 0,
153
+ 'nMatched': 1,
154
+ 'nModified': 1,
155
+ 'nRemoved': 1,
156
+ 'nUpserted': 0,
157
+ 'upserted': [],
158
+ 'writeConcernErrors': [],
159
+ 'writeErrors': [{'code': 11000,
160
+ 'errmsg': '...E11000...duplicate key error...',
161
+ 'index': 0,...
162
+ 'op': {'_id': 1}},
163
+ {'code': 11000,
164
+ 'errmsg': '...',
165
+ 'index': 2,...
166
+ 'op': {'_id': 3}}]}
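
The difference between ordered and unordered execution can be illustrated with a toy in-memory model that needs no server. This is only a sketch of the semantics described above, not how the driver or server is implemented: ``run_bulk`` and the ``(kind, _id)`` request tuples are hypothetical. The loop also visits requests sequentially, whereas a real unordered bulk write might run them in any order.

```python
def run_bulk(store, requests, ordered=True):
    """Apply (kind, _id) requests to a set of ids.

    Returns the indexes of the requests that failed. Toy model only.
    """
    failed = []
    for index, (kind, _id) in enumerate(requests):
        if kind == "insert":
            if _id in store:
                failed.append(index)  # Duplicate key.
                if ordered:
                    break  # Ordered: abort the remaining requests.
                continue  # Unordered: keep attempting the rest.
            store.add(_id)
        elif kind == "delete":
            store.discard(_id)
        else:
            raise ValueError(f"unknown request kind: {kind}")
    return failed


requests = [("insert", 1), ("delete", 2), ("insert", 3), ("insert", 5)]

ordered_store = {1, 2, 3, 4}
ordered_failed = run_bulk(ordered_store, requests, ordered=True)

unordered_store = {1, 2, 3, 4}
unordered_failed = run_bulk(unordered_store, requests, ordered=False)

print(ordered_failed, sorted(ordered_store))
print(unordered_failed, sorted(unordered_store))
```

In the ordered run, the first duplicate stops everything, so the store is unchanged. In the unordered run, both duplicate inserts are reported while the delete and the final insert still take effect.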
156
167
157
168
Write Concern
158
- .............
169
+ -------------
159
170
160
- Bulk operations are executed with the
161
- ``~pymongo.collection.Collection.write_concern`` of the collection they
162
- are executed against. Write concern errors (e.g. wtimeout) will be reported
163
- after all operations are attempted, regardless of execution order.
171
+ When {+driver-short+} runs a bulk operation, it uses the ``write_concern`` of the
172
+ collection in which the operation is running. The
173
+ driver reports all write concern errors, such as ``wtimeout``,
174
+ after attempting all of the operations, regardless of execution order.
164
175
165
176
.. code-block:: python
166
177
167
- >>> from pymongo import WriteConcern
168
- >>> coll = db.get_collection(
169
- ... 'test', write_concern=WriteConcern(w=3, wtimeout=1))
170
- >>> try:
171
- ... coll.bulk_write([InsertOne({'a': i}) for i in range(4)])
172
- ... except BulkWriteError as bwe:
173
- ... pprint(bwe.details)
174
- ...
175
- {'nInserted': 4,
176
- 'nMatched': 0,
177
- 'nModified': 0,
178
- 'nRemoved': 0,
179
- 'nUpserted': 0,
180
- 'upserted': [],
181
- 'writeConcernErrors': [{'code': 64...
182
- 'errInfo': {'wtimeout': True},
183
- 'errmsg': 'waiting for replication timed out'}],
184
- 'writeErrors': []}
178
+ >>> from pymongo import WriteConcern
179
+ >>> coll = db.get_collection(
180
+ ... 'test', write_concern=WriteConcern(w=3, wtimeout=1))
181
+ >>> try:
182
+ ... coll.bulk_write([InsertOne({'a': i}) for i in range(4)])
183
+ ... except BulkWriteError as bwe:
184
+ ... pprint(bwe.details)
185
+ ...
186
+ {'nInserted': 4,
187
+ 'nMatched': 0,
188
+ 'nModified': 0,
189
+ 'nRemoved': 0,
190
+ 'nUpserted': 0,
191
+ 'upserted': [],
192
+ 'writeConcernErrors': [{'code': 64...
193
+ 'errInfo': {'wtimeout': True},
194
+ 'errmsg': 'waiting for replication timed out'}],
195
+ 'writeErrors': []}