Commit cf9c7e6

DOCSP-36988 Bulk Write (#15)
1 parent ace08f7 commit cf9c7e6

1 file changed

+138
-127
lines changed

Original file line number | Diff line number | Diff line change
@@ -1,16 +1,25 @@
11
.. uses bulk.rst
22

3+
.. _pymongo-bulk-write:
4+
5+
=====================
36
Bulk Write Operations
47
=====================
58

6-
.. code-block:: python
9+
.. contents:: On this page
10+
:local:
11+
:backlinks: none
12+
:depth: 2
13+
:class: singlecol
714

8-
from pymongo import MongoClient
15+
.. facet::
16+
:name: genre
17+
:values: reference
918

10-
client = MongoClient()
11-
client.drop_database("bulk_example")
19+
.. meta::
20+
:keywords: insert, update, replace, code example
1221

13-
This tutorial explains how to take advantage of PyMongo's bulk
22+
This guide explains how to take advantage of {+driver-short+}'s bulk
1423
write operation features. Executing write operations in batches
1524
reduces the number of network round trips, increasing write
1625
throughput.
@@ -20,165 +29,167 @@ Bulk Insert
2029

2130
.. versionadded:: 2.6
2231

23-
A batch of documents can be inserted by passing a list to the
24-
the ``~pymongo.collection.Collection.insert_many`` method method. PyMongo
25-
will automatically split the batch into smaller sub-batches based on
26-
the maximum message size accepted by MongoDB, supporting very large
27-
bulk insert operations.
32+
You can insert a batch of documents by passing a list to the
33+
``~pymongo.collection.Collection.insert_many`` method. {+driver-short+}
34+
supports large bulk insert operations by splitting the batch into smaller
35+
sub-batches based on the maximum message size accepted by MongoDB.
36+
37+
The following example bulk inserts 10000 documents into a collection:
2838

2939
.. code-block:: python
3040

31-
>>> import pymongo
32-
>>> db = pymongo.MongoClient().bulk_example
33-
>>> db.test.insert_many([{"i": i} for i in range(10000)]).inserted_ids
34-
[...]
35-
>>> db.test.count_documents({})
36-
10000
41+
>>> import pymongo
42+
>>> db = pymongo.MongoClient().bulk_example
43+
>>> db.test.insert_many([{"i": i} for i in range(10000)]).inserted_ids
44+
[...]
45+
>>> db.test.count_documents({})
46+
10000
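
Annotation: the guide says the driver splits a large insert into sub-batches based on the maximum message size. The sketch below illustrates that idea only and is not the driver's actual implementation. The function name ``split_into_sub_batches``, the ``max_bytes`` parameter, and the ``len(repr(doc))`` size estimate are all assumptions made for illustration.

```python
from typing import Callable, Iterable, Iterator, List


def split_into_sub_batches(
    docs: Iterable[dict],
    max_bytes: int,
    size_of: Callable[[dict], int] = lambda doc: len(repr(doc)),
) -> Iterator[List[dict]]:
    """Yield lists of documents whose estimated total size fits max_bytes.

    Hypothetical helper: size_of is a stand-in for a real encoded-size
    measurement, which this sketch does not attempt.
    """
    batch: List[dict] = []
    batch_bytes = 0
    for doc in docs:
        doc_bytes = size_of(doc)
        # Start a new sub-batch when adding this document would exceed the
        # limit. A single oversized document still gets a batch of its own.
        if batch and batch_bytes + doc_bytes > max_bytes:
            yield batch
            batch, batch_bytes = [], 0
        batch.append(doc)
        batch_bytes += doc_bytes
    if batch:
        yield batch


docs = [{"i": i} for i in range(10)]
batch_lengths = [len(b) for b in split_into_sub_batches(docs, max_bytes=24)]
print(batch_lengths)  # [3, 3, 3, 1]
```

Each sub-batch preserves the original document order, so concatenating the batches gives back the input list.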
3747

3848
Mixed Bulk Write Operations
3949
---------------------------
4050

4151
.. versionadded:: 2.7
4252

43-
PyMongo also supports executing mixed bulk write operations. A batch
44-
of insert, update, and remove operations can be executed together using
45-
the bulk write operations API.
53+
{+driver-short+} supports executing mixed bulk write operations. You can run
54+
a batch of insert, update, and remove operations together by using the bulk
55+
write operations API.
4656

4757
.. _ordered_bulk:
4858

4959
Ordered Bulk Write Operations
50-
.............................
60+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
5161

52-
Ordered bulk write operations are batched and sent to the server in the
62+
{+driver-short+} batches and sends ordered bulk write operations to the server in the
5363
order provided for serial execution. The return value is an instance of
54-
``~pymongo.results.BulkWriteResult`` describing the type and count
64+
``~pymongo.results.BulkWriteResult``, which describes the type and count
5565
of operations performed.
5666

5767
.. code-block:: python
5868

59-
>>> from pprint import pprint
60-
>>> from pymongo import InsertOne, DeleteMany, ReplaceOne, UpdateOne
61-
>>> result = db.test.bulk_write(
62-
... [
63-
... DeleteMany({}), # Remove all documents from the previous example.
64-
... InsertOne({"_id": 1}),
65-
... InsertOne({"_id": 2}),
66-
... InsertOne({"_id": 3}),
67-
... UpdateOne({"_id": 1}, {"$set": {"foo": "bar"}}),
68-
... UpdateOne({"_id": 4}, {"$inc": {"j": 1}}, upsert=True),
69-
... ReplaceOne({"j": 1}, {"j": 2}),
70-
... ]
71-
... )
72-
>>> pprint(result.bulk_api_result)
73-
{'nInserted': 3,
74-
'nMatched': 2,
75-
'nModified': 2,
76-
'nRemoved': 10000,
77-
'nUpserted': 1,
78-
'upserted': [{'_id': 4, 'index': 5}],
79-
'writeConcernErrors': [],
80-
'writeErrors': []}
81-
82-
The first write failure that occurs (e.g. duplicate key error) aborts the
83-
remaining operations, and PyMongo raises
84-
``~pymongo.errors.BulkWriteError``. The ``details`` attribute of
69+
>>> from pprint import pprint
70+
>>> from pymongo import InsertOne, DeleteMany, ReplaceOne, UpdateOne
71+
>>> result = db.test.bulk_write(
72+
... [
73+
... DeleteMany({}), # Remove all documents from the previous example.
74+
... InsertOne({"_id": 1}),
75+
... InsertOne({"_id": 2}),
76+
... InsertOne({"_id": 3}),
77+
... UpdateOne({"_id": 1}, {"$set": {"foo": "bar"}}),
78+
... UpdateOne({"_id": 4}, {"$inc": {"j": 1}}, upsert=True),
79+
... ReplaceOne({"j": 1}, {"j": 2}),
80+
... ]
81+
... )
82+
>>> pprint(result.bulk_api_result)
83+
{'nInserted': 3,
84+
'nMatched': 2,
85+
'nModified': 2,
86+
'nRemoved': 10000,
87+
'nUpserted': 1,
88+
'upserted': [{'_id': 4, 'index': 5}],
89+
'writeConcernErrors': [],
90+
'writeErrors': []}
91+
92+
The first write failure that occurs, such as a duplicate key error, aborts the
93+
remaining operations and raises a ``~pymongo.errors.BulkWriteError``. The ``details`` attribute of
8594
the exception instance provides the execution results up until the failure
86-
occurred and details about the failure - including the operation that caused
95+
occurred, and details about the failure, including the operation that caused
8796
the failure.
8897

98+
The following example shows a bulk write operation that raises a duplicate key error:
99+
89100
.. code-block:: python
90101

91-
>>> from pymongo import InsertOne, DeleteOne, ReplaceOne
92-
>>> from pymongo.errors import BulkWriteError
93-
>>> requests = [
94-
... ReplaceOne({"j": 2}, {"i": 5}),
95-
... InsertOne({"_id": 4}), # Violates the unique key constraint on _id.
96-
... DeleteOne({"i": 5}),
97-
... ]
98-
>>> try:
99-
... db.test.bulk_write(requests)
100-
... except BulkWriteError as bwe:
101-
... pprint(bwe.details)
102-
...
103-
{'nInserted': 0,
104-
'nMatched': 1,
105-
'nModified': 1,
106-
'nRemoved': 0,
107-
'nUpserted': 0,
108-
'upserted': [],
109-
'writeConcernErrors': [],
110-
'writeErrors': [{'code': 11000,
111-
'errmsg': '...E11000...duplicate key error...',
112-
'index': 1,...
113-
'op': {'_id': 4}}]}
102+
>>> from pymongo import InsertOne, DeleteOne, ReplaceOne
103+
>>> from pymongo.errors import BulkWriteError
104+
>>> requests = [
105+
... ReplaceOne({"j": 2}, {"i": 5}),
106+
... InsertOne({"_id": 4}), # Violates the unique key constraint on _id.
107+
... DeleteOne({"i": 5}),
108+
... ]
109+
>>> try:
110+
... db.test.bulk_write(requests)
111+
... except BulkWriteError as bwe:
112+
... pprint(bwe.details)
113+
...
114+
{'nInserted': 0,
115+
'nMatched': 1,
116+
'nModified': 1,
117+
'nRemoved': 0,
118+
'nUpserted': 0,
119+
'upserted': [],
120+
'writeConcernErrors': [],
121+
'writeErrors': [{'code': 11000,
122+
'errmsg': '...E11000...duplicate key error...',
123+
'index': 1,...
124+
'op': {'_id': 4}}]}
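
Annotation: the ``details`` dictionary shown above can be inspected with ordinary Python. The sketch below assumes a dictionary shaped like that output and uses a hypothetical helper, ``summarize_ordered_failure``, to report which request failed and which later requests were skipped because ordered execution stops at the first write error. It runs without a MongoDB connection because it works on a hand-written copy of the dictionary.

```python
def summarize_ordered_failure(details: dict, total_requests: int) -> dict:
    """Summarize the first write error of an ordered bulk write.

    Hypothetical helper: assumes details has the shape printed by
    pprint(bwe.details) in the example above.
    """
    first_error = details["writeErrors"][0]
    failed_index = first_error["index"]
    return {
        "failed_index": failed_index,
        "error_code": first_error["code"],
        # In ordered mode, requests after the failing one are not attempted.
        "not_attempted": list(range(failed_index + 1, total_requests)),
    }


# Hand-written copy of the relevant fields from the example output.
sample_details = {
    "nInserted": 0,
    "nMatched": 1,
    "nModified": 1,
    "nRemoved": 0,
    "nUpserted": 0,
    "upserted": [],
    "writeConcernErrors": [],
    "writeErrors": [{"code": 11000, "index": 1, "op": {"_id": 4}}],
}

summary = summarize_ordered_failure(sample_details, total_requests=3)
print(summary)
```

For the three-request example, the summary identifies request 1 (the ``InsertOne``) as the failure and request 2 (the ``DeleteOne``) as not attempted.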
114125

115126
.. _unordered_bulk:
116127

117128
Unordered Bulk Write Operations
118-
...............................
129+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
119130

120-
Unordered bulk write operations are batched and sent to the server in
121-
**arbitrary order** where they may be executed in parallel. Any errors
122-
that occur are reported after all operations are attempted.
131+
{+driver-short+} batches and sends unordered bulk write operations to the server in
132+
arbitrary order, which means they might run in parallel. The driver reports
133+
errors only after it has attempted every operation.
123134

124-
In the next example the first and third operations fail due to the unique
125-
constraint on _id. Since we are doing unordered execution the second
135+
In the following example, the first and third operations raise an error because of the unique
136+
constraint on ``_id``. Because the operation is unordered, only the second
126137
and fourth operations succeed.
127138

128139
.. code-block:: python
129140

130-
>>> requests = [
131-
... InsertOne({"_id": 1}),
132-
... DeleteOne({"_id": 2}),
133-
... InsertOne({"_id": 3}),
134-
... ReplaceOne({"_id": 4}, {"i": 1}),
135-
... ]
136-
>>> try:
137-
... db.test.bulk_write(requests, ordered=False)
138-
... except BulkWriteError as bwe:
139-
... pprint(bwe.details)
140-
...
141-
{'nInserted': 0,
142-
'nMatched': 1,
143-
'nModified': 1,
144-
'nRemoved': 1,
145-
'nUpserted': 0,
146-
'upserted': [],
147-
'writeConcernErrors': [],
148-
'writeErrors': [{'code': 11000,
149-
'errmsg': '...E11000...duplicate key error...',
150-
'index': 0,...
151-
'op': {'_id': 1}},
152-
{'code': 11000,
153-
'errmsg': '...',
154-
'index': 2,...
155-
'op': {'_id': 3}}]}
141+
>>> requests = [
142+
... InsertOne({"_id": 1}),
143+
... DeleteOne({"_id": 2}),
144+
... InsertOne({"_id": 3}),
145+
... ReplaceOne({"_id": 4}, {"i": 1}),
146+
... ]
147+
>>> try:
148+
... db.test.bulk_write(requests, ordered=False)
149+
... except BulkWriteError as bwe:
150+
... pprint(bwe.details)
151+
...
152+
{'nInserted': 0,
153+
'nMatched': 1,
154+
'nModified': 1,
155+
'nRemoved': 1,
156+
'nUpserted': 0,
157+
'upserted': [],
158+
'writeConcernErrors': [],
159+
'writeErrors': [{'code': 11000,
160+
'errmsg': '...E11000...duplicate key error...',
161+
'index': 0,...
162+
'op': {'_id': 1}},
163+
{'code': 11000,
164+
'errmsg': '...',
165+
'index': 2,...
166+
'op': {'_id': 3}}]}
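
Annotation: because an unordered bulk write attempts every request, ``writeErrors`` can hold several entries. The sketch below assumes a ``details`` dictionary shaped like the output above and uses a hypothetical helper, ``partition_unordered_results``, to separate the request indexes that reported a write error from those that did not.

```python
from typing import List, Tuple


def partition_unordered_results(
    details: dict, total_requests: int
) -> Tuple[List[int], List[int]]:
    """Return (failed_indexes, other_indexes) for an unordered bulk write.

    Hypothetical helper: relies only on the 'index' field of each entry
    in details['writeErrors'].
    """
    failed = sorted(err["index"] for err in details["writeErrors"])
    failed_set = set(failed)
    others = [i for i in range(total_requests) if i not in failed_set]
    return failed, others


# Hand-written copy of the relevant fields from the example output.
sample_details = {
    "writeErrors": [
        {"code": 11000, "index": 0, "op": {"_id": 1}},
        {"code": 11000, "index": 2, "op": {"_id": 3}},
    ],
    "writeConcernErrors": [],
}

failed, others = partition_unordered_results(sample_details, total_requests=4)
print(failed, others)  # [0, 2] [1, 3]
```

The result matches the prose above: the first and third requests fail, and the second and fourth run without a write error.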
156167

157168
Write Concern
158-
.............
169+
-------------
159170

160-
Bulk operations are executed with the
161-
``~pymongo.collection.Collection.write_concern`` of the collection they
162-
are executed against. Write concern errors (e.g. wtimeout) will be reported
163-
after all operations are attempted, regardless of execution order.
171+
When {+driver-short+} runs a bulk operation, it uses the ``write_concern`` of the
172+
collection in which the operation is running. The
173+
driver reports all write concern errors, such as ``wtimeout``,
174+
after attempting all of the operations, regardless of execution order.
164175

165176
.. code-block:: python
166177

167-
>>> from pymongo import WriteConcern
168-
>>> coll = db.get_collection(
169-
... 'test', write_concern=WriteConcern(w=3, wtimeout=1))
170-
>>> try:
171-
... coll.bulk_write([InsertOne({'a': i}) for i in range(4)])
172-
... except BulkWriteError as bwe:
173-
... pprint(bwe.details)
174-
...
175-
{'nInserted': 4,
176-
'nMatched': 0,
177-
'nModified': 0,
178-
'nRemoved': 0,
179-
'nUpserted': 0,
180-
'upserted': [],
181-
'writeConcernErrors': [{'code': 64...
182-
'errInfo': {'wtimeout': True},
183-
'errmsg': 'waiting for replication timed out'}],
184-
'writeErrors': []}
178+
>>> from pymongo import WriteConcern
179+
>>> coll = db.get_collection(
180+
... 'test', write_concern=WriteConcern(w=3, wtimeout=1))
181+
>>> try:
182+
... coll.bulk_write([InsertOne({'a': i}) for i in range(4)])
183+
... except BulkWriteError as bwe:
184+
... pprint(bwe.details)
185+
...
186+
{'nInserted': 4,
187+
'nMatched': 0,
188+
'nModified': 0,
189+
'nRemoved': 0,
190+
'nUpserted': 0,
191+
'upserted': [],
192+
'writeConcernErrors': [{'code': 64...
193+
'errInfo': {'wtimeout': True},
194+
'errmsg': 'waiting for replication timed out'}],
195+
'writeErrors': []}
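
Annotation: in the output above, ``nInserted`` is 4 even though an exception was raised, because the failure is a write concern error rather than a write error. The sketch below assumes a ``details`` dictionary of the same shape and uses a hypothetical helper, ``classify_bulk_failure``, to tell the two kinds of failure apart before deciding how to react.

```python
def classify_bulk_failure(details: dict) -> str:
    """Describe which error lists in a bulk write details dict are non-empty.

    Hypothetical helper: reads only the 'writeErrors' and
    'writeConcernErrors' keys shown in the example output.
    """
    has_write_errors = bool(details.get("writeErrors"))
    has_concern_errors = bool(details.get("writeConcernErrors"))
    if has_write_errors and has_concern_errors:
        return "write errors and write concern errors"
    if has_write_errors:
        return "write errors only"
    if has_concern_errors:
        return "write concern errors only"
    return "no errors recorded"


# Hand-written copy of the relevant fields from the example output.
sample_details = {
    "nInserted": 4,
    "writeConcernErrors": [
        {
            "code": 64,
            "errInfo": {"wtimeout": True},
            "errmsg": "waiting for replication timed out",
        }
    ],
    "writeErrors": [],
}

print(classify_bulk_failure(sample_details))  # write concern errors only
```

A result of "write concern errors only" alongside a non-zero ``nInserted`` suggests the documents were written but the requested acknowledgment level was not confirmed in time.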
