Commit 9cdd153

author
Andrea Lattuada
committed
Documentation for the currently experimental Storage-viz commands.
1 parent 707ae3e commit 9cdd153

1 file changed

+271
-0
lines changed

draft/core/storage-viz.txt

Lines changed: 271 additions & 0 deletions
@@ -0,0 +1,271 @@
1+
===========
2+
Storage-viz
3+
===========
4+
5+
"Storage-viz" is a suite of tools that can be used to diagnose issues or
6+
assess proposed changes related to storage allocation strategy and index
7+
balancing heuristics.
8+
9+
The ``storageDetails`` command will aggregate statistics related to the
10+
storage layout (when invoked with ``analyze: "diskStorage"``) or the percentage
11+
of pages currently in RAM (when invoked with ``analyze: "pagesInRAM"``) for the
12+
specified collection, extent or part of extent.
13+
14+
The ``indexStats`` command provides detailed and aggregate information and
15+
statistics for the underlying btree of a particular index.
16+
Stats are aggregated for the entire tree, per-depth and, if requested through
17+
the ``expandNodes`` option, per-subtree.
18+
19+
Both commands take a global READ_LOCK and will page in all the extents or btree
20+
buckets encountered: this will have adverse effects on server performance.
21+
The commands should never be run on a primary and will cause a secondary to
22+
fall behind on replication. ``storageDetails`` when run with
23+
``analyze: "pagesInRAM"`` is the exception as it typically returns rapidly and
24+
may only page in extent headers.
25+
26+
.. default-domain:: mongodb
27+
28+
.. dbcommand:: storageDetails
29+
30+
The command can be slow, particularly on larger data sets.
31+
32+
.. code-block:: javascript
33+
34+
{ storageDetails: "collection_name",
35+
analyze: "diskStorage" | "pagesInRAM" }
36+
37+
This command will aggregate statistics related to the storage layout
38+
(when invoked with ``analyze: "diskStorage"``) or the percentage of pages
39+
currently in RAM (when invoked with ``analyze: "pagesInRAM"``) for the
40+
specified collection.
41+
You may also specify one of the following options:
42+
43+
- ``extent: 4`` (0-based) only processes the 5th extent of the collection
44+
45+
- ``range: [start, end]`` only processes the range between ``start`` bytes
46+
and ``end`` bytes from the start of the extent. Requires an ``extent`` to
47+
be specified.
48+
49+
- ``granularity: 1 << 20`` splits the extents into 1 MB (2^20 byte) slices and
50+
reports statistics aggregated per-slice.
51+
52+
- ``numberOfSlices: 100`` splits the extent(s) into 100 slices and
53+
reports statistics aggregated per-slice.
54+
55+
``granularity`` and ``numberOfSlices`` are mutually exclusive.
56+
57+
- ``characteristicField: "dotted.path"`` specifies a field in the
58+
documents of the collection to be inspected and averaged to give
59+
a hint about what kind of documents belong to an extent or slice.
60+
Defaults to ``"_id"``. ObjectIDs, any number and Dates are
61+
supported. If the field has the wrong type in some documents
62+
it is silently ignored.
63+
64+
- ``processDeletedRecords: false`` disables the analysis of deleted
65+
records, which can be slow as it requires iterating over all
66+
the deletedList buckets for each extent. Defaults to ``true``.
67+
68+
- ``showRecords: true`` outputs basic information for each document
69+
and deletedRecord encountered. It should only be enabled for small
70+
ranges on single extents. Produces large output which can exceed
71+
the maximum bson object size.
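The constraints among these options can be checked before a command document is sent. The following is a minimal sketch, not part of the shell: ``buildStorageDetailsCommand`` is a hypothetical helper name, and the checks only encode the rules stated above (``granularity`` and ``numberOfSlices`` are mutually exclusive, ``range`` requires ``extent``).

```javascript
// Hypothetical helper: assembles a storageDetails command document
// and enforces the option rules described in the list above.
function buildStorageDetailsCommand(collectionName, analyze, options) {
  options = options || {};
  if (analyze !== "diskStorage" && analyze !== "pagesInRAM") {
    throw new Error('analyze must be "diskStorage" or "pagesInRAM"');
  }
  // granularity and numberOfSlices are mutually exclusive.
  if (options.granularity !== undefined && options.numberOfSlices !== undefined) {
    throw new Error("granularity and numberOfSlices are mutually exclusive");
  }
  // range is expressed in bytes from the start of an extent,
  // so an extent has to be specified with it.
  if (options.range !== undefined && options.extent === undefined) {
    throw new Error("range requires extent");
  }
  // The command name is inserted first so it stays the first key.
  var cmd = { storageDetails: collectionName, analyze: analyze };
  Object.keys(options).forEach(function (key) {
    cmd[key] = options[key];
  });
  return cmd;
}

var cmd = buildStorageDetailsCommand("collection_name", "diskStorage", {
  extent: 4,            // 0-based: the 5th extent
  range: [0, 1 << 20],  // first 2^20 bytes of that extent
  numberOfSlices: 100
});
```

In the shell, the resulting document could then be passed to ``db.runCommand(cmd)``.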
72+
73+
The typical output, when ``analyze: 'diskStorage'``, has the form:
74+
75+
.. code-block:: javascript
76+
77+
{ extentHeaderBytes: <size>,
78+
recordHeaderBytes: <size>,
79+
range: [startOfs, endOfs], // extent-relative
80+
numEntries: <number of records>,
81+
bsonBytes: <total size of the bson objects>,
82+
recBytes: <total size of the valid records>,
83+
onDiskBytes: <length of the extent or range>,
84+
(opt) characteristicCount: <number of records containing the field used to tell them apart>
85+
(opt) characteristicAvg: <average value of the characteristic field>
86+
outOfOrderRecs: <number of records that follow - in the record linked list -
87+
a record that is located further in the extent>
88+
(opt) freeRecsPerBucket: [ ... ],
89+
90+
The nth element in the ``freeRecsPerBucket`` array is the count of deleted records in the
91+
nth bucket of the deletedList.
92+
``characteristicCount`` and ``characteristicAvg`` are only present if some documents contain
93+
the field specified as ``characteristicField`` and it has a viable type (any number, ObjectID
94+
or Date).
95+
96+
The list of slices follows, with similar information aggregated per-slice:
97+
98+
.. code-block:: javascript
99+
100+
slices: [
101+
{ numEntries: <number of records>,
102+
...
103+
freeRecsPerBucket: [ ... ]
104+
},
105+
...
106+
]
107+
108+
If ``showRecords: true`` was set, two additional fields are added to the outer document:
109+
110+
.. code-block:: javascript
111+
112+
records: [
113+
{ ofs: <record offset from start of extent>,
114+
recBytes: <record size>,
115+
bsonBytes: <bson document size>,
116+
(optional) characteristic: <value of the characteristic field>
117+
},
118+
... (one element per record)
119+
],
120+
(optional) deletedRecords: [
121+
{ ofs: <offset from start of extent>,
122+
recBytes: <deleted record size>
123+
},
124+
... (one element per deleted record)
125+
]
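The top-level ``diskStorage`` fields can be combined into a few derived figures. This is a sketch under assumptions: ``summarizeDiskStorage`` is a hypothetical helper, the sample values are invented for illustration, and the difference ``recBytes - bsonBytes`` is reported only as "record bytes not holding BSON" because the output description does not say how that space is divided.

```javascript
// Hypothetical helper: derives ratios from a diskStorage result
// (or from one element of its slices array, which has similar fields).
function summarizeDiskStorage(stats) {
  // Total deleted records: sum of the per-bucket counts, if present.
  var deleted = (stats.freeRecsPerBucket || []).reduce(function (sum, n) {
    return sum + n;
  }, 0);
  return {
    bsonShare: stats.bsonBytes / stats.onDiskBytes,    // BSON bytes per on-disk byte
    recordShare: stats.recBytes / stats.onDiskBytes,   // valid-record bytes per on-disk byte
    nonBsonRecordBytes: stats.recBytes - stats.bsonBytes,
    deletedRecords: deleted
  };
}

// Invented sample values, not real command output.
var sample = {
  onDiskBytes: 1000,
  recBytes: 800,
  bsonBytes: 600,
  freeRecsPerBucket: [1, 0, 2]
};
var summary = summarizeDiskStorage(sample);
```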
126+
127+
The typical output, when ``analyze: 'pagesInRAM'``, has the form:
128+
129+
{ pageBytes: <system page size>,
130+
onDiskBytes: <size of the extent>,
131+
inMem: <ratio of pages in memory for the entire extent>,
132+
(opt) slices: [ ... ] (only present if either params.granularity or numberOfSlices is not
133+
zero and there is more than one slice for this extent)
134+
(opt) sliceBytes: <size of each slice>
135+
}
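The ``pagesInRAM`` fields can be turned into page counts and approximate byte figures. A minimal sketch, assuming ``inMem`` is a fraction between 0 and 1 (the description above calls it a ratio); ``summarizePagesInRAM`` is a hypothetical helper and the sample values are invented.

```javascript
// Hypothetical helper: converts a pagesInRAM result into page and byte figures.
// Assumes inMem is a fraction in [0, 1].
function summarizePagesInRAM(result) {
  return {
    totalPages: Math.ceil(result.onDiskBytes / result.pageBytes),
    approxBytesInRAM: Math.round(result.inMem * result.onDiskBytes),
    percentInRAM: result.inMem * 100
  };
}

// Invented sample values, not real command output.
var ramSummary = summarizePagesInRAM({
  pageBytes: 4096,
  onDiskBytes: 8192000,
  inMem: 0.25
});
```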
136+
137+
The :program:`mongo` shell also provides wrappers:
138+
139+
.. code-block:: javascript
140+
141+
db.collection.diskStorageStats();
142+
db.collection.pagesInRAM();
143+
144+
db.collection.getDiskStorageStats();
145+
db.collection.getPagesInRAM();
146+
147+
``diskStorageStats`` analyzes storage for the collection
148+
(equivalent to invoking the command with ``{analyze: "diskStorage"}``).
149+
150+
``pagesInRAM`` reports the percentage of pages in RAM for the collection
151+
(equivalent to invoking the command with ``{analyze: "pagesInRAM"}``).
152+
153+
``db.collection.getDiskStorageStats`` and ``db.collection.getPagesInRAM``
154+
take the same parameters as ``diskStorageStats`` and ``pagesInRAM``,
155+
respectively, and provide a human-readable representation of the output.
156+
157+
158+
.. warning:: This command is resource intensive and may have an
159+
impact on the performance of your MongoDB instance. It also requires
160+
the entire collection or extent to be loaded in RAM and it may
161+
end up evicting some of the pages from other collections or extents.
162+
163+
.. read-lock
164+
165+
.. dbcommand:: indexStats
166+
167+
The command can be slow, particularly on large indexes.
168+
169+
.. code-block:: javascript
170+
171+
{ indexStats: "collection_name",
172+
index: "index_name" }
173+
174+
This command provides detailed and aggregate information and statistics for the underlying
175+
btree for the index ``index_name`` in the collection ``collection_name``.
176+
Stats are aggregated for the entire tree, per-depth and, if requested through the ``expandNodes``
177+
option, per-subtree.
178+
179+
You can specify ``expandNodes: [0, 3]`` to expand the root (node 0 at depth 0) and the 4th child
180+
of root (node 3 at depth 1). The first element of the array should always be 0, otherwise no
181+
node will be expanded (there is only the root at depth 0). This will provide basic information about
182+
the expanded nodes and statistics for the subtrees rooted at the nodes themselves.
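The meaning of each ``expandNodes`` element can be spelled out with a small sketch. ``describeExpandNodes`` is a hypothetical helper, not a shell function; it treats a non-zero first element as an error so the mistake is caught early, whereas the command itself would simply expand nothing.

```javascript
// Hypothetical helper: maps an expandNodes array to (depth, childNum) pairs.
// Element i of the array selects which node to expand at depth i.
function describeExpandNodes(expandNodes) {
  if (!Array.isArray(expandNodes) || expandNodes.length === 0) {
    throw new Error("expandNodes must be a non-empty array");
  }
  // The root is the only node at depth 0, so the first element must be 0.
  if (expandNodes[0] !== 0) {
    throw new Error("the first element of expandNodes must be 0");
  }
  return expandNodes.map(function (childNum, depth) {
    var isIndex = typeof childNum === "number" &&
                  Math.floor(childNum) === childNum && childNum >= 0;
    if (!isIndex) {
      throw new Error("expandNodes elements must be non-negative integers");
    }
    return { depth: depth, childNum: childNum };
  });
}

// [0, 3]: the root, then the 4th child of the root.
var expanded = describeExpandNodes([0, 3]);
```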
183+
184+
The typical output has the form:
185+
186+
.. code-block:: javascript
187+
188+
{ name: <index name>,
189+
version: <index version (0 or 1)>,
190+
isIdKey: <true if this is the default _id index>,
191+
keyPattern: <bson object describing the key pattern>,
192+
storageNs: <namespace of the index's underlying storage>,
193+
bucketBodyBytes: <bytes available for keynodes and bson objects in the bucket's body>,
194+
depth: <index depth (root excluded)>
195+
overall: { (statistics for the entire tree)
196+
numBuckets: <number of buckets (samples)>
197+
keyCount: { (stats about the number of keys in a bucket)
198+
count: <number of samples>,
199+
mean: <mean>
200+
(optional) stddev: <standard deviation>
201+
(optional) min: <minimum value (number of keys for the bucket that has the least)>
202+
(optional) max: <maximum value (number of keys for the bucket that has the most)>
203+
(optional) quantiles: {
204+
0.01: <1st percentile>, 0.02: ..., 0.09: ..., 0.25: <1st quartile>,
205+
0.5: <median>, 0.75: <3rd quartile>, 0.91: ..., 0.98: ..., 0.99: ...
206+
}
207+
(optional fields are only present if there are enough samples to compute sensible
208+
estimates)
209+
}
210+
usedKeyCount: <stats about the number of used keys in a bucket>
211+
(same structure as keyCount)
212+
bsonRatio: <stats about how much of the bucket body is occupied by bson objects>
213+
(same structure as keyCount)
214+
keyNodeRatio: <stats about how much of the bucket body is occupied by KeyNodes>
215+
(same structure as keyCount)
216+
fillRatio: <stats about how full the bucket body is (bson objects + KeyNodes)>
217+
(same structure as keyCount)
218+
},
219+
perLevel: [ (statistics aggregated per depth)
220+
(one element with the same structure as 'overall' for each btree level,
221+
the first refers to the root)
222+
]
223+
}
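One way to read ``perLevel`` is to look for depths whose buckets are, on average, sparsely filled. The sketch below assumes only the structure described above (each level has a ``fillRatio`` object with a ``mean``); ``underfilledLevels`` and the threshold are hypothetical, and the sample values are invented.

```javascript
// Hypothetical helper: returns the depths whose mean fill ratio
// is below the given threshold. perLevel[0] refers to the root.
function underfilledLevels(indexStats, threshold) {
  return indexStats.perLevel
    .map(function (level, depth) {
      return { depth: depth, meanFill: level.fillRatio.mean };
    })
    .filter(function (level) {
      return level.meanFill < threshold;
    });
}

// Invented sample values, not real command output.
var sampleIndexStats = {
  perLevel: [
    { fillRatio: { mean: 0.9 } },  // root
    { fillRatio: { mean: 0.4 } },
    { fillRatio: { mean: 0.7 } }
  ]
};
var sparse = underfilledLevels(sampleIndexStats, 0.5);
```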
224+
225+
If ``expandNodes: [array]`` was specified in the parameters, an additional field named
226+
``expandedNodes`` is included in the output. It contains two nested arrays, such that the
227+
n-th element of the outer array contains stats for nodes at depth n (root is included) and
228+
the i-th element (0-based) of the inner array at depth n contains stats for the subtree
229+
rooted at the i-th child of the expanded node at depth (n - 1).
230+
Each element of the inner array has the same structure as ``overall`` in the description above:
231+
it includes the aggregate stats for all the nodes in the subtree excluding the current
232+
bucket.
233+
It also contains an additional field ``nodeInfo`` representing information for the current
234+
node:
235+
236+
.. code-block:: javascript
237+
238+
{ childNum: <i so that this is the (i + 1)-th child of the parent node>
239+
keyCount: <number of keys in this bucket>
240+
usedKeyCount: <number of non-empty KeyNodes>
241+
diskLoc: { (bson representation of the disk location for this bucket)
242+
file: <num>
243+
offset: <bytes>
244+
}
245+
depth: <depth of this bucket, root is at depth 0>
246+
fillRatio: <a value between 0 and 1 representing how full this bucket is>
247+
firstKey: <bson object containing the value for the first key>
248+
lastKey: <bson object containing the value for the last key>
249+
}
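The nested ``expandedNodes`` arrays can be flattened into one row per node for easier reading. This is a sketch that assumes only the ``nodeInfo`` fields listed above; ``listExpandedNodes`` is a hypothetical helper and the sample values are invented.

```javascript
// Hypothetical helper: flattens expandedNodes (outer array by depth,
// inner array by child) into one summary row per node.
function listExpandedNodes(expandedNodes) {
  var rows = [];
  expandedNodes.forEach(function (nodesAtDepth) {
    nodesAtDepth.forEach(function (entry) {
      var info = entry.nodeInfo;
      rows.push({
        depth: info.depth,
        childNum: info.childNum,
        keyCount: info.keyCount,
        // KeyNodes counted in keyCount but not in usedKeyCount.
        unusedKeys: info.keyCount - info.usedKeyCount,
        fillRatio: info.fillRatio
      });
    });
  });
  return rows;
}

// Invented sample values, not real command output.
var rows = listExpandedNodes([
  [ { nodeInfo: { childNum: 0, keyCount: 10, usedKeyCount: 10, depth: 0, fillRatio: 0.2 } } ],
  [ { nodeInfo: { childNum: 0, keyCount: 50, usedKeyCount: 45, depth: 1, fillRatio: 0.8 } },
    { nodeInfo: { childNum: 1, keyCount: 40, usedKeyCount: 40, depth: 1, fillRatio: 0.6 } } ]
]);
```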
250+
251+
The :program:`mongo` shell also provides wrappers:
252+
253+
.. code-block:: javascript
254+
255+
db.collection.indexStats({index: "index_name"});
256+
db.collection.getIndexStats({index: "index_name"});
257+
258+
``db.collection.indexStats({index: "index_name"})`` is equivalent to running the command
259+
with ``{indexStats: "collection", index: "index_name"}``.
260+
261+
``db.collection.getIndexStats`` takes the same parameters as ``indexStats`` and provides
262+
a human-readable summary of the output in the shell.
263+
264+
.. warning:: This command is resource intensive and may have an
265+
impact on the performance of your MongoDB instance. It also requires
266+
the btree buckets of the index to be loaded in RAM and it may
267+
end up evicting some of the pages from other collections or extents.
268+
269+
.. read-lock
270+
271+
