
Commit 8654dd9

Andrew Leung committed: added in SSD info
1 parent ff9b0b9 commit 8654dd9


1 file changed: +158 -10 lines changed


draft/administration/production-notes.txt

Lines changed: 158 additions & 10 deletions
@@ -194,17 +194,15 @@ To turn off NUMA for MongoDB, use the ``numactl`` command and start

   numactl --interleave=all /usr/bin/local/mongod

-.. TODO is /usr/bin/local/mongod the default install location?
-
Adjust the ``proc`` settings using the following command:

.. code-block:: bash

   echo 0 > /proc/sys/vm/zone_reclaim_mode

You can change ``zone_reclaim_mode`` without restarting mongod. For
-more information on this setting see:
-`http://www.kernel.org/doc/Documentation/sysctl/vm.txt`_.
+more information, see documentation on `Proc/sys/vm
+<http://www.kernel.org/doc/Documentation/sysctl/vm.txt>`_.
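
To confirm the setting took effect, read the value back; a minimal
check, assuming a Linux ``/proc`` filesystem:

.. code-block:: bash

   # Read the current value back; 0 means zone reclaim is disabled.
   cat /proc/sys/vm/zone_reclaim_mode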

.. TODO the following is needed? or is just generally good reading material?

@@ -288,7 +286,7 @@ If readahead is too high OR too low it can cause excessive page
faulting and increased disk utilization.

The Best Readahead Value
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+~~~~~~~~~~~~~~~~~~~~~~~~

What the right value of readahead is depends on your storage device,
the size of your documents, and your access patterns.
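
To experiment with different values, you can inspect and change
readahead per block device with ``blockdev``; a sketch, assuming the
data volume is ``/dev/sda`` and that a 32-sector (16KB) setting suits
your documents:

.. code-block:: bash

   # Show the current readahead value, in 512-byte sectors.
   sudo blockdev --getra /dev/sda

   # Set readahead to 32 sectors (16KB); tune this for your access patterns.
   sudo blockdev --setra 32 /dev/sda
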
@@ -399,9 +397,160 @@ suggestions:
RAM may be more effective.

Solid State Disks
-~~~~~~~~~~~~~~~~~
+-----------------
+
+Multiple MongoDB users have reported good success running MongoDB
+databases on solid state drives.
+
+Write Endurance
+~~~~~~~~~~~~~~~
+
+Write endurance with solid state drives varies. SLC drives have higher
+endurance, but newer generation MLC (and eMLC) drives are getting
+better.
+
+As an example, the MLC Intel 320 drives specify an endurance of
+20GB/day of writes for five years. If you are doing small or medium
+size random reads and writes, this is sufficient. The Intel 710 series
+is the enterprise-class model and has higher endurance.
+
+If you intend to write a full drive's worth of data per day (and every
+day for a long time), this level of endurance would be insufficient.
+For large sequential operations (for example, very large map/reduces),
+one could write far more than 20GB/day. Traditional hard drives are
+quite good at sequential I/O and thus may be better for that use case.
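
As a rough worked example (the drive size here is hypothetical), a
20GB/day specification over five years implies a total write budget of
about 36.5TB, while rewriting a 300GB drive once a day for the same
period writes about fifteen times that budget:

.. code-block:: bash

   # Total writes allowed by a 20GB/day-for-5-years endurance spec, in GB.
   echo $((20 * 365 * 5))    # 36500 GB, about 36.5TB

   # Writes generated by rewriting a hypothetical 300GB drive daily for 5 years.
   echo $((300 * 365 * 5))   # 547500 GB, about 547TB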
+
+.. seealso:: `SSD lifespan <http://maxschireson.com/2011/04/21/debunking-ssd-lifespan-and-random-write-performance-concerns/>`_
+
+Reserve some unpartitioned space
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Some users report good results when leaving 20% of their drives
+completely unpartitioned. In this situation the drive knows it can use
+that space as working space. Note that formatted but empty space may
+or may not be available to the drive, depending on TRIM support, which
+is often lacking.
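
To check whether a drive advertises TRIM support, one option is
``hdparm``; a sketch, assuming the drive is ``/dev/sda`` and ``hdparm``
is installed:

.. code-block:: bash

   # Look for "Data Set Management TRIM supported" in the identification output.
   sudo hdparm -I /dev/sda | grep -i trim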
+
+smartctl
+~~~~~~~~
+
+On some devices, ``smartctl -A`` will show you the
+Media_Wearout_Indicator.
+
+.. code-block:: bash
+
+   $ sudo smartctl -A /dev/sda | grep Wearout
+   233 Media_Wearout_Indicator 0x0032 099 099 000 Old_age Always - 0
+
+Speed
+~~~~~
+
+A `paper <http://portal.acm.org/citation.cfm?id=1837922>`_ in ACM
+Transactions on Storage (September 2010) listed the following results
+for measured 4KB peak random direct IO for some popular devices:
+
+.. list-table:: SSD Read and Write Performance
+   :header-rows: 1
+
+   * - Device
+     - Read IOPS
+     - Write IOPS
+   * - Intel X25-E
+     - 33,400
+     - 3,120
+   * - FusionIO ioDrive
+     - 98,800
+     - 75,100
+
+Intel's larger drives seem to have higher write IOPS than the smaller
+ones (up to 23,000 claimed for the 320 series).
+
+Real-world results should be lower, but the numbers are still
+impressive.
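
To get a comparable number for your own hardware, you can run a 4KB
random read test with a generic tool such as ``fio``; a sketch only,
where the file path, size, queue depth, and runtime are assumptions to
adjust:

.. code-block:: bash

   # 4KB random reads with direct IO against a 1GB scratch file.
   sudo fio --name=randread-4k --filename=/data/fio-testfile --size=1G \
            --rw=randread --bs=4k --direct=1 --ioengine=libaio \
            --iodepth=32 --runtime=30 --time_based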
+
+Reliability
+~~~~~~~~~~~
+
+Some manufacturers specify reliability stats indicating failure rates
+of approximately 0.6% per year. This is better than traditional drives
+(2% per year failure rate or higher), but still quite high, and thus
+mirroring will be important. (And of course manufacturer specs could
+be optimistic.)
+
+Random reads vs. random writes
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Random access I/O is the sweet spot for SSDs. Historically, random
+reads on SSD drives have been much faster than random writes. That
+said, random writes are still an order of magnitude faster than
+spinning disks.
+
+Recently, new drives have been released with much higher random write
+performance. For example, the Intel 320 series, particularly the
+larger capacity drives, has much higher random write performance than
+the older Intel X25 series drives.
+
+PCI vs. SATA
+~~~~~~~~~~~~
+
+SSDs are available both as PCI cards and SATA drives. PCI is oriented
+towards the high end of the products on the market.
+
+Some SATA SSD drives now support 6Gbps SATA transfer rates, yet at the
+time of this writing many controllers shipped with servers are
+3Gbps. For random IO oriented applications this is likely sufficient,
+but worth considering regardless.
+
+RAM vs. SSD
+~~~~~~~~~~~
+
+Even though SSDs are fast, RAM is still faster. Thus for the highest
+performance possible, having enough RAM to contain the working set of
+data from the database is optimal. However, it is common to have a
+request rate that is easily met by the speed of random I/O on SSDs,
+and SSD cost per byte is lower than RAM (and persistent too).
+
+A system with less RAM and SSDs will likely outperform a system with
+more RAM and spinning disks. For example, a system with SSD drives and
+64GB RAM will often outperform a system with 128GB RAM and spinning
+disks. (Results will vary by use case of course.)

+.. TODO this is basically a 'soft page fault'

+One helpful characteristic of SSDs is that they can facilitate fast
+"preheat" of RAM on a hardware restart. On a restart, a system's RAM
+file system cache must be repopulated. On a box with 64GB RAM or more,
+this can take a considerable amount of time; for example, reading 64GB
+at 100MB/sec takes over ten minutes, and much longer when the requests
+are random IO to spinning disks.
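
A simple way to preheat is to read the database files sequentially
before sending traffic to the node; a sketch, assuming the default
``/data/db`` dbpath:

.. code-block:: bash

   # Pull each data file through the OS file system cache once.
   # Assumes the default dbpath of /data/db; adjust for your deployment.
   for f in /data/db/*; do
       [ -f "$f" ] && dd if="$f" of=/dev/null bs=8M
   done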
+
+FlashCache
+~~~~~~~~~~
+
+FlashCache is a write-back block cache for Linux. It was created by
+Facebook. Installation is a bit of work, as you have to build and
+install a kernel module. As of September 2011, if you use this, please
+report results in the MongoDB forum, as it is new and everyone will be
+curious how well it works.
+
+http://www.facebook.com/note.php?note_id=388112370932
+
+OS scheduler
+~~~~~~~~~~~~
+
+One user reports good results with the noop IO scheduler under certain
+configurations of their system. As always, caution is recommended with
+nonstandard configurations, as such configurations never get as much
+testing.
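
To see which scheduler a device is using, and to switch it for
testing; a sketch, assuming the device is ``sda`` (the change does not
persist across reboots):

.. code-block:: bash

   # The scheduler currently in use appears in square brackets.
   cat /sys/block/sda/queue/scheduler

   # Switch this device to the noop scheduler; takes effect immediately.
   echo noop | sudo tee /sys/block/sda/queue/scheduler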
+
+Run mongoperf
+~~~~~~~~~~~~~
+
+mongoperf is a disk performance stress utility. It is not part of the
+mongo database, simply a disk exercising program. We recommend testing
+your SSD setup with mongoperf. Note that the random writes it performs
+are a worst case scenario, and in many cases MongoDB can do writes
+that are much larger.
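
A minimal invocation looks roughly like this; mongoperf reads a JSON
options document on stdin, and the thread count and file size below
are arbitrary choices:

.. code-block:: bash

   # Exercise the disk with random reads and writes against a 10GB test file.
   echo "{ nThreads: 16, fileSizeMB: 10000, r: true, w: true }" | mongoperf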

Redundant Array of Independent Disks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -420,9 +569,8 @@ See also the ec2 page for comments on EBS striping.
Remote File Systems
~~~~~~~~~~~~~~~~~~~

-We have found that some versions of NFS perform very poorly and do not
-recommend using NFS. See the NFS page for more information.
-.. TODO link to NFS page
-
Amazon elastic block store (EBS) seems to work well up to its
intrinsic performance characteristics, when configured well.
+
+We have found that some versions of NFS perform very poorly and do not
+recommend using NFS with MongoDB.
