@@ -194,17 +194,15 @@ To turn off NUMA for MongoDB, use the ``numactl`` command and start

   numactl --interleave=all /usr/bin/local/mongod

- .. TODO is /usr/bin/local/mongod the default install location?
-
Adjust the ``proc`` settings using the following command:

.. code-block:: bash

   echo 0 > /proc/sys/vm/zone_reclaim_mode

You can change ``zone_reclaim_mode`` without restarting mongod. For
- more information on this setting see:
- ` http://www.kernel.org/doc/Documentation/sysctl/vm.txt`_.
+ more information, see the kernel documentation for `/proc/sys/vm
+ <http://www.kernel.org/doc/Documentation/sysctl/vm.txt>`_.
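+
+ If you want to confirm the change took effect, and optionally persist
+ it across reboots, a minimal sketch (assuming a standard
+ ``sysctl``-based configuration) is:
+
+ .. code-block:: bash
+
+    # Confirm the current value; 0 disables zone reclaim.
+    cat /proc/sys/vm/zone_reclaim_mode
+
+    # Optionally make the setting persistent across reboots by adding
+    # it to /etc/sysctl.conf.
+    echo "vm.zone_reclaim_mode = 0" | sudo tee -a /etc/sysctl.conf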

.. TODO the following is needed? or is just generally good reading material?

@@ -288,7 +286,7 @@ If readahead is too high OR too low it can cause excessive page
faulting and increased disk utilization.

The Best Readahead Value
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ ~~~~~~~~~~~~~~~~~~~~~~~~

The right readahead value depends on your storage device, the size of
your documents, and your access patterns.
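+
+ One way to inspect and experiment with readahead is with
+ ``blockdev``. This is only a sketch: ``/dev/sda`` is an example
+ device, and the value of 32 sectors (16KB) is a starting point to
+ test against your own workload, not a universal recommendation.
+
+ .. code-block:: bash
+
+    # Show the current readahead setting, in 512-byte sectors.
+    sudo blockdev --getra /dev/sda
+
+    # Try a smaller readahead, for example 32 sectors (16KB), and re-test.
+    sudo blockdev --setra 32 /dev/sda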
@@ -399,9 +397,160 @@ suggestions:
RAM may be more effective.

Solid State Disks
- ~~~~~~~~~~~~~~~~~
+ -----------------
+
+ Multiple MongoDB users have reported good success running MongoDB
+ databases on solid state drives.
+
+ Write Endurance
+ ~~~~~~~~~~~~~~~
+
+ Write endurance with solid state drives varies. SLC drives have higher
+ endurance, but newer generation MLC (and eMLC) drives are getting
+ better.
+
+ As an example, the MLC Intel 320 drives specify an endurance of
+ 20GB/day of writes for five years. If you are doing small or medium
+ size random reads and writes, this is sufficient. The Intel 710 series
+ is the enterprise-class model and has higher endurance.
+
+ If you intend to write a full drive's worth of data per day (and every
+ day for a long time), this level of endurance would be insufficient.
+ For large sequential operations (for example, very large map/reduces),
+ one could write far more than 20GB/day. Traditional hard drives are
+ quite good at sequential I/O and thus may be better for that use case.
+
+ .. seealso:: `SSD lifespan <http://maxschireson.com/2011/04/21/debunking-ssd-lifespan-and-random-write-performance-concerns/>`_
+
+ Reserve some unpartitioned space
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ Some users report good results when leaving 20% of their drives
+ completely unpartitioned. In this situation the drive knows it can use
+ that space as working space. Note that formatted but empty space may
+ or may not be available to the drive, depending on TRIM support, which
+ is often lacking.
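+
+ As a rough check of whether TRIM is actually available, the following
+ commands can help. This is a minimal sketch: the device name
+ ``/dev/sda`` and the mount point ``/data`` are examples only, and
+ ``hdparm`` and ``fstrim`` may not be installed by default on your
+ distribution.
+
+ .. code-block:: bash
+
+    # Ask the drive whether it advertises TRIM support.
+    sudo hdparm -I /dev/sda | grep -i trim
+
+    # Manually trim free space on a mounted filesystem and report how
+    # much was discarded (requires filesystem and kernel TRIM support).
+    sudo fstrim -v /data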
+
+ smartctl
+ ~~~~~~~~
+
+ On some devices, ``smartctl -A`` will show you the
+ ``Media_Wearout_Indicator``.
+
+ .. code-block:: bash
+
+    $ sudo smartctl -A /dev/sda | grep Wearout
+    233 Media_Wearout_Indicator 0x0032 099 099 000 Old_age Always - 0
+
+ Speed
+ ~~~~~
+
+ A `paper <http://portal.acm.org/citation.cfm?id=1837922>`_ in ACM
+ Transactions on Storage (September 2010) listed the following results
+ for measured 4KB peak random direct IO for some popular devices:
+
+ .. list-table:: SSD Read and Write Performance
+    :header-rows: 1
+
+    * - Device
+      - Read IOPS
+      - Write IOPS
+    * - Intel X25-E
+      - 33,400
+      - 3,120
+    * - FusionIO ioDrive
+      - 98,800
+      - 75,100
+
+ Intel's larger drives seem to have higher write IOPS than the smaller
+ ones (up to 23,000 claimed for the 320 series).
+
+ Real-world results should be lower, but the numbers are still impressive.
+
+ Reliability
+ ~~~~~~~~~~~
+
+ Some manufacturers specify reliability stats indicating failure rates
+ of approximately 0.6% per year. This is better than traditional drives
+ (2% per year failure rate or higher), but still quite high, and thus
+ mirroring will be important. (And of course manufacturer specs could
+ be optimistic.)
+
+ Random reads vs. random writes
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ Random access I/O is the sweet spot for SSD. Historically, random reads
+ on SSD drives have been much faster than random writes. That said,
+ random writes are still an order of magnitude faster than spinning
+ disks.
+
+ Recently, new drives have been released that have much higher random
+ write performance. For example, the Intel 320 series, particularly the
+ larger capacity drives, has much higher random write performance than
+ the older Intel X25 series drives.
+
+ PCI vs. SATA
+ ~~~~~~~~~~~~
+
+ SSD is available both as PCI cards and SATA drives. PCI is oriented
+ towards the high end of products on the market.
+
+ Some SATA SSD drives now support 6Gbps SATA transfer rates, yet at the
+ time of this writing many controllers shipped with servers are
+ 3Gbps. For random IO oriented applications this is likely sufficient,
+ but worth considering regardless.
+
+ RAM vs. SSD
+ ~~~~~~~~~~~
+
+ Even though SSDs are fast, RAM is still faster. Thus for the highest
+ performance possible, having enough RAM to contain the working set of
+ data from the database is optimal. However, it is common to have a
+ request rate that is easily met by the speed of random IOs with SSDs,
+ and SSD cost per byte is lower than RAM (and persistent too).
+
+ A system with less RAM and SSDs will likely outperform a system with
+ more RAM and spinning disks. For example, a system with SSD drives and
+ 64GB RAM will often outperform a system with 128GB RAM and spinning
+ disks. (Results will vary by use case, of course.)

+ .. TODO this is basically a 'soft page fault'

+ One helpful characteristic of SSDs is that they can facilitate fast
+ "preheat" of RAM on a hardware restart. On a restart, a system's RAM
+ file system cache must be repopulated. On a box with 64GB RAM or more,
+ this can take a considerable amount of time: for example, six minutes
+ at 100MB/sec, and much longer when the requests are random IO to
+ spinning disks.
+
+ FlashCache
+ ~~~~~~~~~~
+
+ FlashCache is a write-back block cache for Linux. It was created by
+ Facebook. Installation is a bit of work, as you have to build and
+ install a kernel module. As of September 2011, if you use this, please
+ report results in the MongoDB forum, as it is new and everyone will be
+ curious how well it works.
+
+ http://www.facebook.com/note.php?note_id=388112370932
+
+ OS scheduler
+ ~~~~~~~~~~~~
+
+ One user reports good results with the noop IO scheduler under certain
+ configurations of their system. As always, caution is recommended with
+ nonstandard configurations, as such configurations never get as much
+ testing.
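+
+ To experiment with this, you can check which scheduler a device is
+ using and switch it at runtime. This is a minimal sketch: ``sda`` is
+ an example device, and the change does not persist across reboots
+ unless you also set it in your boot configuration.
+
+ .. code-block:: bash
+
+    # The scheduler shown in brackets is the one currently in use.
+    cat /sys/block/sda/queue/scheduler
+
+    # Switch to the noop scheduler for this device (runtime only).
+    echo noop | sudo tee /sys/block/sda/queue/scheduler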
+
+ Run mongoperf
+ ~~~~~~~~~~~~~
+
+ mongoperf is a disk performance stress utility. It is not part of the
+ mongo database, simply a disk exercising program. We recommend testing
+ your SSD setup with mongoperf. Note that the random writes it performs
+ are a worst case scenario, and in many cases MongoDB can do writes
+ that are much larger.
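+
+ As a starting point, mongoperf reads a JSON configuration document
+ from standard input. The invocation below is a sketch: the field
+ values are arbitrary examples, and the available options should be
+ checked against your mongoperf version.
+
+ .. code-block:: bash
+
+    # 16 threads doing random reads and writes against a 10GB test
+    # file, bypassing memory-mapped files.
+    echo "{ nThreads: 16, fileSizeMB: 10000, r: true, w: true, mmf: false }" | mongoperf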
Redundant Array of Independent Disks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -420,9 +569,8 @@ See also the ec2 page for comments on EBS striping.
Remote File Systems
~~~~~~~~~~~~~~~~~~~

- We have found that some versions of NFS perform very poorly and do not
- recommend using NFS. See the NFS page for more information.
- .. TODO link to NFS page
-
Amazon elastic block store (EBS) seems to work well up to its
intrinsic performance characteristics, when configured well.
+
+ We have found that some versions of NFS perform very poorly and do not
+ recommend using NFS with MongoDB.