Friday, 18 March 2011

MongoDB Journaling Performance - Single Server Durability

Note: Relates to v1.8.0 release.

v1.8 of MongoDB (production release hot off the press, available from here) brings single server durability to MongoDB via its support for write-ahead journaling. The following summarises the key points from the journaling documentation and administration guide, and from my initial playing around.
  • you have to specify the --journal switch for mongod in order to enable journaling (previously --dur, prior to the production release) - see the sketch after this list
  • journal files are created within a journal subfolder of your dbpath, up to 1GB each in size
  • in the event of a crash, when mongod restarts it will replay the journal files to recover automatically before the server goes online
  • if mongod shuts down cleanly, the journal files are cleared down. You also shouldn't end up with a lot of journal files, as they are rotated out once they are no longer needed
  • batch commits are performed approximately every 100ms (note the mention in the docs that this will be more frequent in future)
  • preallocation of journal files may be done when mongod starts up. If it decides this is worth doing, it will fully provision 3 x 1GB files BEFORE accepting any connections - so there could be a couple of minutes' delay after starting up before you can connect. These preallocated files are not cleared down when mongod is shut down, so it won't have to create them each time.
  • when running with journal files on a SATA drive, I found that it chose not to preallocate them. When I set up a junction to map the journal folder onto a separate 15K SAS drive, it did then choose to preallocate the files.
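
As a minimal sketch of enabling this (the dbpath here is illustrative, not my actual setup):

    REM start mongod with write-ahead journaling enabled (v1.8+)
    mongod --dbpath C:\data\db --journal

    REM journal files then appear in a "journal" subfolder of the dbpath,
    REM named j._0, j._1, etc. (up to 1GB each)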

Durability vs. performance - what's the cost?
I wanted to get an idea of how much this journaling costs in terms of performance. Obviously there's going to be a hit; I just wanted a rough feel for how much. So I took a test 295MB CSV data file containing 20 million rows with two columns - _id (an integer) and val (a random 5 character string) - and loaded it into a fresh database, with and without journaling enabled.
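
For reference, the load was driven by mongoimport along these lines (database, collection and file names here are illustrative, not the exact ones I used):

    REM import the 20 million row CSV into a fresh collection; --fields names
    REM the two columns, assuming the file has no header line (if it did,
    REM --headerline would be used instead)
    mongoimport --db test --collection docs --type csv --file data.csv --fields _id,val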

Tests were run on a single machine: Intel Xeon W3520 quad core, 10GB RAM, Disk 1 = 7200RPM SATA, Disk 2 = 15K SAS, Windows 7 64-bit. The MongoDB data is on Disk 1 (the slower but larger disk).

Journaling? | Journal location            | Files preallocated? | Import time (s) | Avg. import rate (docs/s) | Performance
No          | n/a                         | n/a                 | 291             | 68,729                    | (baseline)
Yes         | Disk 1 (same as data)       | No                  | 442             | 45,249                    | -34%
Yes         | Disk 1 (same as data)       | Yes (manually*)     | 422             | 47,393                    | -31%
Yes         | Disk 2 (separate from data) | Yes                 | 448             | 44,643                    | -35%
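
(For reference, the average import rate is just the 20 million documents divided by the import time - e.g. 20,000,000 / 291s ≈ 68,729 docs/s for the baseline run.)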

(*) mongod always chose not to preallocate when the journal directory was on the slower disk (the same one as the data), so I had to preallocate manually by copying across the files it had created when the faster disk was used.
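
In practice that manual step amounted to something like this, run with mongod stopped (assuming the standard prealloc.0/1/2 file naming; the source path is illustrative):

    REM copy the three 1GB preallocated journal files saved from the
    REM fast-disk run into the journal folder under the dbpath on disk 1
    copy D:\saved-journal\prealloc.* C:\data\db\journal\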

I couldn't run the disk 2 test without preallocated files, because on that drive mongod always chose to preallocate them, and you can't delete them all while mongod is running.

Summing up
In my tests, I found:
  • using --journal resulted in about a 30-35% drop in throughput for my mongoimport job (just under 70K docs/s down to less than 50K docs/s)
  • preallocating the journal files helps (as you'd expect), since mongod doesn't have to create them as it goes along
  • setting up a junction on Windows to map the journal directory onto a separate (and faster) disk from the data resulted in slower performance, presumably due to the overhead of the junction redirects (the junction setup is sketched below). If there's another/better way of doing this I'd be interested to know. I also haven't run this on Linux, so maybe there would be less of a hit there. Personally, I'd like to see support for explicitly configuring the journal folder location so you can point it at a separate disk.
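
For completeness, the junction itself used the stock Windows mklink tool, set up before starting mongod; something like this (drive letters are illustrative):

    REM create the journal folder on the faster disk, then map it into the
    REM dbpath via an NTFS junction so mongod finds it in the usual place
    mkdir D:\journal
    mklink /J C:\data\db\journal D:\journal
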
Based on these results, for me the decision on whether to use journaling comes down to how much I actually need single server durability. Is it critical for my specific use? Could I live without it and just use replica sets? What is its value and importance for my data versus raw performance? Let's not forget this is a new feature in MongoDB - as noted in the docs, there are a number of cases slated for 1.8.1 covering journaling performance improvements. Definitely something to keep an eye on.

5 comments:

  1. How many times was each of the tests run?

  2. I ran each test 2-3 times - there was variation in the times, as I'd expect, but the relative performance of the runs compared with each other was consistent with the figures above.

  3. You left out an important variable: what type of write operations you are doing. The journalling cost is related to how much (unique) memory (pages) you are changing over time. If you are doing frequent updates (like a counter for aggregation) to the same documents (in-place) then you could see an increase in performance as compared to lots of unique inserts (which touches more memory).

  4. Thanks for the pointers, Scott! It sounds like it would be worth testing out different insert/update ratios to see how the figures come out, and also a 100% update scenario to go right out to that extreme.

  5. Hello
    Thanks for this test!
    Could you run this test with version 2.0.1?

    Best regards
