Monday, 21 February 2011

MongoDB replication - oplogSize

Background
An oplog is a write operation log that MongoDB uses to store the data modifications that need to be replicated out to the other nodes in the configured set. The oplog is a capped collection, which means it never grows beyond a fixed size - once it reaches its max size, the oldest entries drop off the end as new ones are added, so it keeps cycling round. The size of the oplog therefore determines how long a secondary node can be down for and still be able to catch up when it comes back online: the bigger the oplog, the more operations it can hold, and the longer a secondary can be offline. If a secondary node is down for too long and the operations required to bring it back up to date are no longer available in the primary's oplog, you'll start seeing "Error RS102". See "Resynching a Very Stale Replica Set Member".
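
To make the relationship between oplog size and tolerable downtime concrete, here is a rough back-of-the-envelope sketch in Python. It is not anything MongoDB provides: the function name and the write rate figure are made up for illustration, and it assumes the oplog fills at a steady rate, which real workloads rarely do.

```python
def oplog_window_hours(oplog_size_mb, oplog_write_mb_per_hour):
    """Roughly how many hours of operations a full oplog can hold.

    Assumes a constant write rate into the oplog (a simplification).
    """
    if oplog_write_mb_per_hour <= 0:
        raise ValueError("write rate must be positive")
    return oplog_size_mb / oplog_write_mb_per_hour


# Hypothetical figures: a 10GB oplog filling at 512MB per hour.
print(oplog_window_hours(10240, 512))  # 20.0
```

With those assumed figures, a secondary that is offline for longer than about 20 hours would no longer find the operations it needs in the oplog.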

oplogSize
The size of the oplog is an important decision to make up front, so give it careful consideration and allow for future write loads. You can configure it via the --oplogSize command line argument for mongod. The value is given in MB.

e.g. to set a 10GB oplog size:
mongod --oplogSize 10240 --replSet ExampleSet1

Personally, I'd err well on the side of caution and set it larger rather than smaller. That minimises the risk of the oplog not being big enough for secondary nodes to catch up after falling behind or being offline for longer periods of time.
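
As a rough illustration of erring on the side of caution, the sketch below works backwards from the longest outage you want to survive to a value for --oplogSize. The helper name, the write rate and the safety factor of 2 are all assumptions picked for the example, not recommendations from MongoDB, so measure your own write load before relying on numbers like these.

```python
import math


def oplog_size_mb_for(downtime_hours, oplog_write_mb_per_hour, safety_factor=2.0):
    """Suggest an --oplogSize value (in MB) for a target downtime window.

    safety_factor pads the estimate to allow for bursts and future growth.
    """
    return math.ceil(downtime_hours * oplog_write_mb_per_hour * safety_factor)


# Hypothetical figures: survive 24 hours offline at 200MB of oplog writes per hour.
size_mb = oplog_size_mb_for(24, 200)
print(size_mb)  # 9600
print(f"mongod --oplogSize {size_mb} --replSet ExampleSet1")
```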

1.6.5 Bug
Setting this argument does not appear to work on Windows 7 64-bit in 1.6.5 (I'm not sure about Linux): you end up with a seemingly random oplog size being created instead. You may also encounter an error, depending on the value you specify. It appears that multiples of 2048 produce the following error:
Assertion failure cmdLine.oplogSize > 0 db \db.cpp 962
Due to the nature of the bug, it appears that the max size you could end up with in that version is 2047MB. A bug case has been raised here. It has been fixed in 1.7.x (not a production release at the time of writing), and I can confirm it now works as expected for me.

Update 23/02/2011:
Kristina Chodorow has written a blog post on "Resizing Your Oplog" - well worth a read if you already have a system up and running and realise that you need a bigger oplog. You'll need 1.7 for this.
