Friday, 25 February 2011

Thoughts on MongoDB from a SQL Server Dev

As someone with a background in SQL Server development dating back to 2000, the whole NoSQL jazz has been on the radar for a while but never made it under the microscope, so to speak. Partly because SQL Server is my comfort zone and there's still plenty more to learn about that; partly because I've had my career mapped out in my head and NoSQL didn't feature much in that vision; partly because, until something real-world drives a push into a technology, it tends to remain a distant dot on the radar.

Back at QCon London last year, I had the opportunity to hear a bit more about the NoSQL world from some great speakers - I think the CAP theorem cropped up in most sessions I attended. The dot on the radar moved a bit closer.

Back To The Future
Fast forward a bit and the real-world driver to take a look into NoSQL technologies appeared. So for a few months, I've spent some time doing research - seeing what the different options out there are, what they offer, what they don't, what compromises you have to make etc etc. Cassandra, CouchDB, MongoDB, HBase, RavenDB...to name just a few. Now I'm not going to go into a comparison of each of those - that in itself would be a whole blog post. If you ask me which is the "best"...I'll say "It Depends". What I do know, is which one feels like a good fit for my environment.

Cinderella's Glass Slipper
MongoDB appeared as a real potential candidate pretty early on and after a bit more of a deep dive investigation into it, it became more and more obvious that it was just the right fit for certain use cases compared to the other technologies. This is not me saying "MongoDB is better than Cassandra" or "I see your RavenDB, and I raise you a MongoDB". Just that in my current scenario MongoDB feels like the best fit.

NoSQL with a capital NOT ONLY
The whole "NoSQL" term has been done to death. Yes, it's not the best term. Happily, from my experience, the majority of what I've seen has been around the principle of "Not Only SQL". And this is very much where I sit. It does not replace SQL Server; it adds value for a number of use cases and for a number of reasons. Choosing the right tool for the right job.

The point of this post is really to just summarise some of the likes and dislikes I have, as someone with a heavy background in SQL Server development. What I think is pretty cool and what is not so cool.

Thumbs up
In no particular order...
  • Cost
    It's free, open source. Can haz more scale? Just add hardware. Licensing costs need not apply (can run on Linux).
  • Schema-less
    If you need to support a flexible schema, MongoDB's document storage is a big plus. It doesn't mean you don't need to think about schema at all, it just means you can very easily model data of this nature and cope with changes without headaches.
  • Quick start & fast learning
    Getting started with MongoDB was quick and easy. There was no entry barrier. I can't fault how quick and easy it was to get up and running with the basics. Hacking around to pick up the more advanced stuff was a pretty painless exercise too. Within a relatively short period of time, I was able to start answering questions on StackOverflow. Using the C# driver has been a largely positive and intuitive experience.
  • Replica sets
    Configuring is simple, making scaling reads and failover pretty effortless. Want more redundancy or scaling of reads? Fire up another machine, add it to the set and away you go. You do need to give some thought to sizing the oplog though, as a secondary that falls further behind than the oplog covers has to do a full resync.
  • Auto Sharding
    Again, configuring is simple. You do need to give very careful consideration up front to which keys you want to shard on. Once you've done that, sharding "just does its stuff".
  • Community
    It has a good community behind it and that IMHO is very important. I don't like sitting in a cupboard on my own with the lights off. I like being a part of a bigger community - to learn from, work through issues with and to contribute back to.
  • Rapidly evolving
    MongoDB is rapidly changing and it's good to see bugs being tracked and fixed in good time. There is a fast-flowing feature enhancement pipeline too, so you typically don't have to wait long to get something.
  • Choose your consistency
    You can choose to have data replicated to a configurable number of replicas before returning if you wish to have a stronger level of consistency. It depends on what value you put on certain bits of data, but the choice is yours. So you can trade off performance for consistency.
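
To make that last point a bit more concrete, here is a minimal Python sketch of the trade-off. It's a toy model, not MongoDB code: `Member`, `ack_ms`, `write` and the delay figures are all made up for illustration, and nothing here is a real driver API. It simply assumes the caller is released once `w` members have acknowledged a write.

```python
# Toy model only: this does not talk to MongoDB and uses no driver API.
from dataclasses import dataclass, field


@dataclass
class Member:
    """One hypothetical replica set member with a fixed acknowledgement delay."""
    name: str
    ack_ms: int
    data: dict = field(default_factory=dict)


def write(members, key, value, w):
    """Apply a write to every member and report how long the caller waits.

    The caller is released once `w` members have acknowledged, so the wait
    is the w-th smallest acknowledgement delay. Slower members still get
    the write; they simply are not waited for.
    """
    if not 1 <= w <= len(members):
        raise ValueError("w must be between 1 and the number of members")
    for m in members:
        m.data[key] = value
    delays = sorted(m.ack_ms for m in members)
    return delays[w - 1]


# Assumed delays in milliseconds, chosen purely for illustration.
members = [Member("primary", 1), Member("secondary-a", 12), Member("secondary-b", 40)]

fast = write(members, "order:1", {"total": 10}, w=1)  # wait for one member only
safe = write(members, "order:2", {"total": 25}, w=3)  # wait for all three members
print(fast, safe)
```

With these assumed delays the `w=1` write returns after 1 ms and the `w=3` write after 40 ms. Waiting on more members buys you more confidence that the data is on several machines, at the cost of a slower write.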

Thumbs down
  • Rapidly evolving
    OK, so I listed this as a "thumbs up" but it can also be a negative. You'll often find older (but not "old") blog posts/answers/articles on a given matter that are no longer true as things have changed since then. So it can be frustrating trying to find the current, correct information as things are changing quickly.
  • Information / Documentation
    The level of information on MongoDB pales in comparison to what there is for SQL Server. Obvious point maybe - but when you're used to MSDN, all the bloggers, activity on StackOverflow/AskSSC etc and #sqlhelp it can be quite a fall back down to Earth.
  • Map/Reduce performance
    Doesn't seem to be great. There are other options if you need greater performance, such as Hadoop, but that adds complexity and another technology into the mix.
  • Tooling
    There isn't the same wealth of tooling as there is for SQL Server. Hopefully this will change over time.

I'm sure I've missed some points, but think I've covered the main points. My brain is eventually consistent so of course will update if I remember any more of note :)

Summary
Of course, using NoSQL technology like MongoDB involves some trade-offs and a different mindset vs a traditional RDBMS. The important thing for me is that it gives another option and is another feather in my CAP (theorem, *badoomtish*). I don't see it as a replacement, as it definitely is not; it's just another tool that can be used to achieve an end goal. I'm looking forward to having the two side-by-side in harmony.

Hopefully, this gives a bit of insight into MongoDB from the point of view of someone coming from a SQL Server background.

You can find my blog posts on MongoDB thus far, here.

9 comments:

  1. I am currently working on an old database called UniVerse. Its main point is that data is stored by record, then by values and sub-values. The entire system can either be schemaless or have a schema. However, this application has been around for over 30 years and over time, the data structures have become a total mess. As you suggested, NoSQL approaches will have their uses, just as relational databases also have their uses. Thinking either approach will solve any problem is simply naive.

  2. Thanks for the comments. Yes, using the right tool for the job is important!

  3. I've recently started development on a new web site and MongoDB seemed to fit the bill as I need high performance. However, I've realised recently that it doesn't fit as a whole database system replacement, like you say. I'm thinking now I need both - have MongoDB serving the high-perf front end search side of things, where it excels at in-memory indexing etc., and have SQL Server linking in where the transactions matter. It's just too much work to do all that in Mongo, or any other NoSQL for that matter.

  4. I would like more info on why MongoDB can't replace SQL Server. I am a SQL Server DBA and I have a dev team trying to tell me it can. I have never heard of this MongoDB before.

  5. @Anonymous - it all depends on the scenario. There is no "X is better than Y" generalisation that can be made. MongoDB and SQL Server are 2 completely different databases which both have their advantages for certain uses. Without knowing your specific environment, it's impossible to know whether MongoDB would or would not be a sensible fit.

  6. Adrian, I am curious: have you used MongoDB on your project?

    This was a good article. Unfortunately I have not seen much else written about Mongo by people who have a really good understanding of RDBMS. I really want to hear from someone who has used MongoDB on a significant project.

    Query flexibility and tooling are my two most significant concerns. I've never had a project where I did not end up writing queries I didn't anticipate originally - even if it's just for one-off research. I don't know what I don't know and so on - and I think Mongo will tie my hands when it comes to that. People ignorant of RDBMS will not even notice there is something missing while writing their elaborate programs to get at the data they need.

    Replies
    1. @Jeremy, thanks for your comments. I have not yet got to the stage of having MongoDB running in a production environment (not due to any problems surrounding MongoDB, just due to a change of job). However, I have spent a reasonable amount of time prototyping NoSQL dbs for real world projects and got to the stage where MongoDB came out as the right fit and started implementation.

      Re: Query flexibility, this is one of MongoDB's strengths vs. other NoSQL dbs as it has good query functionality...plus it supports secondary indices so you can index any field in your documents. This was one of the key things I needed.

      Feel free to drop me an email - happy to be quizzed if you have any more questions in more detail (contact details are further up the page)

  7. Excellent post Adrian, it's about the same conclusions I came to with my extensive database background.
