Should You Switch To NoSQL Too?

By Jonathan Ellis | February 25, 2010 2:00 pm

With Twitter going public with their plans[1] to switch to Cassandra[2], a lot of people are asking if they should switch too[3].

Ian Eure from Digg (also switching to Cassandra[4]) gave a great rule of thumb last week at PyCon[5]: “if you’re deploying memcache on top of your database, you’re inventing your own ad-hoc, difficult to maintain NoSQL database,” and you should seriously consider using something explicitly designed for that instead.
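To make the rule of thumb concrete, here is a minimal sketch of the "memcache on top of your database" pattern (often called cache-aside). It is only an illustration under stated assumptions: a plain dict stands in for memcache, an in-memory SQLite table stands in for the database, and the `users` table and both function names are hypothetical. Even at this size, the application has to keep two stores in sync by hand.

```python
import sqlite3

# Assumption: sqlite3 stands in for the real relational database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO users VALUES (1, 'alice')")
db.commit()

# Assumption: a dict stands in for a memcache client.
cache = {}


def get_user_name(user_id):
    """Read path: try the cache first, fall back to the database."""
    key = f"user:{user_id}"
    if key in cache:
        return cache[key]
    row = db.execute(
        "SELECT name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    name = row[0] if row else None
    if name is not None:
        cache[key] = name  # populate the cache for later reads
    return name


def rename_user(user_id, new_name):
    """Write path: update the database, then invalidate the cached copy."""
    db.execute("UPDATE users SET name = ? WHERE id = ?", (new_name, user_id))
    db.commit()
    # Forgetting this line anywhere in the codebase means serving stale data.
    cache.pop(f"user:{user_id}", None)
```

Every write path in the application needs the same invalidation logic, which is the ad hoc, hard-to-maintain part the quote refers to.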

While there are other reasons to use Cassandra, such as its less rigid schema or its best-in-class support for replication across multiple datacenters, most people use it because a single machine can’t handle their query volume. The same is true of other distributed NoSQL databases, although there are other kinds[6] that are also useful for some applications.

The price of scaling is that Cassandra provides poor support for ad hoc queries, emphasizing denormalization instead[7]. For analytics, the upcoming 0.6 release (in beta now) offers Hadoop map/reduce[8] integration, but for high-volume, low-latency queries you will still need to design your app around denormalization.
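As a rough illustration of what designing around denormalization means, the sketch below copies each post into every follower's timeline at write time, so that reading a timeline becomes a single lookup by key instead of a join. This is not Cassandra's API: plain Python dicts stand in for the data store, and the names `followers`, `timelines`, `post`, and `read_timeline` are hypothetical. It also assumes posts arrive in timestamp order.

```python
from collections import defaultdict

# Hypothetical data: author -> list of readers who follow that author.
followers = {"alice": ["bob", "carol"], "bob": ["carol"]}

# Denormalized store: reader -> list of (timestamp, author, text), oldest first.
timelines = defaultdict(list)


def post(author, timestamp, text):
    """Write path: store one copy of the post per follower."""
    for reader in followers.get(author, []):
        timelines[reader].append((timestamp, author, text))


def read_timeline(reader, limit=10):
    """Read path: one lookup by key, returned newest first, no join needed."""
    return timelines[reader][-limit:][::-1]
```

The trade-off is that writes do more work and the same data is stored several times, in exchange for reads that touch only one key.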

So, NoSQL systems are not drop-in replacements for a relational database. It looks like Twitter has been working on moving to Cassandra since at least Evan’s post in July last year[9]; your application may be easier or harder to port, but it’s a useful data point to keep in mind.

For more on why the difficulty of scaling relational databases is driving developers to Cassandra, see the video[10] or slides[11] from my talk last week at PyCon, and for an introduction to Cassandra itself, see Eric Evans’ talk at FOSDEM, also available with video[12] or slides[13].

Related Posts: The Cassandra Project[14], How Do You Put a Database In The Cloud?[15], NoSQL Ecosystem[16]

Endnotes:
  1. Twitter going public with their plans: http://nosql.mypopescu.com/post/407159447/cassandra-twitter-an-interview-with-ryan-king
  2. Cassandra: http://incubator.apache.org/cassandra/
  3. asking if they should switch too: http://stackoverflow.com/questions/2332113/switching-from-mysql-to-cassandra-pros-cons
  4. switching to Cassandra: http://about.digg.com/blog/looking-future-cassandra
  5. PyCon: http://us.pycon.org/2010/about/
  6. there are other kinds: http://www.rackspacecloud.com/blog/2009/11/09/nosql-ecosystem/
  7. emphasizing denormalization instead: http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model
  8. Hadoop map/reduce: http://hadoop.apache.org/mapreduce/
  9. Evan’s post in July last year: http://blog.evanweaver.com/articles/2009/07/06/up-and-running-with-cassandra/
  10. video: http://pycon.blip.tv/file/3261223/
  11. slides: http://www.slideshare.net/jbellis/what-every-developer-should-know-about-database-scalability-pycon-2010
  12. video: http://www.parleys.com/#sl=1&st=5&id=1866
  13. slides: http://www.slideshare.net/jericevans/the-cassandra-distributed-database
  14. The Cassandra Project: http://www.rackspacecloud.com/blog/2009/09/23/the-cassandra-project/
  15. How Do You Put a Database In The Cloud?: http://www.rackspacecloud.com/blog/2009/10/28/how-do-you-put-a-database-in-the-cloud/
  16. NoSQL Ecosystem: http://www.rackspacecloud.com/blog/2009/11/09/nosql-ecosystem/

Source URL: http://www.rackspace.com/blog/should-you-switch-to-nosql-too/