Filed in by Jonathan Ellis | February 25, 2010 2:00 pm
With Twitter going public with their plans[1] to switch to Cassandra[2], a lot of people are asking if they should switch too[3].
Ian Eure from Digg (also switching to Cassandra[4]) gave a great rule of thumb last week at PyCon[5]: “if you’re deploying memcache on top of your database, you’re inventing your own ad-hoc, difficult to maintain NoSQL database,” and you should seriously consider using something explicitly designed for that instead.
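To make that rule of thumb concrete, here is a minimal sketch of the cache-on-top-of-database pattern. Everything in it is an assumption made for illustration: a plain dict stands in for memcache, an in-memory SQLite database stands in for the primary store, and the `users` table and function names are hypothetical.

```python
import sqlite3

# Stand-ins (assumptions): a dict plays the role of memcache and an
# in-memory SQLite database plays the role of the primary store.
cache = {}
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO users VALUES (1, 'alice')")
db.commit()


def get_user_name(user_id):
    """Read path: try the cache first, fall back to the database."""
    key = "user:%d" % user_id
    if key in cache:
        return cache[key]
    row = db.execute(
        "SELECT name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    name = row[0] if row else None
    if name is not None:
        cache[key] = name  # populate the cache for the next reader
    return name


def rename_user(user_id, new_name):
    """Write path: update the database, then invalidate the cached copy."""
    db.execute("UPDATE users SET name = ? WHERE id = ?", (new_name, user_id))
    db.commit()
    cache.pop("user:%d" % user_id, None)
```

Even in this toy version, the application code owns the key naming, the invalidation rules, and the consistency between two stores, which is the ad-hoc database Eure is describing.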
While there are other reasons to use Cassandra, such as the less-rigid schema or the best-in-class support for replication across multiple datacenters, most people are using it because a single machine can’t handle their query volume. This is the case for other distributed NoSQL databases as well, although there are other kinds[6] that are also useful for some applications.
The price of scaling is that Cassandra provides poor support for ad-hoc queries, emphasizing denormalization instead[7]. For analytics, the upcoming 0.6 release (in beta now) offers Hadoop map/reduce[8] integration, but for high-volume, low-latency queries you will still need to design your app around denormalization.
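As a rough illustration of what designing around denormalization means, the sketch below precomputes each reader's timeline at write time so that a read is a single key lookup with no join. It uses plain Python dicts as a stand-in for a key-to-row store; it is not Cassandra's API, and the names `follow`, `post`, and `read_timeline` are hypothetical.

```python
from collections import defaultdict

# Assumption: each dict models one "table" keyed by a single row key.
followers = defaultdict(set)    # author -> readers who follow that author
timelines = defaultdict(list)   # reader -> [(timestamp, author, text), ...]


def follow(reader, author):
    followers[author].add(reader)


def post(author, text, timestamp):
    """Write path: copy the post into every follower's timeline row."""
    entry = (timestamp, author, text)
    for reader in followers[author]:
        timelines[reader].append(entry)


def read_timeline(reader, limit=20):
    """Read path: one lookup by key, newest entries first."""
    return sorted(timelines[reader], reverse=True)[:limit]
```

The trade-off is more work and more storage on every write in exchange for cheap, predictable reads. A query that was not planned for, such as "all posts containing a given word", has no precomputed row to read from.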
So, NoSQL systems are not drop-in replacements for a relational database. It looks like Twitter has been working on moving to Cassandra at least since Evan’s post in July last year[9]; your application may be easier or harder to port, but it’s a useful data point to keep in mind.
For more on why the difficulty of scaling relational databases is driving developers to Cassandra, see the video[10] or slides[11] from my talk last week at PyCon, and for an introduction to Cassandra itself, see Eric Evans’ talk at FOSDEM, also available with video[12] or slides[13].
Source URL: http://www.rackspace.com/blog/should-you-switch-to-nosql-too/