For over thirty years, relational database technology has been the gold standard. Modern workloads and unprecedented data volumes, however, are driving businesses to look at alternatives to the traditional relational database. This “NoSQL movement” has given rise to a host of non-relational database technologies designed for large-capacity storage and scalability. Some businesses may find that the best solution is a combination of relational and non-relational databases, using whichever tool is best for the job. In this regard, “NoSQL” is probably better read as “Not Only SQL” than as “No SQL at all.”
NoSQL technologies vary widely, but they can be evaluated based on three key features: scalability, data and query model, and persistence design. Below we will investigate ten popular NoSQL database options.
In this context, “scalability” refers to scaling writes by automatically partitioning data across multiple machines. Systems that do this are called “distributed databases,” and they include Cassandra, HBase, Riak, Scalaris, Voldemort, and others. If your write volume or data size exceeds what one machine can handle, these are your only options unless you want to manage partitioning manually. (You don’t.)
When choosing a distributed database, look for:
1) support for multiple datacenters and
2) the ability to add new machines to a live cluster transparently to your applications.
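To make “automatic partitioning” and “adding machines to a live cluster” concrete, here is a minimal sketch of consistent hashing, one common way to spread keys across machines. It is a toy illustration, not a description of how any of the systems named above is implemented; the class name HashRing, the vnodes parameter, and the node names are all hypothetical.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a position on the ring (a large integer)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring; all names here are illustrative."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes  # ring positions per machine, to even out load
        self._points = []     # sorted ring positions
        self._owner = {}      # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        """Place a new machine on the ring without touching existing ones."""
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def node_for(self, key: str) -> str:
        """Return the machine that owns the first position after the key."""
        if not self._points:
            raise ValueError("ring has no nodes")
        idx = bisect.bisect(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


keys = [f"user:{n}" for n in range(10_000)]
ring = HashRing(["node-a", "node-b", "node-c"])
before = {k: ring.node_for(k) for k in keys}

ring.add_node("node-d")  # grow the "cluster" while it is in use
after = {k: ring.node_for(k) for k in keys}

moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved} of {len(keys)} keys moved after adding node-d")
```

In this sketch the application only ever calls node_for, so adding a machine changes where some keys live without changing the calling code. Only the keys that now belong to the new machine move; the rest stay where they were.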
Non-distributed NoSQL databases include CouchDB, MongoDB, Neo4j, Redis, and Tokyo Cabinet, and they can serve as persistence layers for distributed systems. MongoDB provides limited support for sharding, as does a separate Lounge project for CouchDB, and Tokyo Cabinet can be used as a Voldemort storage engine.
NoSQL databases vary widely in their data models and query APIs. (Respective links: Thrift, map/reduce views, Thrift, Cursor, Graph, Collection, Nested hashes, get/put, get/put, get/put)
Here are some highlights:
Persistence design, in this instance, refers to how the data is stored internally. By evaluating the persistence model, you can determine the best fit for your workload.
CouchDB’s append-only B-Trees provide an interesting variant by avoiding the overhead of seeks, unfortunately at the cost of limiting CouchDB to one write at a time.
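The trade-off described above can be sketched with a much simpler structure: an append-only log with an in-memory index. This is not CouchDB's B-Tree format or API; the class AppendOnlyStore and its methods are hypothetical, and the sketch only shows why appending avoids seeking before a write and why a single writer at a time keeps the file consistent.

```python
import io
import json
import threading


class AppendOnlyStore:
    """Toy append-only key-value log (illustrative, not CouchDB's format)."""

    def __init__(self, fileobj):
        self._file = fileobj
        self._index = {}               # key -> byte offset of latest record
        self._lock = threading.Lock()  # one writer (or reader) at a time

    def put(self, key, value):
        record = (json.dumps({"key": key, "value": value}) + "\n").encode("utf-8")
        with self._lock:
            # Always write at the end of the file; old records are never
            # overwritten, so an update is just a newer record.
            self._file.seek(0, io.SEEK_END)
            offset = self._file.tell()
            self._file.write(record)
            self._index[key] = offset

    def get(self, key):
        offset = self._index[key]  # raises KeyError for unknown keys
        with self._lock:
            self._file.seek(offset)
            line = self._file.readline()
        return json.loads(line)["value"]


buffer = io.BytesIO()  # stands in for a file on disk
store = AppendOnlyStore(buffer)
store.put("doc1", {"rev": 1})
store.put("doc1", {"rev": 2})  # appended, not updated in place

latest = store.get("doc1")
records_on_disk = buffer.getvalue().count(b"\n")
print(latest, records_on_disk)
```

Both versions of doc1 remain in the log, and the index simply points at the newest one. The lock is what limits this sketch to one write at a time.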
In 2009 the NoSQL movement hit the ground running, and it has continued to grow as more and more businesses wrestle with ways to process large volumes of data. Conferences, announcements, and discussion about the NoSQL movement can be found on the Google discussion group.
© 2011-2013 Rackspace US, Inc.
Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License
