
Rackspace and Drizzle: It's Time To Rethink Everything


Rackspace has been a sponsor of the Drizzle project for the past year. Recently we hired a team of five developers (Lee Bieber, Eric Day, Monty Taylor, Jay Pipes, Stewart Smith) from Sun Microsystems who have been working on this project. We think this project has a very bright future, and have put our money where our mouth is. SQL databases still have a place in the cloud, and Drizzle is just the thing to fit that need. We believe in open source, and are willing to invest in it. Here’s a quick intro to the new members of our team:

Before I wow you with all the great things inside Drizzle, I’d like to remind you that we are in a changing world. We are generating more data than ever. In fact, the world is generating an order of magnitude more data than just 5 years ago. We are now in the world of 64-bit computing. Our CPUs have 4 and 8 cores on them, and it’s not uncommon to see servers with 24 or more total CPU cores today. The way CPU technology is going, we should expect the core count to continue to increase. We fit tons of RAM into servers today. It would be really nice to be able to put all this great new hardware to effective use. The database software widely used for storing, managing, and querying our data is based on assumptions that are simply no longer true. An adapted approach is required in order to keep up. For some, the solution is a distributed database like Cassandra. For most, the solution will be something that can still be used with SQL, but is better suited to today’s data needs and to the server systems of the future, not just today’s.

The Drizzle project was started by Brian Aker from MySQL. Brian served as Director of Architecture at MySQL, and after MySQL was acquired by Sun he became a Distinguished Engineer. It’s hard to argue with the fact that Brian is one of the smartest people alive today who write open source database software. He keeps good company too. As I have grown to know the team from Sun, I’m thrilled to have the honor to work with such a talented group. But it does not end there. Drizzle has a vibrant open source developer community, made up of over a hundred contributors worldwide.

In the very beginning, before Drizzle even had its name, Brian’s idea was to start with MySQL 6.0 and fork it, taking out all the bloat like stored procedures, triggers, MyISAM, and numerous internal evils that made MySQL 5.1 too bloated for the needs of Web and Cloud use cases. In short, take MySQL and strip out all the fat to make a lean, mean, lightweight database that would really make sense for web applications and cloud computing. These plans evolved quickly, informed by his unique perspective within MySQL, not only on what works and does not work within the software, but also on how licensing and community development matter.

The Drizzle Vision

When it’s ready, Drizzle will be a modular system that’s aware of the infrastructure around it. It already runs well, and will continue to run well, in hardware-rich multi-core environments, with a design focused on maximum concurrency and performance. No attempt will be made to support 32-bit systems, obscure data types, or a multitude of language encodings and collations. The full power of C++ will be leveraged, and the system internals will be simple and easy to maintain. The system and its protocol are designed to be both scalable and high performance.

Keep the Good Stuff, Dump the Bad

When people first discover Drizzle one of the first things they want to know is how it’s different from MySQL. Some love MySQL and some hate it, and both sides have good reasons. MySQL is a mixed bag. Drizzle will re-think everything, keeping the best, and replacing or eliminating the worst. For example, MySQL has a data type called a 3-byte integer. Think about that for a moment. On today’s server hardware that does not make a whole lot of sense. In Drizzle, it’s gone. Poof. Take another example where MySQL allows you to send two queries together separated by a semicolon. This is widely abused as a security weakness in a multitude of application software. In Drizzle, an error is returned when you attempt this. Poof. This improvement eliminates an entire category of simple security attack vectors. If I listed every single difference this blog post would be way too long, so I’m going to highlight what I see as the best of the best:

1 – Micro-Kernel Design, with modular interfaces to plug in features and functions without touching the core code. APIs for Replication, Storage Engines, Logging, Authentication, Client Protocols, etc.
2 – No triggers or stored procedures. That stuff is bloat as done in MySQL, and Drizzle has other ways to deal with these needs. These capabilities can be added in later as needed such that they are done right.
3 – Only UTF8 support, not a multitude of language encodings and collations. Keep it simple. This is the web.
4 – Way Less Source Code. Where MySQL has well over a million lines of code, Drizzle is just under 300K lines.
5 – Drizzle Client Protocol, that’s pluggable and asynchronous-capable, with built-in sharding support, built-in checksum support, and a BSD license so you can package it in commercial software with no license drama.
6 – Default Storage Engine is InnoDB, for ACID compliance but others can still be used. MyISAM is gone. Long live the Queen!
7 – Pluggable AAA, so integrating with your LDAP user database through PAM is simple as pie. If you don’t want any auth (think memcached), just don’t load the plugin, and get a nice performance boost.
8 – Replication will be everything you ever wanted in a replication system. You hate MySQL replication? You now love Drizzle.
9 – Logging is pluggable so you can log to Syslog, a query analyzer, Gearman, or whatever you want to plug in.
10 – Query Rewriting is supported. If you have a misbehaving application, you can fix it at the database if desired.
11 – The Data Dictionary is redone so that no internal tables materialize, and there is only a single code execution path. This means all your base run faster.
12 – FRM Files are Gone. Yep, ever been snagged by the contents of an FRM file not matching what’s in the database? Cry no more.
13 – The MySQL Client Protocol is supported so you can use Drizzle with most applications without modification.
14 – It’s Easy to Build from Source. Ever tried to compile MySQL from source? Hah! Yeah, Drizzle builds like butter.

So you should believe me by this point that it’s not just a little fork of MySQL with all the same bugs. It’s a new animal completely that’s re-using what MySQL did well, and eliminating or fixing where it was weak.
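To make the multi-statement example above concrete, here is a minimal Python sketch of the kind of check that refuses two queries sent together with a semicolon between them. This is not Drizzle's parser: the names `split_statements` and `reject_multi_statement` are hypothetical, and the quote handling is a simplifying assumption that ignores SQL comments and backslash escapes.

```python
QUOTES = ("'", '"', "`")


def split_statements(sql):
    """Split on semicolons that appear outside quoted strings."""
    statements, current, open_quote = [], [], None
    for ch in sql:
        if open_quote:
            # Inside a quoted string: only the matching quote closes it.
            if ch == open_quote:
                open_quote = None
        elif ch in QUOTES:
            open_quote = ch
        elif ch == ";":
            # Statement boundary: store what we have and start a new one.
            statements.append("".join(current).strip())
            current = []
            continue
        current.append(ch)
    statements.append("".join(current).strip())
    # Drop empty pieces, such as the one after a trailing semicolon.
    return [s for s in statements if s]


def reject_multi_statement(sql):
    """Return the single statement, or raise if more than one was sent."""
    statements = split_statements(sql)
    if len(statements) != 1:
        raise ValueError(
            "expected exactly 1 statement, got %d" % len(statements)
        )
    return statements[0]
```

With a check like this, an injected string such as `SELECT * FROM users WHERE id = 1; DROP TABLE users` is refused as a whole, while a semicolon inside a quoted literal still counts as part of a single statement.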

The Results

First of all, you’re wondering if Drizzle is production ready. It is very close. It’s time now to start working with it in preparation for full production status. If you trust your data to MySQL with InnoDB today, you can trust your data in Drizzle too. Try it in a test application and see how it’s different. You may notice that it can handle thousands of queries running concurrently. You may notice that it scales much better than a MySQL server over multiple CPU cores.

There’s much more to come as well. I’ll continue to let you know when new and exciting milestones are crossed. If you’re a developer, and want to participate, start here. You can also find the team on the Drizzle IRC channel #drizzle on Freenode.





About the Author

This is a post written and contributed by Adrian Otto.

Adrian serves as a Principal Architect for Rackspace, focusing on cloud services. He cares deeply about the future of cloud technology, and important projects like OpenStack. He also is a key contributor for and serves on the editing team for OASIS CAMP, a draft standard for application lifecycle management. He comes from an engineering and software development background, and has made a successful career as a serial entrepreneur.


  • Pingback: drizzle.org

  • Pingback: 3/14/2010 Update « CloudRoad

  • Pingback: Drizzle on the Rackspace Cloud Blog | Ramblings

  • Pingback: Links of interest for March 13th through March 14th

  • Pingback: Rackspace Blog | Weez.com

  • Pingback: 3/15/2010 Update « CloudRoad

  • Pingback: Thoughts on Drizzle « JZ Talk Blogger

  • philip andrew

    How does it scale? Cassandra can allow me to have more data storage just by adding servers.

    Does Drizzle allow this? Or does it only replicate data across to the other servers? Drizzle needs to make this type of concern obvious.

    • http://adrianotto.com Adrian Otto

      Philip,

      Thanks for the great question. To get what you are describing where you simply add nodes to automatically add storage capacity without anything special in your application, you would want to use a storage engine that has those features built into it. Drizzle has a pluggable storage engine API, so any of the ones that work for MySQL can be adapted to work with Drizzle with minimal development effort by the storage engine vendor. There are a few options for that coming to maturity right now, including one from Akiban, and another from ScaleDB which will both be compatible with Drizzle in the future. There is one called Spider for MySQL, which I expect could be easily adapted as well.

      What will be built into Drizzle is a planned update to the client library (yes, this will break backward compatibility for existing Drizzle clients) to use the sharding hints in the client protocol. These hints let your application indicate to Drizzle which node in a replicated Drizzle configuration should process a given query. This approach requires that you have some client-side strategy to re-balance the queries and related storage as your needs change. This arrangement would work very nicely for the classic read/write splitting setup, where a master replicates to a number of slaves that are used for parallel reads for scale-out (synchronous replication is an option if you need strong data consistency).
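As a rough illustration of the read/write splitting described above, here is a minimal Python sketch of a client-side router. It is not the Drizzle client library and does not use the protocol's sharding hints; the `QueryRouter` class, its `node_for` method, and the CRC32-modulo placement are all hypothetical choices made for this example.

```python
import itertools
import zlib


class QueryRouter:
    """Send writes to the master and spread reads across replicas."""

    def __init__(self, master, replicas):
        self.master = master
        self.replicas = list(replicas)
        self._round_robin = itertools.cycle(self.replicas)

    def node_for(self, sql, shard_key=None):
        words = sql.split(None, 1)
        if not words:
            raise ValueError("empty query")
        # Anything that is not a plain SELECT is treated as a write.
        if words[0].upper() != "SELECT" or not self.replicas:
            return self.master
        if shard_key is not None:
            # Assumption: a stable hash pins each key to one replica.
            index = zlib.crc32(shard_key.encode("utf-8")) % len(self.replicas)
            return self.replicas[index]
        # No key given: rotate through the replicas in turn.
        return next(self._round_robin)


router = QueryRouter("master", ["replica-1", "replica-2"])
write_node = router.node_for("INSERT INTO t VALUES (1)")  # the master
read_node = router.node_for("SELECT * FROM t")            # a replica
```

Note that hashing modulo the replica count moves most keys whenever a replica is added or removed, which is exactly the client-side re-balancing burden mentioned above.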

      One key advantage that Cassandra has over Drizzle in its current state of development is that Cassandra has the re-balancing and redistribution of data built in, and your application does not need to know anything about it. This keeps the application more simple, which keeps your development speed up and your bug count down.

      Imagine for a moment a Drizzle storage engine interface to Cassandra. You get all the simplicity of managing growing data by virtue of adding servers to your Cassandra system, along with a simple SQL interface to that data for your application. Depending on what your application does, this may work very nicely. Stay tuned if something like that sounds good to you.

      Cheers,

      Adrian

  • Pingback: It’s Drizzling in the Clouds | php|architect

  • http://www.gameranx.com/ Atul Kash

    I use the rackspace cloud service for my sites and I am totally behind this!

  • http://www.copperconferencing.com Tim

    Will you be providing Drizzle db’s for Cloud Sites? Please!

    • http://adrianotto.com Adrian Otto

      It’s a treat when I get to reveal a secret about what we will do in the future. Ready? Yes, we will have a hosted Drizzle service (starting with a beta in the near future) for Cloud Sites. It will be an optional drop-down as a database type that you can select when you create a new site.

      Cheers,

      Adrian

      • http://travellperkins.com Travell Perkins

        Awesome! Any chance a next-gen scalable storage engine will be included in the offering? Maybe as part of a coming out party for Akiban ;) ?

        • http://adrianotto.com Adrian Otto

          Travell,

          That’s certainly possible, but chances are that our beta release will just be vanilla drizzle with the InnoDB storage engine to start with, and we will add in the more sophisticated storage engines and cluster features over time. Doing a more limited set of functionality well is a way for us to curb complexity and offer you a better service out of the gate.

          Adrian

          • http://travellperkins.com Travell Perkins

            Amen Otto,

            Let’s make “uptime” feature #1 then ;)

      • http://www.copperconferencing.com Tim

        Sweet.. Please add me to the beta then. :)

  • http://www.endertech.com Rob

    I’m under the impression this integration of Drizzle was due to the growing pains and a hedge against the Oracle acquisition of MySQL.

    Are there any opinions on, or plans for, integrating MariaDB, or for following suit with Red Hat by making similar investments in something like EnterpriseDB/PostgreSQL?

    Thanks

    • http://adrianotto.com Adrian Otto

      Rob,

      We do have a relationship with the Askmonty group, and consider them friends. You will notice that askmonty.org is hosted on our cloud. I like MariaDB, and think it will succeed as well.

      It’s unlikely that we will offer all open source database software in the form of database services, but we will certainly use the best of them, and the ones that work well with the applications our customers run most.

      Using an environment like Cloud Servers you have the freedom of choice to run any database software you wish. The ones that are the best fit for running in the cloud are the ones that we’ll choose to be part of our Cloud Sites platform. At the moment, Drizzle is the best fit, so we’ve acted accordingly.

      Cheers,

      Adrian

  • Pingback: The Oracle Effect: Sun’s Best and Brightest Move On to New Places | HERTZ RENTAL

  • Pingback: The Oracle Effect: Sun’s Best and Brightest Move On to New Places | Tech News

  • http://www.scaledb.com Mike

    Your comment: “It’s hard to argue that Brian is one of the smartest people alive today that write open source database software.”

    I think you meant to say it is hard to argue that he ISN’T one of the smartest….or maybe you don’t mean that ;-)

    Good luck and we look forward to working with you and the Drizzle crew.

    • http://adrianotto.com Adrian Otto

      I suppose it would have been more clear to say “It’s hard to argue with the fact that Brian…”. I’ve edited the post accordingly. Brian is sharp, and we’re thrilled with his work.

  • Paul Morel

    It is great that you are supporting the Drizzle project.

    When will we see a Drizzle service for those of us that are not on CloudSites? We would love to have a service for SQL databases.

  • Pingback: MySQL Fork Drizzle Finally Releases First General Availability Version | Tech News Ninja

  • Pingback: MySQL Fork Drizzle Finally Releases First General Availability Version - Finding Out About

  • Pingback: MySQL Fork Drizzle Finally Releases First General Availability Version

  • Pingback: MySQL Fork Drizzle Finally Releases First General Availability Version | JetLib News

  • http://www.xeround.com Avi Kapuya

    I wanted to find out how stable the Storage engine API is. Is it ready for implementation of new storage engines?

    • http://adrianotto.com Adrian Otto

      Avi,

      According to Brian Aker, the Drizzle community project leader, “It would be fine, I don’t forsee any changes that will be occurring in any strong manner in the near future.” If you have further interest, the best place to start is drizzle-discuss@lists.launchpad.net. We look forward to your contribution!

      Cheers,

      Adrian

  • Pingback: Introducing Drizzle – a lightweight SQL database for cloud and web | SoftElegance

  • Pingback: Rackspace Cloud Computing & Hosting

  • Sekhar

    Please give one example of a JDBC connection string for Drizzle.

©2014 Rackspace, US Inc.