Yesterday was day two of the ZeroVM Design Summit. Day one focused mostly on the architecture and capabilities of ZeroVM itself, while day two examined integration with other technologies, particularly OpenStack Swift (the technology behind Rackspace Cloud Files).
Why Swift? One of the core tenets of ZeroVM is that we should bring the storage and compute together, so that we store things and do things in the same place. This drastically reduces the latency of computation and makes a number of new use cases possible. That is why one of our key principles is that ZeroVM should be able to be used “wherever the data is.”
When we looked at “where the data is,” we looked at three targets: local disks, cloud storage and databases. ZeroVM currently works on local disks, and we are also integrated with the OpenStack Swift object store. (Database integration will come later.) We are working on ways in which ZeroVM and Swift can work together more powerfully.
ZeroVM’s Swift integration currently uses two separate pieces of middleware, one on the proxy and one on the object server. The proxy middleware parses job descriptions that users post to the cluster. Each job description names a number of targets within the Swift cluster and a target ZeroVM application. The proxy middleware helps make sure that the object servers holding those targets have the correct ZeroVM application, and then passes control to the object servers for actual execution. The object server middleware spawns the ZeroVM and maps the objects into the application container so that they can be acted upon.
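To make the division of labor between the two pieces of middleware concrete, here is a toy simulation in plain Python. It is only a sketch: the JSON job format, the class names `Proxy` and `ObjectServer`, and the idea of an "application" being a Python callable are all assumptions made for illustration, not the actual ZeroVM or Swift interfaces.

```python
import json


class ObjectServer:
    """Toy stand-in for a Swift object server plus its execution middleware."""

    def __init__(self, objects):
        self.objects = dict(objects)  # object path -> bytes
        self.apps = {}                # application name -> callable (hypothetical)

    def has_app(self, name):
        return name in self.apps

    def install_app(self, name, app):
        self.apps[name] = app

    def execute(self, app_name, path):
        # Stand-in for spawning a ZeroVM and mapping the object into it:
        # here we simply call the application on the locally stored bytes.
        return self.apps[app_name](self.objects[path])


class Proxy:
    """Toy stand-in for the proxy middleware."""

    def __init__(self, servers, app_store):
        self.servers = servers
        self.app_store = app_store  # application name -> callable (hypothetical)

    def locate(self, path):
        for server in self.servers:
            if path in server.objects:
                return server
        raise KeyError(path)

    def run_job(self, job_json):
        # Hypothetical job format: {"app": <name>, "targets": [<path>, ...]}
        job = json.loads(job_json)
        app_name = job["app"]
        results = {}
        for path in job["targets"]:
            server = self.locate(path)
            # Make sure the server holding the data also holds the application.
            if not server.has_app(app_name):
                server.install_app(app_name, self.app_store[app_name])
            # Hand control to the object server; the data never moves.
            results[path] = server.execute(app_name, path)
        return results
```

The point of the sketch is the direction of travel: the (small) application is shipped to the server that holds the object, rather than the object being shipped to the compute.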
To help ZeroVM and Swift work together more powerfully, we looked at a number of enabling technologies coming from the ZeroVM community that we are proposing for upstream inclusion in Swift. The first, and most controversial, is “append” support for objects inside Swift. This would allow clients to add new data onto the end of an existing object without rewriting the object as a whole. Echoing John Dickinson’s post about this issue, append sounds pretty easy, but there are lots of corner cases that make it hard.
John mentioned that he is in favor of a solution that allows out-of-order appends to a file. That is one possibility, but we are still working out a proposal that would ensure in-order appends at the cost of having some appends not succeed. That proposal is a modification of the two-phase commit protocol, with some specifics to tailor it to Swift. Either way, there was a lot of agreement on the specifics: we will overload the “fast post” path within Swift, use the (currently zero-length) metadata files to store the additional append data, and use etags (MD5 hashes) to target specific revisions of files.
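The trade-off in the in-order approach (ordering is guaranteed, but some appends fail) can be illustrated with a small Python sketch. This is not the proposed protocol itself, which is still being worked out and is based on two-phase commit; it is a simplified single-node illustration of targeting a revision by etag, and the names `AppendableObject` and `StaleRevision` are made up for the example.

```python
import hashlib


class StaleRevision(Exception):
    """Raised when an append targets a revision that is no longer current."""


class AppendableObject:
    """Hypothetical in-memory object that accepts etag-conditional appends."""

    def __init__(self, data=b""):
        self.data = data

    @property
    def etag(self):
        # As in Swift, the etag here is the MD5 hash of the object's content.
        return hashlib.md5(self.data).hexdigest()

    def append(self, chunk, expected_etag):
        # Accept the append only if the caller saw the current revision.
        if expected_etag != self.etag:
            raise StaleRevision("object changed since etag %s" % expected_etag)
        self.data += chunk
        return self.etag  # etag of the new revision
```

If two clients read the same etag and both try to append, the first succeeds and the second gets `StaleRevision`; it then has to re-read the etag and retry. That is the "some appends not succeed" cost, paid in exchange for a well-defined order.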
Another idea from the ZeroVM community is registering webhooks on URL resources with Swift. By allowing an upload, a modification or even a read to trigger an event, we can coordinate some fairly complex workflows in a convenient fashion. This would be useful for stream processing on inputs or for iterative building of secondary indexes, such as for search.
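As a rough picture of how event hooks could drive a secondary index, here is a minimal in-process sketch in Python. Everything in it is an assumption for illustration: real webhooks would be HTTP callbacks to a registered URL, whereas this sketch uses plain Python callables, and the names `HookRegistry`, `Store` and `make_indexer` are hypothetical.

```python
from collections import defaultdict


class HookRegistry:
    """Hypothetical registry mapping (container, event) to callbacks."""

    def __init__(self):
        self._hooks = defaultdict(list)

    def register(self, container, event, callback):
        self._hooks[(container, event)].append(callback)

    def fire(self, container, event, name, data):
        # .get() avoids creating empty entries for unhooked events.
        for callback in self._hooks.get((container, event), []):
            callback(container, name, data)


class Store:
    """Toy object store that fires an event on every write."""

    def __init__(self, hooks):
        self.hooks = hooks
        self.objects = {}

    def put(self, container, name, data):
        key = (container, name)
        event = "modify" if key in self.objects else "upload"
        self.objects[key] = data
        self.hooks.fire(container, event, name, data)


def make_indexer(index):
    """Return a callback that adds each word of an object to a search index."""

    def index_words(container, name, data):
        for word in data.decode("utf-8").split():
            index[word].add(name)

    return index_words
```

Registering `make_indexer(index)` for the `"upload"` event on a container means the index grows incrementally as objects arrive, with no separate batch job scanning the container.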
Other ideas from ZeroVM that we are trying to upstream include a “generic executor” for Swift (think FastCGI for Swift) and some additional data view capabilities, such as user-to-user data copies and authenticated read-only views on data.
Tomorrow I will report on the final day of the ZeroVM Design Summit. So far it has been a blast.