Recently, I had the chance to work on a couple of projects that took me into the cloud. The first project had me setting up Eucalyptus on KVM. The second had me building out an infrastructure in Rackspace Cloud Servers. This has given me some hands-on insight into the problems facing those of us trying to use the cloud for infrastructure build-out. Since Amazon and Rackspace are probably the largest and most stable providers, and my projects took me down both paths, I decided to write an article with my insight into the whole cloud thing.
With Rackspace, Linode, or Amazon EC2, a server can be provisioned in 30-120 seconds, but with this kind of freedom come new wants and needs. Since there is a cost associated with converting your infrastructure to a cloud mentality (load-balanced, ephemeral servers instead of static dedicated hardware), most users want to see extra value returned. If CentOS instances are cheaper at Rackspace for 10 months and then become cheaper at Linode, then I want to be able to move them with the click of a button. Well, we are not quite there yet, but we are getting closer because of projects like jclouds and Libcloud, which wrap several providers' APIs with their own APIs. If and when you go to automate a cloud build-out, you will find that these tools are immensely useful.
Each of the cloud providers has an API that is fairly simple to use, but when you try to interact with more than one, things can get hairy. Amazon’s SOAP interface is completely different from Rackspace’s REST API. This is where Libcloud and jclouds come in. These wrappers provide a single interface to all of the different vendor-specific APIs.
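To make the wrapper idea concrete, here is a minimal sketch of the pattern in plain Python. It is not Libcloud's or jclouds' actual API: `NodeDriver`, `FakeRestDriver`, `FakeSoapDriver`, and `build_web_tier` are hypothetical names made up for illustration, and the two "providers" are in-memory stand-ins that make no network calls.

```python
from abc import ABC, abstractmethod


class NodeDriver(ABC):
    """Common interface every provider adapter implements (hypothetical)."""

    @abstractmethod
    def create_node(self, name):
        ...

    @abstractmethod
    def list_nodes(self):
        ...


class FakeRestDriver(NodeDriver):
    """Stands in for a REST-style provider; nothing is sent over the network."""

    def __init__(self):
        self._servers = []

    def create_node(self, name):
        # A real adapter would POST a request body to the vendor's endpoint here.
        server = {"name": name, "provider": "fake-rest"}
        self._servers.append(server)
        return server

    def list_nodes(self):
        return list(self._servers)


class FakeSoapDriver(NodeDriver):
    """Stands in for a SOAP-style provider; nothing is sent over the network."""

    def __init__(self):
        self._instances = {}

    def create_node(self, name):
        # A real adapter would build and send a SOAP envelope here.
        instance_id = "i-%04d" % (len(self._instances) + 1)
        self._instances[instance_id] = name
        return {"name": name, "provider": "fake-soap"}

    def list_nodes(self):
        return [{"name": n, "provider": "fake-soap"}
                for n in self._instances.values()]


def build_web_tier(driver, count):
    """Provider-agnostic build-out: only the common interface is used."""
    for i in range(count):
        driver.create_node("web%d" % (i + 1))
    return [node["name"] for node in driver.list_nodes()]
```

The point is that `build_web_tier` never needs to know which backend it is talking to. Swapping providers means passing in a different driver object, which is roughly the convenience the real wrapper libraries aim for.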
From my perspective, it seems that these wrapper APIs are being developed from the infrastructure up (Libcloud) and from the application down (jclouds). Depending on your perspective, either of these might fit your need. In a traditional systems administration environment where cloud is only part of the team’s responsibility, Libcloud seems more attractive. On the other hand, Java startups that don’t even have a systems administrator may find jclouds more comfortable.
The second major problem is resources. Operating system, disk space, memory, CPU, and bandwidth/throughput caps all become resources that are specified when a cloud instance is built. Usually, disk, memory, CPU, and network are specified together in a bundle, while the operating system is specified separately. The problem is that IDs for specific operating systems and resource bundles are generally different between providers. For example, CentOS may be image ID 22 at Rackspace, image ID 3 at Amazon, and image ID 45 in your private Eucalyptus instance. On the resource side, resource ID 0 might mean 256MB of RAM and one processor with a 33Mbps bandwidth cap at Rackspace, while it means something else at Amazon EC2. Libcloud and jclouds can’t help you with this yet. You still have to keep track of the different resource bundles and image IDs, but at least you have Python/Java and the dictionaries they provide to help.
It appears that several companies are starting to form around Libcloud to provide innovative features. Cloudkick was just acquired by Rackspace, and it seems there is a niche for more cross-cloud companies to come about, especially one that could ease the data problem of different resource bundles and images at different providers by offering a single point of entry.
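Here is a rough sketch of what keeping track of those IDs in a dictionary might look like. Everything in it is an assumption for illustration: the `IMAGES` and `SIZES` tables and the `resolve` helper are hypothetical, the image IDs just repeat the example numbers from above, and the size IDs other than Rackspace's 0 are invented. Real IDs would have to come from each provider.

```python
# Hypothetical catalog: logical names mapped to provider-specific IDs.
# Image IDs repeat the illustrative numbers from the text; the Amazon and
# Eucalyptus size IDs are invented placeholders.
IMAGES = {
    "centos": {"rackspace": 22, "amazon": 3, "eucalyptus": 45},
}
SIZES = {
    "small": {"rackspace": 0, "amazon": 7, "eucalyptus": 2},
}


def resolve(provider, image, size):
    """Translate logical image and size names into one provider's IDs."""
    try:
        return {
            "image_id": IMAGES[image][provider],
            "size_id": SIZES[size][provider],
        }
    except KeyError as missing:
        # Raised when the image, the size, or the provider is not in the catalog.
        raise LookupError(
            "no catalog entry for key %s (provider=%r, image=%r, size=%r)"
            % (missing, provider, image, size)
        ) from None


# The same logical request resolves to different IDs per provider.
rackspace_ids = resolve("rackspace", "centos", "small")
amazon_ids = resolve("amazon", "centos", "small")
```

With a table like this, the build scripts can ask for "centos" and "small" everywhere, and only the catalog changes when a provider is added or renumbers its images.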
This is a fast-moving space, and I suspect this problem may change rapidly, so I will post again in the future as I interact with it.
I hope to have a chance to experiment with Overmind. Though the name is reminiscent of 1984 and some government plan to have me call my fellow citizens brother, it is actually a cool project built on top of Libcloud and Django. It provides a REST API to interface with many cloud providers.