"Cloud Cloud Cloud, if you're not in it, you're out!"... or something

Wednesday, November 18, 2009

Since I graduated from the HIO Enschede (B.Sc. level) in '94, I have worked with a lot of different platforms and environments: from 4GLs like System Builder, uniVerse and Magic to C++ on AIX to Java to Perl on Linux to C# on .NET. All these platforms and environments had one thing in common: their creators were convinced their platform was the best and greatest and easiest to write software with. To some extent, each and every one of them was a decent platform and it was perfectly possible to write software with them, though I'll leave the classification whether they were / are the greatest and easiest to the reader. I'll try to make clear below why this dull intro is important.

Yesterday I watched the live stream of the PDC '09 keynote and in general it made me feel uncomfortable but I couldn't really figure out why. This morning I realized what it was and I'll try to explain it in this blog.

Cloudy skies

If one word was used more often than anything else in the keynote it was likely the word 'cloud'. Cloud, cloud, cloud, azure, cloud, cloud, azure, cloud, azure... and so on. Perhaps it's the weather in Seattle which made Microsoft fall so in love with clouds, I don't know, but all this cloud-love made me a little uneasy. This morning I woke up and realized why: it's too foggy. You see, the whole time I was watching the keynote, I had the idea I was watching the keynote of some conference about some science I have no knowledge about whatsoever.

"Cool, another guy talking about azure clouds with yet another set of fancy UIs I've never seen, giving me the feeling that not using those is equal to 'doing it wrong', but what the heck azure clouds are and what problem they're solving is beyond me". That kind of thing.

A long line of people was summoned on stage to tell something about some great tool / framework / idea / wizardry related to clouds, and with every person I lost more and more of my grip on what problem they all wanted to solve. All I saw was a long line of examples of Yet Another Platform with its own set of maintenance characteristics, maintenance UIs, maintenance overhead and thus maintenance nightmares.

More UIs, more aspects of things which were apparently new to software engineering yet nevertheless utterly essential to writing good software... more UIs I've never seen before, more cloudy weather, more azure flavors, more UIs I've never seen, more...

"Aaaaarrrgg!"

As I've tried to explain in the first paragraph, I've been around the block a couple of times. I have lived through internet bubbles, read McNealy's 'The Network is the computer' articles / propaganda, shook my head when I heard about Ellison's Java client desktop idea, waded through the seas of SOA and SOA-related hype material, so I have a bit of an idea what "Big computer with software somewhere + you" means. In this 'modern age' it's dubbed 'Cloud computing', though to me it looks like the same old idea that has been presented by various people in the past, but with new labels. With all these platforms presented in the past, there was really one issue: what was the problem they all tried to solve? Why would one want to use it? With Cloud computing, that same old issue hasn't been solved.

"I built it, you run it"

One aspect all these 'big computer with software + you' systems tried to sell was that they could run the software you wrote for you and you didn't have to worry about a thing. Well, you didn't have to worry about a lot, but you still had to worry about things, as the system was still Yet Another Platform with its own set of characteristics, flaws and weaknesses and, most importantly, differences from the development and test environment the software was written with.

The problem with software once it is written, tested and ready for deployment is that last stage: will it run in the environment on-site the way it runs locally in the test environment? And is that on-site environment easy to maintain?

In other words: the problem is that the environment the software has to run in isn't necessarily the same as the environment the software was written with / tested in, which could cause a lot of problems during deployment and after deployment. Other aspects like updating the environment due to security flaws, bugs in software etc. are also factors which add to the overall unpleasant experience of deploying and keeping software running.

So the answer to that problem should be a system which provides the following things:

An environment equal to the one the software was written and tested with.
The resources to keep the software running when the software requires them.
The security that the software keeps running, no matter what.

In other words: the software engineers build the software, test it, define the environment (as they've done that for development and testing anyway) and ship all of that in one package, and at the place where the software has to run, that exact same environment is provided, together with the resources required (like memory, CPU, a database connection). So "I built it, you run it". How the environment is re-created isn't important; the important thing is that the exact same environment is provided to the software, 24/7.

Are EC2, Azure and other cloudware solving the problem?

No. They provide Yet Another Platform but not the same environment. As they're yet another platform, you have to develop for that platform. The most typical example of that is that 'AppFabric', the newly announced application server from Microsoft, has two flavors: one for Windows and one for Azure. Why would anyone care? Isn't it totally irrelevant for a system in the 'cloud' what software (or what hardware) it is running? All that matters is that it can provide the environment the developer asked for so the developer knows the software will run the way it was intended.

Let's look at a typical example: a website of some company with a small database to serve the pages, a small forum and some other data-driven elements, not really complex. Today, this company has to rent some webspace somewhere, database space, bandwidth and most importantly: uptime. To make the web application run online, it has to match the rules set by the hosting environment. If that's a dedicated system, someone has to make sure the system contains all software the web application depends on, and that the system is secure and stays that way. If it's a shared hosting environment, the web application has to obey the ISP's rules for hosted web applications, e.g. it can use 100 MB of memory at most, can't recycle more than 2 times in an hour etc.

When Patch Tuesday arrives, and the web application runs on a dedicated server (be it a VM or dedicated hardware, doesn't matter), someone has to make sure that the necessary patches are installed, and that those patches don't break the application. Backups have to be made so if disaster happens, things can be restored. These all count as 'uptime' costs.

With a VM somewhere on a big machine this doesn't change: you still have to make sure the VM offers the environment the application asks for. You still have to patch the OS if a patch for it is released, and you still have to babysit the environment the application runs in or hire someone to do that for you. Either way, it always involves manual labor to make sure the environment online is equal to the environment during development and testing.

In the whole keynote I didn't hear a single argument for how Microsoft Azure is doing this differently. Sure, I can upload some application to some server and it is run. However, it doesn't run in the environment I ask for, but inside the environment Azure offers. That's a different thing, because it requires that the developer writes software with Azure in mind. If I have a .NET web application running on a dedicated server which uses Oracle 10g R2 as its database and I want to 'cloudify' that web application with Azure, I can't, because I have to make all kinds of modifications: for example, I have to drop the Oracle database for something else and also make other changes, as the environment provided by Azure isn't the same as the one locally.

EC2 and other cloudware do the same thing: they all provide 'an' environment with a set of characteristics, but not your environment. So in other words, they're not solving the problem, they only add another platform to choose from when writing software. Like we didn't have enough of those already. Sure, they offer some room for scaling when it comes to resources, but what happens when the image has to reboot due to a security fix that has been installed? Is the application automatically moved to another OS instance? Without loss of any in-memory data, so it looks like the application just ran along fine without any hiccup?

So what's the solution? What should Cloud computing be all about instead?

It should be about environment virtualization. I give you a myapp.zip and an environment.config and you run it. And keep running it. All software dependencies of my application, like 3rd party libraries, are enclosed in the application's image. That's not an image of an OS with the app installed, it's just the application. The environment.config file contains the description of the environment that the software wants, e.g. .NET 3.5 SP1, Oracle 10g R2 database, 2 GB RAM minimum, IIS7, domain name example.com registered to the app, folder structure etc. etc. So I outsource any babysitting of the environment of my application.
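
To make the idea a bit more concrete, here is a minimal sketch in Python of what 'matching' an environment.config against a hosting platform could look like. Everything in it is hypothetical: the EnvironmentSpec and HostOffer types, the unmet_requirements function and the example values are made up for illustration and don't describe any real product or file format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvironmentSpec:
    """What the application asks for (hypothetical stand-in for environment.config)."""
    runtime: str
    database: str
    web_server: str
    min_ram_gb: int


@dataclass(frozen=True)
class HostOffer:
    """What a hosting platform is able to supply (hypothetical)."""
    runtimes: frozenset
    databases: frozenset
    web_servers: frozenset
    max_ram_gb: int


def unmet_requirements(spec, offer):
    """List every requirement the host cannot satisfy; an empty list means 'I built it, you run it' works."""
    problems = []
    if spec.runtime not in offer.runtimes:
        problems.append("runtime: " + spec.runtime)
    if spec.database not in offer.databases:
        problems.append("database: " + spec.database)
    if spec.web_server not in offer.web_servers:
        problems.append("web server: " + spec.web_server)
    if spec.min_ram_gb > offer.max_ram_gb:
        problems.append("memory: %d GB" % spec.min_ram_gb)
    return problems


# The example environment from the text above.
my_app = EnvironmentSpec(".NET 3.5 SP1", "Oracle 10g R2", "IIS7", 2)

# An invented platform that offers 'an' environment, but not mine.
fixed_platform = HostOffer(
    runtimes=frozenset({".NET 3.5 SP1"}),
    databases=frozenset({"some other database"}),
    web_servers=frozenset({"IIS7"}),
    max_ram_gb=8,
)

print(unmet_requirements(my_app, fixed_platform))
```

With a platform that dictates its own environment, the list is non-empty and the application has to be changed. With real environment virtualization, the provider's job would be to make that list empty for whatever spec I hand over.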

That is incredibly complex. It might not even be doable. But it's the only way to make cloud computing something other than a new name for an old idea, despite the long list of well-known names who showed an even longer list of UIs and tools during a keynote.

Can Azure do what I described above? I honestly haven't the faintest idea, even after watching the keynote yesterday and reading up on some marketing material. That doesn't give me confidence, as it's in general not a good sign if a vendor has a hard time explaining what problem a product solves.

The problem seems pretty obvious to me. Cloud computing itself is nothing new, it's just all the web apps like Gmail and Mint we have been running for years. The problem is:

1. Maintaining all the servers, including patching, backups etc that go along with running a server that hosts your product is a pain for many people.
2. Adding new servers when you are expecting an increase in requirements is painful; now you have just doubled the pain from point 1.

Azure will largely solve these problems.

Craig - Wednesday, November 18, 2009 12:29:32 PM

@Craig: you say Azure will largely solve these problems, but how will it do that? Does it provide a virtualized Windows environment? Or do I run a VM with 2008?

I'm willing to believe what you say, but to me it's not clear that it will do that. For example, Amazon's EC2 has the same selling point but also can't provide you with a zero-maintenance OS layer: it provides a way to run an image (OS + app) at whatever spec you want, on 1 machine, on 10 machines, you name it. But that doesn't free you from Patch Tuesday. And having your app run on two VMs to provide 100% 'uptime' is not easy if your app isn't built with 'I run on multiple machines at once so there's no state' in mind.

FransBouma - Wednesday, November 18, 2009 12:42:30 PM

This article was a pleasure to read, especially after having seen the following production from IBM. I didn't think it was possible to combine two buzz techs into the same product but they managed it.....

WebSphere CloudBurst Appliance
Extend Smart SOA applications and services into a private cloud

David Hope - Wednesday, November 18, 2009 12:56:09 PM

What if as a developer I want to be able to develop the app directly in the cloud space that the app will occupy while running. Does Azure offer that?

And as far as uptime is concerned, this need to "cycle" the machine is why I have moved to Debian stable. The only reason you have to cycle the machine is to upgrade major kernel releases, and that's rare. Even with hardware upgrades, power failures and so on, it is not uncommon to have uptime exceeding 300 days. I don't know how that will translate to running Debian stable on Amazon EC2, but it might.

Christopher Mahan - Wednesday, November 18, 2009 1:12:50 PM

I thought one of the main ideas around the 'Cloud' was reduced provisioning time when you need to increase capacity etc...

Awkward Coder - Wednesday, November 18, 2009 1:51:12 PM

We are years away from a useful solution or useful offer in the cloud space. Who wants to part with their sensitive data? The whole thing feels a little like a solution in search of a problem. There will be something good that comes out of this... but I don't think it's what the pundits and marketing people proclaim right now.

Thomas Wagner - Wednesday, November 18, 2009 3:00:05 PM

The details of how the Windows Azure Platform solves the OS update problem haven't changed since last year's PDC. Maybe we didn't clearly communicate them enough.

Windows Azure is a virtualized OS currently based on Windows Server 2008 running on a specialized version of Hyper-V. You specify the number of virtual machines you want (per role) and we take care of the OS and app provisioning.

If the OS needs to be patched, a new VM is spun up based on the new OS image and your app is automatically deployed into that image. This new image comes online as soon as your app is initialized. The old machine is taken offline and recycled. If you have multiple machines per role, the load balancing will take care of the rest.
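
The replacement order described above can be sketched as a small simulation in Python. This is only an illustration of the sequence 'new VM comes online first, old VM goes offline afterwards'; the Instance type, the rolling_patch function and the image names are invented for this example and are not part of any Azure API.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # compare instances by identity, not by field values
class Instance:
    name: str
    os_image: str
    online: bool = True


def rolling_patch(pool, new_image):
    """Replace every instance with one on new_image.

    Returns the new pool and the lowest number of online instances
    observed at any point during the rollout.
    """
    current = list(pool)
    min_online = sum(1 for i in current if i.online)
    for old in pool:
        # Spin up the replacement; it is not serving traffic yet.
        replacement = Instance(old.name + "-new", new_image, online=False)
        current.append(replacement)
        # It comes online once the app has been deployed and initialized.
        replacement.online = True
        # Only then is the old machine taken offline and recycled.
        old.online = False
        min_online = min(min_online, sum(1 for i in current if i.online))
        current.remove(old)
    return current, min_online


pool = [Instance("web-1", "os-v1"), Instance("web-2", "os-v1")]
patched_pool, lowest = rolling_patch(pool, "os-v2")
print([i.os_image for i in patched_pool], lowest)
```

In this toy model the number of online instances never drops below the original count, which is the property the load balancer relies on.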

Updating your app or increasing capacity works largely the same.

Does this mean that your app needs to deal with state differently if you rely on in-memory state? Yes! Your app should only rely on in-memory state for caching, and all data that needs to be persisted needs to be persisted on disk. This can either be persistent local storage (which *is* preserved across VM recycles), SQL Azure or Azure tables and queues.

But not relying on the persistence of in memory state is always a best practice. You need to be prepared for system failure at any time even if you run your app in your own datacenters on your own physical machines.
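
As a minimal sketch of that practice, the Python below treats memory purely as a cache in front of a durable store. DurableStore and AppInstance are hypothetical names; the store is a plain dictionary standing in for whatever persistent storage is actually used, so this shows the pattern rather than a real storage API.

```python
class DurableStore:
    """Hypothetical stand-in for persistent storage (disk, database, table service)."""

    def __init__(self):
        self._rows = {}

    def save(self, key, value):
        self._rows[key] = value

    def load(self, key):
        return self._rows.get(key)


class AppInstance:
    """One running copy of the app; its in-memory dict is only a cache."""

    def __init__(self, store):
        self._store = store
        self._cache = {}

    def put(self, key, value):
        self._store.save(key, value)  # persist first, so a crash loses nothing
        self._cache[key] = value

    def get(self, key):
        if key not in self._cache:    # cache miss: fall back to the durable store
            self._cache[key] = self._store.load(key)
        return self._cache[key]


store = DurableStore()
first = AppInstance(store)
first.put("cart:42", ["book"])

# Simulate a VM recycle: the first instance and its memory are gone.
del first
second = AppInstance(store)
print(second.get("cart:42"))
```

The second instance starts with an empty cache and still finds the data, because nothing that mattered lived only in memory.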

Regarding an identical test environment: you can use a staging environment that is identical to production. When you hit the 'take into production' switch, this staging environment even instantly becomes the production environment. The development environment is different in scale and is simulated using the Dev Fabric provided with the Azure SDK and the Visual Studio Tools.

Erwyn van der Meer - Wednesday, November 18, 2009 7:23:57 PM

I guess you don't know what problems the Cloud addresses. Please think about resiliency, on-demand scaling, business continuity and other problems that are becoming more and more complex to address inside an organization.
Obviously, to solve these problems the application must observe some rules by design. These rules are enforced in the cloud but not on 99% of the other software platforms.
Then, for a sysadmin like me, as these policies are enforced by the infrastructure, it is much, much simpler to grow, to protect and to empower the infrastructure.

thebitstreamer - Thursday, November 19, 2009 7:30:17 AM

Wow!! That's some thoughtful insight :) Never thought about Azure this way

Haripraghash - Friday, November 20, 2009 12:49:14 AM

9 Comments