Intel provides view into cloud shift

Yesterday’s fascinating article in Wired says that Google is now Intel’s fifth largest client for server chips. In 2008, 75% of this Intel division’s sales went to IBM, Dell and HP. Five years later, the same 75% is spread across eight buyers, and Google is the fifth.

This provides a view of the shift away from owner-operator enterprise IT and towards buying compute as a service through Amazon or other IaaS providers.

It’s fair to assume that Amazon and Facebook are in that top eight, either directly or by proxy through a manufacturer (Quanta is one such). The article also mentions Facebook’s Open Compute Project, a strategy to reduce the acquisition and ownership costs of the hardware; the designs also cool more effectively.

IDC and fellow analysts don’t have solid data on how much of the total addressable server market is taken by this new breed of buyer. Wired cleverly called it the “server world’s Bermuda triangle” because of analysts’ poor visibility of spending in that zone.

The economic factors which favour IaaS providers such as AWS and Google Compute Engine:

  • They have buying power equal to or better than that of IBM, Dell and HP
  • They spend less on energy. AWS and Google run data centres with a power usage effectiveness (“PUE”, the ratio of total facility energy to the energy used by the IT equipment) of about 1.2. That is roughly 40 to 50% lower than most enterprise data centres, which are around 2 to 2.4. Facebook’s Oregon DC runs at 1.11. In Australia, the new NSW Whole of Government data centre is likely to have a PUE of 1.29
  • Their scale plus automation systems drive down operational costs to a greater degree than an enterprise IT buyer can easily achieve.
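
The energy point comes down to simple arithmetic. PUE is total facility power divided by the power that reaches the IT equipment, so for the same IT load the total power bill scales directly with PUE. A rough sketch in Python, assuming a hypothetical IT load of 1,000 kW (that figure and the function names are mine, for illustration only):

```python
def facility_power_kw(it_load_kw: float, pue: float) -> float:
    """Total facility power for a given IT load, since PUE = total / IT load."""
    return it_load_kw * pue


def saving_vs(baseline_pue: float, new_pue: float) -> float:
    """Fractional reduction in total facility power for the same IT load."""
    return 1 - new_pue / baseline_pue


IT_LOAD_KW = 1000.0  # hypothetical IT load, not a figure from the article

for baseline in (2.0, 2.4):
    total_before = facility_power_kw(IT_LOAD_KW, baseline)
    total_after = facility_power_kw(IT_LOAD_KW, 1.2)
    print(
        f"PUE {baseline} -> 1.2: {total_before:.0f} kW -> {total_after:.0f} kW, "
        f"saving {saving_vs(baseline, 1.2):.0%}"
    )
```

Against an enterprise data centre at 2.0 the saving is 40%, and against one at 2.4 it is 50%, which is where the range in the bullet above comes from.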

At this scale, with the prior stalwarts of server sales losing ground to providers who don’t resell the chips but instead sell a service, the inexorable domination of cloud computing is obvious.

[Image: a server aisle in Facebook’s Oregon datacentre]

In 2008 Nicholas Carr gave his now-famous analogy in The Big Switch: IT is leaving company data centres and shifting to cloud computing, mirroring the shift of electricity generation from local steam plants to the newly invented power grid.

Part of the reason enterprises are moving slowly to cloud is their (necessary) dependence on existing stable applications. That’s fair enough. All of my clients have systems which have been designed within the concept of traditional enterprise IT.

The next advance toward cloud comes from changes to the buying and architectural decisions of the IT organisation. It is to think of compute as a service, and to move to a service-oriented architecture. For example, to design apps on the assumption that the hardware underneath them will fail, as opposed to the current state where apps can trust the hardware to be available more than 99.9x% of the time.
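
To make that design shift concrete, here is a minimal Python sketch of a caller that assumes any single instance may be gone and simply moves on to the next one. The names `call_with_failover`, `dead_instance` and `healthy_instance` are hypothetical and only for illustration; this is not any provider’s actual API.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def call_with_failover(replicas: Sequence[Callable[[], T]]) -> T:
    """Try each replica in turn; raise only if every one of them fails."""
    last_error: Optional[Exception] = None
    for replica in replicas:
        try:
            return replica()
        except ConnectionError as exc:
            # Treat a connection error as "that machine is gone" and carry on.
            last_error = exc
    raise RuntimeError("all replicas failed") from last_error


# Hypothetical stand-ins for two instances of the same service.
def dead_instance() -> str:
    raise ConnectionError("instance terminated")


def healthy_instance() -> str:
    return "ok"


print(call_with_failover([dead_instance, healthy_instance]))
```

The app never assumes the first machine is there. Failure of one instance is a normal code path rather than an outage.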

Netflix put this eloquently in a ZDNet article. In explaining the philosophical design shift, their cloud architect said:

The typical environment you have for developers is this image that they can write code that works on a perfect machine that will always work, and operations will figure out how to create this perfect machine for them. That’s the traditional dev-ops, developer versus operations contract.

Instead, he says, Netflix now does it differently, and this is the point I’m making. In his words:

We don’t keep track of dependencies. We let every individual developer keep track of what they have to do. It’s your own responsibility to understand what the dependencies are in terms of consuming and providing [services].

We’ve built a decoupled system where every service is capable of withstanding the failure of every service it depends on.

Everyone is sitting in the middle of a bunch of supplier and consumer relationships and every team is responsible for knowing what those relationships are and managing them. It’s completely devolved — we don’t have any centralised control. We can’t provide an architecture diagram, it has too many boxes and arrows. There are literally hundreds of services running.
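
As an illustration of a service “withstanding the failure of every service it depends on”, here is a small Python sketch in which a page is assembled from two dependencies, and each call degrades to a default instead of taking the page down. This is my own toy example, not Netflix’s code: `call_or_default`, `recommendations`, `popular_titles` and `build_home_page` are all hypothetical names.

```python
from typing import Callable, Dict, List


def call_or_default(dependency: Callable[[], List[str]], default: List[str]) -> List[str]:
    """Call a dependency, but fall back to a default if it fails for any reason."""
    try:
        return dependency()
    except Exception:
        return default


# Hypothetical dependency that is currently down.
def recommendations() -> List[str]:
    raise TimeoutError("recommendation service unavailable")


# Hypothetical dependency that is healthy.
def popular_titles() -> List[str]:
    return ["Title A", "Title B"]


def build_home_page() -> Dict[str, List[str]]:
    """Assemble the page; a failed dependency yields an empty section, not an error."""
    return {
        "recommended": call_or_default(recommendations, []),
        "popular": call_or_default(popular_titles, []),
    }


print(build_home_page())
```

The page still renders with an empty “recommended” section. Deciding what a sensible default is for each dependency is exactly the responsibility the quote places on each team.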

A longer treatise on service-oriented design is in Steve Yegge’s accidentally published rant about how well Amazon get it (and how Google don’t, but that’s now to be tested in Google Compute Engine). Amazon started its transformation to a service-oriented architecture in about 2002.

It was about 25 years ago that I started programming, and about 15 years ago I stopped. I never had to develop in this paradigm (but I keep wanting to try). I haven’t led an IT organisation through this scale of change. So I’m not naively saying this change is easy. Steve’s post tells some of the story of how hugely difficult it is.

I don’t think it’d be easy to change a large organisation’s IT from a traditional mindset to one that fully exploits cloud computing, but damn the rewards and journey would be awesome.