See, it’s a whale! With containers! On its back! Like Discworld, but a whale instead of a turtle.
Ever since I first played with User Mode Linux (UML) back in the days of Linux 2.4 I’ve been working with virtualisation, normally being involved in server virtualisation activities wherever I’ve worked. The project I’m leading right now at Southampton is the conversion of our entire physical server estate to virtual on VMware.
Despite living and breathing these technologies I’ve never actually liked x86 virtualisation. It is a terrible waste of code and processor time. It virtualises the entire hardware platform as if the guest OS is actually running on real physical hardware – but why? And even this isn’t entirely true anymore – in all modern virtualisation products the guest OS is fully aware it’s being virtualised, and there are tonnes of ‘tools’ and ‘drivers’ running to facilitate communication between guest and hypervisor. It’s thus a hybrid – a mess of different approaches and compromises.
I entirely blame Microsoft for the growth of this odd x86 virtualisation market. Outside of the x86 world IBM and Sun created hardware-level virtualisation and OS-level virtualisation, but in x86 land, because of the proprietary and slow-moving nature of Windows, vendors sprang up creating the x86 hybrid virtualisation model – part hardware, part software. It meant you could run Windows inside a virtualised container and make better use of hardware – at the cost of enormous overheads and massive duplication of data. One of the most ridiculous things from an architecture perspective is that every x86 VM solution emulates a PC BIOS or UEFI firmware instance for every guest. Whatever for?
So for a long time I’ve been hoping that “OS-level” virtualisation would eventually assert itself and become the dominant form of virtualisation. I think it hasn’t because Microsoft joined the x86 virtualisation party by developing Hyper-V and rushing off to compete with VMware, and so the market has carried on down this odd virtualisation path. Architecturally there will always be a place for this type of virtualisation, but the vast majority of servers and virtual desktops don’t need it. They don’t need to pretend to be running on real hardware. They don’t need to talk to a fake BIOS. Clearly the x86 virtualisation vendors think this too, as each new generation of product has mixed more ‘paravirtualised’ components into the product – to improve performance and cut down on duplication.
So what’s the alternative? Real OS-level virtualisation! There are lots to choose from too. Solaris has Zones/Containers. FreeBSD has jails. AIX has WPARs. HP-UX has HP-UX containers. Linux predictably has lots to choose from: OpenVZ, VServer, lmctfy and LXC to name a few (and predictably, until recently, none were in the upstream kernel). LXC is the one everybody was talking about. The idea was to put acceptable OS-level virtualisation components into the kernel rather than just taking OpenVZ and shoving it in the kernel, which would have ended badly and never been accepted. Because of this LXC has taken a long time to write, and it has somewhat lost its ‘new! exciting!’ sheen.
LXC remains, however, the right architectural way to do virtualisation. In LXC, and all the other OS-level technologies, the host’s kernel is shared and is used by the guest container. No hardware is virtualised. No kernel is virtualised – only the userland components are. So the host’s kernel is still doing all the work, and that’s what the guest operating system uses as its kernel. This eliminates all the useless overheads and allows for easy sharing of userland components too – so you don’t have to install the same operating system N times for N virtualised guests.
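One easy way to see the shared kernel for yourself is to ask a process which kernel it is running on. The snippet below is only a minimal sketch using Python’s standard `platform` module; the function name `kernel_identity` is my own invention for illustration. The assumption is that you run it once on the host and once inside a container on that same host: because the container has no kernel of its own, both runs should report the same kernel release, whereas a full VM would report whatever kernel the guest OS booted.

```python
import platform


def kernel_identity():
    """Return the kernel details visible to this process.

    kernel_identity is a hypothetical helper name used for illustration.
    """
    info = platform.uname()
    return {
        "system": info.system,    # e.g. the OS family reported by uname
        "release": info.release,  # the kernel release string
        "machine": info.machine,  # the hardware architecture
    }


if __name__ == "__main__":
    # Run this on the host, then again inside a container on that host,
    # and compare the "release" values between the two runs.
    print(kernel_identity())
```

The userland around the process (libraries, package versions, distribution) can differ completely between the two runs; only the kernel is common.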
Sadly everybody’s experience with LXC for the past few years was along the lines of “oooh, that sounds awesome! is it ready yet?” and usually the answer was “not yet…nearly!”. All that changed last month though, as LXC 1.0 was released and became ‘production ready’. Yay! All we needed now, I thought, was for all the Linux shops to switch away from bulky, full-fat x86 hypervisors and start moving to LXC. Instead, by the time LXC 1.0 was released, something else had come along and stolen the show.
Enter Docker. Now, Docker actually is LXC. Without LXC, Docker wouldn’t exist. But Docker extends LXC. It’s the pudding on top which makes it into a platform literally everybody is talking about. Docker is not about virtualising servers, it’s about containerising applications, but it uses LXC underneath. The Docker project says that the aim is to “easily create lightweight, portable, self-sufficient containers from any application. The same container that a developer builds and tests on a laptop can run at scale, in production, on VMs, bare metal, OpenStack clusters, public clouds and more.”
So when I realised Docker was getting massive traction I was displeased, because I wanted LXC to get this traction, and Docker was stealing the show. However, I had missed the point. Docker is revolutionary. I wanted LXC to kill all the waste between the hardware and the server operating system’s userland components – the parts that are my day job. Docker wants to kill that waste, and all the waste in the userland of the operating system as well – the parts I hadn’t considered to be a problem.
For years vendors and open source projects have produced applications, released them and asked an IT department to install and maintain operating systems, install and maintain prerequisite software and then install the application and configure it. Then usually another team in the organisation actually runs and maintains the application. Docker has the potential to kill all of that waste. In the new world order the vendor writes the code and creates a container with all the prerequisite OS and userland components (except for the Linux kernel itself) and then releases the container. The customer only has to load the container and then use the application.
It is, then, the fairly well-established “virtual appliance” model seen in VMware/KVM/Hyper-V land, but with all the x86 hypervisor waste removed.
This has many benefits:
- The software vendor doesn’t have to maintain a full operating system that is expected to work on any number of virtualisation solutions and different fake hardware models. They only have to target LXC, with the host kernel doing all the difficult work.
- The software vendor can pick and choose whatever userland components they need and properly and fully integrate the application with the userland OS.
- The software vendor takes care of patching the userland OS and the application. The patching process is integrated. No more OS patches breaking the app. No more OS patching for the IT department to do.
- The customer IT department’s work is radically reduced. They only have to deploy the container image – a very easy procedure – and within seconds have a fully set up and ready-to-use application.
- An end to dependencies, prerequisites, compatibility issues, lengthy installation, and incorrectly configured operating systems and applications.
- And all the benefits of LXC – low overheads, high performance, and an end to the duplication of the same operating system.
- An end to having to upgrade and move applications because the guest server operating system is now end of life – even if the application isn’t.
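To put a rough shape on the “end to duplication” point, here is a back-of-the-envelope sketch. It assumes a deliberately simple model: every VM stores its own full copy of the operating system, while all the containers are built on one shared base layer that is stored once. The function names and the sizes used (100 guests, a 10 GB OS, 1 GB per application) are hypothetical numbers picked only to illustrate the arithmetic, not measurements of anything.

```python
def vm_footprint_gb(guests, os_gb, app_gb):
    """Each VM carries its own full copy of the operating system."""
    return guests * (os_gb + app_gb)


def container_footprint_gb(guests, base_gb, app_gb):
    """Containers sharing one base layer store that layer only once."""
    return base_gb + guests * app_gb


if __name__ == "__main__":
    # Hypothetical sizes, chosen only to show the shape of the saving.
    guests, os_gb, app_gb = 100, 10, 1
    print("VMs:", vm_footprint_gb(guests, os_gb, app_gb), "GB")
    print("Containers:", container_footprint_gb(guests, os_gb, app_gb), "GB")
```

In this toy model the VM estate grows with the OS size multiplied by the number of guests, while the container estate pays for the OS once. If vendors ship containers built on different base userlands, the saving is smaller, since only identical layers can be shared.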
So, today’s IT platforms probably consist of:
- A farm of physical servers running a hypervisor platform like VMware or KVM
- Hundreds if not thousands of virtual machines running only 2-3 different operating system flavours (e.g. RHEL5/6 or Windows Server 2008/2012) with a small number of VMs (<10%) running more exotic things
- Teams of infrastructure people maintaining the guest operating systems, using OS-level management systems such as RHN, Landscape, Puppet, Chef, Cfengine, Runit, etc., and spending a lot of time patching and maintaining operating systems.
- Teams of application people, usually without root, or even worse with root, having an uneasy relationship with infrastructure teams, installing applications and patching them (or probably not patching them) and maintaining them.
If Docker catches on the way I’d like it to (beyond what even the Docker project envisaged) then I think we’d see:
- A farm of physical servers running a Linux OS that acts as the LXC ‘hypervisor’
- Hundreds if not thousands of Docker containers containing whatever the vendor application needs.
- Teams of application people using the vendor-supplied web interfaces to manage the applications, patching them using vendor patching systems which integrate all the components fully, or just by upgrading stateless Docker instances to the latest version.
It seems that this vision is already a reality: https://coreos.com/. CoreOS envisages applications packaged as ‘Docker’ containers, and CoreOS as the minimalist platform hypervisor underneath. The IT departments’ sole job would be to install CoreOS onto hardware and then load Docker containers as needed from vendors, open source projects, and internal software development teams.
This is all very new and cutting edge. Docker 0.9 was only released a few weeks ago. CoreOS’s latest version is a major change. Other exciting areas of development with Docker are plans to let you swap out LXC and use OpenVZ or Solaris Zones or FreeBSD jails instead, thus opening Docker up to Solaris and BSD too. This is a very exciting new frontier which, if successful, will totally re-write how the IT world works. I can’t wait to see what happens next.