Only IT could come up with a word that is such an assault on the ears as “containerization.” We don’t “boxify” gifts on Christmas Eve, a fact for which we may give thanks the month before.
The word, if indeed it qualifies as one, refers to a concept of workload virtualization that transcends virtual machines as we have come to know them. Virtual machines (VMs) enabled existing workloads to move beyond the boundaries of single servers, and later beyond the firewalls of their data centers.
But they didn’t exactly set them free from there, and that’s the point. Containerization (such as Docker) shifts the focus of data center management from virtual machines (servers that don’t know they’re not real) to applications, running within virtualized components that know where they are and what they’re doing.
The Name of the Game is 'Orchestration'
According to multiple surveys, most organizations that are experimenting with containerization for the first time are not seeing the benefits of this new approach. That’s because they’re running container platforms inside conventional VMs, which is a bit like freeing a bird by letting it loose inside a penitentiary.
The whole point of containerization is to enable the data center as a whole — whether it be entirely on-premises, partly in the cloud, or scattered across the planet — to serve as the single staging ground for applications. In the 1990s, servers were traps for applications. In the 2000s, virtualization permitted “mobility” by moving the traps back and forth.
Now, containerization busts open the locks and lets applications roam around the data center. The problem now is finding the right management model to apply. There’s no all-purpose solution to this dilemma, only multiple strategies — many of which are actually ideas that could, at some point, congeal to become strategies.
In the containerization realm, managing multiple workloads is called orchestration. If you’re a musician, you’ll appreciate the analogy. The “orchestrator” is really the conductor of this operation. Your choice of conductor may not only depend upon, but also determine, the type of orchestra you end up using, and the nature of the works to be conducted.
1. Docker Compose + Docker Swarm
This is the system included with the open source Docker kit. DevOps professionals will be comfortable with Compose and Swarm because both are text-based: command-line tools that take their instructions from scripts.
Compose reads scripts that are rather like recipes. The recipe for a single image, called simply enough the Dockerfile, describes how any Docker system should collect the components of an application to create a container image. Compose's own script, docker-compose.yml, describes how the containers built from those images are assembled and connected as services.
Consider the differences here for a moment: A typical VM behaves like a computer. So the only way to produce an image of a VM that's good enough to be used in production is to run it like a computer, install all the services and resources on it as though it were a computer, then take a snapshot of it, shut it off, and call that snapshot the "golden image." There are IT professionals for whom this is essentially their entire job.
With Docker, a "golden image" does not necessarily pre-exist. If an application uses Ruby, then Compose reads a recipe that starts with an image of Ruby. It then packs in all the parts of the application that Ruby will interpret and run. This composition takes place whenever a container with that image is needed and the image hasn't already been built. If an application needs to "scale up," Compose can spin up more containers from that image.
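As an illustration of the two recipes, here is a sketch for a hypothetical Ruby application (the file names `app.rb` and the service name `web` are invented for the example): a Dockerfile that starts from a Ruby base image and packs in the application, and a Compose file that declares the service built from it.

```dockerfile
# Hypothetical Dockerfile: start from a stock Ruby image,
# then layer the application's code and dependencies on top.
FROM ruby:2.2
WORKDIR /app
COPY Gemfile ./
RUN bundle install
COPY . .
CMD ["ruby", "app.rb"]
```

```yaml
# Hypothetical docker-compose.yml: the "web" service is built
# from the Dockerfile in the current directory.
web:
  build: .
  ports:
    - "4567"
```

Running `docker-compose up` builds the image if it doesn't already exist and starts the container; `docker-compose scale web=3` puts together more containers from the same image.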
Docker Swarm (which the company warns is still officially in beta) uses a different set of scripts to pool multiple Docker hosts, and present their resources to Docker as a single system. This way, entire clusters can be treated as single units.
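As a sketch of the idea, using the swarm mode Docker later folded into the engine itself (host addresses and the service name here are placeholders), pooling hosts and treating the cluster as one unit looks roughly like this:

```shell
# On the first host: turn it into a swarm manager.
docker swarm init

# On each additional host: join the pool using the token
# printed by the command above (placeholder shown here).
docker swarm join --token <worker-token> <manager-ip>:2377

# From the manager, the cluster behaves as a single system:
# ask for three replicas, and Swarm places them across hosts.
docker service create --name web --replicas 3 nginx
```

The point is that the operator addresses the cluster, not the individual machines; Swarm decides where each replica actually lands.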
While the original Docker tools were designed to run on Linux, Microsoft will integrate them into Windows Server 2016 next year, enabling Windows containers for the first time.
2. Kubernetes
It is Google’s intention to advance Kubernetes so that it is perceived as the de facto orchestration tool for containerized environments, especially Docker. It is the culmination of Google’s own internal project to orchestrate workloads across its vast data centers.
Like Docker’s built-in tools, Kubernetes is script-based. But it is somewhat more sophisticated, concentrating first on the integration of data center resources into manageable clusters.
Then it gets down to the business of facilitating microservices — applications built around loosely coupled components that fulfill their roles through communication with each other. Communication requires a network, so Kubernetes weaves a complex, yet pliable, software-defined network specifically for containers.
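To make the microservices idea concrete, here is a minimal sketch (the name `api` and the image are hypothetical) of how Kubernetes exposes one component to the others over that network: a ReplicationController keeps three copies of a container running, and a Service gives them a single, stable, cluster-internal address.

```yaml
# Hypothetical: three replicas of an "api" component...
apiVersion: v1
kind: ReplicationController
metadata:
  name: api
spec:
  replicas: 3
  selector:
    app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
      - name: api
        image: example/api:1.0   # placeholder image
        ports:
        - containerPort: 8080
---
# ...and one Service that fronts all three on the cluster network.
apiVersion: v1
kind: Service
metadata:
  name: api
spec:
  selector:
    app: api
  ports:
  - port: 80
    targetPort: 8080
```

Other components never need to know where those three containers actually run; they simply talk to the `api` service, and Kubernetes routes the traffic.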
Shortly after Docker spearheaded the formation of the Open Container Initiative (formerly “Project”) for the standardization of container formats, Google spearheaded the formation of the Cloud Native Computing Foundation. That move stole much of the wind from the OCI’s sails, re-centering the discussion between vendors around orchestration instead of format — and shining a brighter spotlight on Kubernetes.
3. Mesos + Mesosphere + Marathon
Some time before the arrival of Docker on the scene, an open source project called Apache Mesos aimed to simplify the scheduling of workloads of all kinds in data centers.
Mesos is, in one sense, an operating system. It’s based on a kernel which schedules applications and manages the environment in which they run. Not altogether unlike the way Hadoop pools together storage and executes tasks in parallel against that storage, Mesos pools together diverse computing systems into clusters, then schedules and executes tasks against those clusters as a whole, including in parallel.
Unlike Kubernetes, Mesos is not centered on OCI (Docker) containers. While Mesos does pool storage, it also brilliantly pools memory (literally, the available RAM scattered among servers) as well as the compute capability of CPUs. As a result, in theory almost any simple relational database manager could become an “in-memory database.”
Enabling Mesos to handle OCI containers was all in a day’s work for its open source contributors.
Just as several new software vendors sprang up around the commercial Hadoop market, a company called Mesosphere was founded to produce a commercial Mesos. But Mesosphere didn’t just shrink-wrap Mesos and stick a price tag on it.
The new company produces a graphical front end for Mesos that is so simple to demonstrate that it leaves people speechless. Several months after Mesosphere’s DCOS premiered in the Linux realm, it was demonstrated to Microsoft developers at Build 2015 in San Francisco.
Their reactions, I’ve been told, resembled the scene where the apes discover the monolith in "2001."
DCOS (Data Center Operating System) is a console that depicts each cluster in a data center as a donut ring. Tasks are color-coded, and the relative time slices devoted to each task consume wedges in each ring. People who had not comprehended what it meant to pool together data center resources and distribute tasks among them understood it almost instantly, simply from reading the DCOS console.
Marathon is the orchestration system Mesosphere created for adapting DCOS scheduling to the purposes of Docker containers. In a move last April to keep Marathon from being perceived as an all-out contender for the container orchestration space, Google reached a deal with Mesosphere that led to Kubernetes being included as an alternative orchestrator to Marathon, in the DCOS package.
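A Marathon application is itself described declaratively. As a rough sketch (the image, identifiers, and resource values here are hypothetical), a JSON app definition posted to Marathon’s `/v2/apps` endpoint asks it to keep three Dockerized instances running:

```json
{
  "id": "web",
  "instances": 3,
  "cpus": 0.5,
  "mem": 256,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "example/web:1.0",
      "network": "BRIDGE",
      "portMappings": [{ "containerPort": 8080 }]
    }
  }
}
```

Submitted with something like `curl -X POST -H "Content-Type: application/json" -d @web.json http://<marathon-host>:8080/v2/apps`, the definition becomes a standing order: Marathon asks Mesos for the CPU and memory it names, and restarts instances wherever capacity exists if any fail.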
This gives customers an “either/or” choice that Mesosphere continues to explain as a purely subjective one, based on customers’ needs and expectations. Marathon’s key difference is a tighter fit with Mesos, which Mesosphere has said developers can use to their advantage by automating how Mesos fetches application-specific resources.
Google has gone so far as to suggest that certain data centers could use both Marathon and Kubernetes, depending on the application at hand.
In Part 2 of this report, we’ll look at other commercial alternatives to these vastly different scheduling systems, some of which try not to appear or act so different, in order to be more familiar and comfortable to IT departments skilled with VMs.