by aluciani 06-24-2016 06:58 AM - edited 06-28-2016 05:02 AM
When Nutanix entered the market seven years ago, the dominant trends in the datacenter were virtualization, flash, and Big Data. We set off on a journey to marry all three, and the Nutanix Distributed Storage Fabric was born. Over the last seven years, the datacenter has evolved dramatically, particularly with respect to simplifying operations and increasing speed and agility. The container movement, led by Docker, has taken off because it addresses all three. While the IT operations team drove virtual machine adoption, developers are driving container adoption in the name of "agility." For many, virtualization is now seen as legacy, but container technologies are not directly comparable to virtual machines and are in many respects far less mature.
Container technology offers many benefits, such as improved application portability across development, test, and production environments, regardless of whether the host is your laptop, an on-premises datacenter, or the cloud. Containers consume a small fraction of the compute resources of typical virtual machines, allowing for near-instant start times, application scaling, and increased application density, thereby saving customers time and money. Containers have come a long way in a remarkably short time and are now considered the de facto method for facilitating deployment, one of the costliest areas of software development.
To say the industry is excited about Docker is a huge understatement. After a few years of experience with Docker here at Nutanix, we can understand why: it provides an effective way to build, package, and distribute software. As Docker and the container ecosystem develop, it is clear that, in many respects, this movement both complements and competes with virtual machines. However, industry transitions take time and this applies even to an industry leading company, such as Docker. Customers and developers in the community have shared with us a range of issues, which we describe below, especially around problems with data persistence. In this blog, we talk about how we address several of these issues via the Acropolis Docker Volume Plugin, available with AOS 4.7.
Many Obstacles on the Path of Containerization
The container market remains highly fragmented. Customers are at different stages of adoption, with the vast majority in Dev/Test and very few organizations running containers in production. The reason for this may be that Docker in development is two commands (usually a webapp linking to a database running in a VM), or one command if you're using Docker Compose. Docker in production, however, is much more complex and not for the faint of heart.
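That two-command development setup might look something like the following sketch; the image name `mywebapp` and the legacy `--link` flag are illustrative, not a recommended production configuration.

```shell
# Start a database, then a web app linked to it.
docker run -d --name db postgres
docker run -d --name web --link db:db -p 8080:80 mywebapp

# Or, with a docker-compose.yml describing both services, the one command:
# docker-compose up -d
```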
Containers In Production Is Hard!
Following are questions I regularly hear from customers and users in the field (events, meetups, etc.) that illuminate the challenges Docker users wrestle with:
How are you persisting data between containers?
Where am I logging the output of all my containers?
How do I deploy Docker images across N machines?
How do I roll back quickly if a push was bad?
How do I automatically build and test containers?
There was a security patch for the base OS my container uses. Does my process depend on FROM a little too much?
Wait, why am I using a public base container image?
Why are my containers not communicating with each other? Did I configure my overlay correctly?
How many servers do I need to test a build quickly enough so I'm not getting yelled at by developers?
How am I collecting metrics from each container?
How are those metrics registering their time series with my DB?
What software am I actually using to extract metrics from those containers?
How am I ensuring a specific version across all my running containers?
Do I have a proper staging environment that's accepting production traffic?
How am I load balancing between hosts and containers on hosts?
How am I debugging connection issues inside a container? I don't even have netstat.
Have I load tested that logging system to auto-scale more than 10x on a weekend?
Oh crap, one of the five distributed systems my scheduler requires exploded, causing a cascade of failures. (Ok, that one’s not really a question).
How do I recover my secrets when my container died? Why did I put secrets in a container?
Unlike VMs, Docker containers are transient in nature, as is the storage assigned to them. When a Docker container goes away, the storage goes away with it. Much of the power of containers comes from the fact that they encapsulate as much of the state of the environment’s filesystem as is useful. As a result, when you restart a Docker container, the new container retains none of the changes made in the previously running container—those changes are lost. Yes, your data is lost!
Docker containers were initially developed for 12-Factor applications, an application development pattern pioneered by Heroku, an early PaaS provider. The 12-Factor method says that the state of your application should live outside the application, in a backing datastore (database, queue, cache, etc.). This pattern was great at the time (2011) if you were running only front-end applications, because it's not a problem when the containers go away. But what about the web cache of your web server, or the application logs that you might want to keep for an audit trail? What about secrets (SSH keys, certificates, password files, database credentials, etc.)?
You could use Docker Volumes (a bind mount between the container's file system and the host's filesystem) to store data outside the container's own union file system to persist data on the local host's file system. However, this binds the volume to a host, which is a single point of failure. Of course you can restart your container on another host, but your data does not move with it, and if the host or VM goes away your data is lost.
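A bind-mounted volume of this kind looks like the following; the host path and image are illustrative.

```shell
# Bind-mount a host directory into the container. The data survives container
# restarts, but it remains tied to this one host's filesystem.
docker run -d --name web -v /srv/app-data:/usr/share/nginx/html nginx
```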
Containers Have Sluggish I/O Performance
I/O performance isn't great in containers, and Docker itself recommends using volumes. By default, containers use a union file system mount that provides copy-on-write capabilities at the container level. This is Docker's secret sauce; it is what provides its git-like capabilities. Docker volumes, however, bypass the union file system and are initialized during container creation. If you want your data to persist outside the container lifecycle, you need to use a Docker volume.
The downside of Docker volumes is that your data is lost if the Docker host dies. Docker introduced the concept of "volume plugins" with Docker 1.8 as a way to facilitate communication between Docker and storage APIs. This allowed third parties to integrate with Docker, but the level of integration was very basic. Docker then added "named volumes" in 1.9, which let users manage volumes as atomic units instead of managing them inside a container. Together, these features make it possible to provide containers with highly available storage services that reside outside both the container and the host.
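With named volumes, the volume becomes a first-class object that you create and attach by name; the volume and container names below are illustrative.

```shell
# Create a named volume and attach it to a container by name.
docker volume create --name pgdata
docker run -d --name db -v pgdata:/var/lib/postgresql/data postgres

# Named volumes are managed independently of any container.
docker volume ls
```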
Introducing the Acropolis Docker Volume Plugin
With AOS 4.7, we have extended the Acropolis DSF (Distributed Storage Fabric) to provide persistent storage support for containers. The Nutanix Acropolis DSF Volume Driver is written in Go and runs as a Docker volume extension. It behaves as an intermediate container (runs in privileged mode), effectively a Sidekick container. The Nutanix Acropolis DSF Volume Plugin surfaces a link to the Acropolis DSF via iSCSI volume groups, exposing Acropolis DSF storage directly to containers and bypassing the hypervisor.
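Usage follows the standard Docker volume workflow; the driver name and volume name below are a sketch for illustration, not authoritative syntax.

```shell
# Sketch: create a DSF-backed volume through the Nutanix volume plugin and
# attach it to a container. Driver and volume names here are assumptions.
docker volume create --driver nutanix --name mysql-data
docker run -d --name mysql -v mysql-data:/var/lib/mysql mysql
```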
Thus, when you lose a container or the host, you still have access to your data. Moreover, because we leverage data locality, you get the added benefit of data mobility: data volumes always follow the container as it moves across the cluster to whichever host the container is running on. These features are unique to Nutanix. This form of data locality is what enables us to guarantee consistent performance against the noisy neighbor problem no matter the size of the Nutanix cluster. Another goal when developing support for containers was to deliver the consumer-grade UX that customers demand and expect of us. In our experience, early adopters of new technology already face a steep learning curve, so we did not want to be a choke point. Our solution is native to our implementation and requires no third-party drivers or tools.
As you progress in your journey of moving containers from Dev/Test to production, you can implement more advanced data services like disaster recovery, scheduled snapshots, real-time tiering (HDD, SSD, NVMe, or RAM), compression (inline/post-process), deduplication (inline/post-process), and erasure coding.
Container Persistence Models
Nutanix Docker Volume Driver
The Nutanix Acropolis DSF Volume Plugin for Docker and the Docker Machine Driver provide the following benefits to help our customers increase their pace of innovation and time to market:
Container-Native Integration. DSF and Docker Machine use the native Docker API and tooling.
High Storage Performance and Throughput. The Docker Volume Plugin uses the best-of-breed Distributed Storage Fabric. Nutanix DSF performance scales linearly.
Easy Install and Support. The DSF Docker Volume Plugin and the Docker Machine Driver work right out of the box and are fully supported by our award-winning support organization.
A Nutanix and Docker solution provides two great advantages: the web-scale IT of Nutanix and the speed and agility of Docker. Nutanix addresses the issues of data persistence and storage performance while complementing the speed and agility of Docker by making IT infrastructure invisible. For all the work that remains to be done with this relatively new platform, Docker has a lot to offer, which is why it is changing the way applications are built. We have only just begun bringing HCI to containers, so stay tuned for more to come.
Docker’s rapid build, ship, and run paradigm has realized the principal DevOps ideals of consistency, simplicity, and automation. Virtualization can enhance the DevOps benefits of containerization even further. Running containers on the Nutanix hyperconverged platform has many advantages and brings DevOps-style management to your virtualization infrastructure.
DevOps as a Technology Driver
We’re hearing more and more about containers as many enterprises are transforming their IT development processes to focus on DevOps. Traditional development models that create applications and then hand them over to operations are more difficult to maintain, less reliable, and slower than modern businesses can endure. DevOps is a culture and set of methods around integrating development and operations organizations into more cohesive structures that can rapidly deploy reliable applications.
From a cultural and structural standpoint, DevOps encourages IT to adopt three main concepts, as adapted from Gene Kim:
Systems thinking: approach development and operations as a giant system and account for everything in the system during build and deployment. This approach incorporates the entire business value stream and avoids separating technology from the business itself.
Feedback loops: create feedback loops that amplify the good and promptly eliminate the bad.
Continual experimentation and learning: understanding complete systems with tight feedback loops allows you to take incremental risks, fail quickly, and recover without delay. These short cycles facilitate constant improvement while minimizing overall risk.
DevOps technology focuses on these areas:
Automation: automating maintenance, provisioning, and orchestration to improve speed, consistency, and fault avoidance.
Source control: the ability to reproduce the same application repeatedly, over time, and the ability to back out changes and revert to a known good state.
Monitoring and dashboards: assessing the complete state of an environment: what's deployed, how it's performing, and its security status.
The combination of structure, culture, and technology in DevOps seeks to create reliable environments that are capable of continuous delivery. That is, instead of the large release and deployment cycles in traditional environments, a DevOps environment should be capable of generating small releases at high frequency. In some of the best examples, it's even possible to have dozens of updates a day.
DevOps and Docker Complement Each Other
Docker containers enable the key cultural and technological shifts that characterize DevOps:
Systems thinking: Containers reduce the number of variables in deploying an environment; this makes the overall delivery and production system less complex and easier to manage.
Continual experimentation: Containers allow you to roll out incremental changes and, if necessary, quickly revert to a known good state.
Continuous delivery: The ability to rapidly create, modify, deploy, and destroy containers means that you can implement an application update quickly in a new container, which can start seconds after you shut down the previous one.
Automation: It’s easy to automate Docker images and files. Images reside in a repository and can be pulled as needed. Dockerfiles are human or machine editable and automate the commands used to create a container. Docker Machine lets you automatically create and configure Docker hosts and supplies various commands for managing them.
Source control: Automated images and container files provide source control and revision history for containers.
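The automation and source-control points above can be sketched with a minimal Dockerfile kept in the application repository; the base image, file names, and tag are illustrative.

```shell
# A Dockerfile is plain text, so it lives in source control alongside the app.
cat > Dockerfile <<'EOF'
# Base image; pin a specific version or digest in practice.
FROM python:3-slim
# Application code from the repository.
COPY app.py /app/app.py
CMD ["python", "/app/app.py"]
EOF

# Build a tagged image; the tag records the revision, enabling rollback.
docker build -t myapp:1.0 .
```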
Docker supports DevOps by ensuring uniformity in OS and application environments across development, QA, and operations. The QA person can easily detect whether an application has an issue, because QA occurs in a container identical to the one used in the development environment. The operations process is streamlined in the same way. A new production container is simply another instance of the same container used by QA and development. This automated consistency across all environments supports a higher level of trust in the organization and can serve as a core building block for a DevOps shop.
Nutanix Provides a Foundation for Both Docker Containers and DevOps
The DevOps and Docker themes are fundamental aspects of the Nutanix platform as well. Virtualizing containers may seem counterintuitive at first, since containers are themselves a logical partition of a running OS. But since the machines running Docker must be installed and maintained like any other equipment in the datacenter, managing them as virtual machines becomes a practical solution.
While the white box approach has been common in the large, internet-scale environments where Docker first took root, many enterprises run application environments at a smaller scale that makes managing single-purpose platforms an inefficient choice. If you think of containers as part of the application stack, then it becomes clear that virtualizing host OS containers produces the same benefits as all server virtualization: VM-level high availability, disaster recovery, resource scheduling, and fast provisioning. The new container paradigm benefits from these foundational technologies provided by the more mature virtualization ecosystem.
A container must also exist on an operating system that you have to manage. If the Docker engine’s host OS exists on a physical server, that OS must be installed in a consistent way on each machine. This requirement necessitates a machine-level configuration and maintenance regime that must be performed against a live operating system on live hardware. Likewise, variance in the hardware itself can affect container performance. When the container is running in a virtual machine, the attributes you set, like the number of vCPUs, or amount of RAM, are hardcoded into the VM and remain consistent across different physical servers.
Similarly, when Docker containers exist inside VM instances, those VMs can be cloned and replicated like any other VM. This also reduces the operations overhead of managing the host OS across potentially hundreds or thousands of machines. The host OS for the container becomes a “golden image,” and you can use a clone of that VM, running on the same hypervisor and the same Nutanix platform, throughout development, QA, and release. Integrating cloned VMs and Docker images ensures absolute consistency at every level of the stack.
At the most basic level, the Nutanix invisible infrastructure takes care of hardware installation and management and provides the high-performance benefits of the Nutanix Distributed Storage Fabric (DSF) to every node. Nutanix Prism is a built-in, best-in-class infrastructure management tool that operates across the entire system and makes monitoring the environment and identifying hot VMs straightforward. You can still monitor individual applications at the container level, and Prism tracks performance and health at the infrastructure level, so you get a complete picture without a separate management stack.
Integrating container VMs with your other enterprise VMs on a Nutanix cluster yields higher efficiencies across the entire virtualized environment—there’s no need to keep container systems separate from the rest of the server plant. When the Nutanix system is projected to run out of compute or storage resources, simply add more nodes to scale the cluster seamlessly and redistribute VMs to the new nodes. This kind of consistency at every level of the infrastructure on a Nutanix platform reinforces a DevOps-style environment and makes it part of a larger enterprise infrastructure.
For more on Docker on the Nutanix Invisible Infrastructure, check out our Docker Containers on Nutanix Best Practices Guide.