The DevOps Process and Continuous Delivery (CD) – An Overview


This guest post by Joakim Verona, the author of Practical DevOps, Second Edition, gives a concise overview of the DevOps process and continuous delivery.

When you work with DevOps, you work with large and complex processes in a large and complex context. An example of a CD pipeline in a large organization is depicted in the following diagram:

[Figure: a CD pipeline in a large organization]

While the basic outline of this diagram holds true surprisingly often, regardless of the organization, there are, of course, differences, depending on the size of the organization and the complexity of the products that are being developed. The early parts of the chain, that is, the developer environments and the CI environment, are normally very similar.

The number and types of testing environments vary greatly. The production environments also vary greatly. This article will discuss the different parts of a CD pipeline.

The developers

The developers (on the far left in the above diagram) work on their workstations. They develop code and need many tools to be efficient.

Ideally, they would each have production-like environments available to work with locally on their workstations or laptops. Depending on the type of software that is being developed, this may actually be possible, but it’s more common to simulate or, rather, mock the parts of the production environments that are hard to replicate. This may, for example, be the case for dependencies such as external payment systems or phone hardware.
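For such hard-to-replicate dependencies, even a very crude stub can be enough during development. As a rough sketch, assuming the application only ever needs a fixed "approved" answer from the payment provider, a throwaway stub can be run with ordinary shell tools (note that some netcat variants need -l -p 8080 instead of -l 8080):

    # Throwaway stub for an external payment API (illustration only).
    # Serves the same canned JSON response to every connection.
    while true; do
      printf 'HTTP/1.1 200 OK\r\nContent-Type: application/json\r\n\r\n{"status":"approved"}\n' \
        | nc -l 8080
    done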

When you work with DevOps, you may pay more or less attention to this part of the CD pipeline, depending on whether your background lies mainly in development or in operations. If you have a strong developer background, you will appreciate the convenience of a prepackaged developer environment and work a lot with those.

This is a sound practice, since developers may otherwise need to spend a lot of time creating their development environments. Such a prepackaged environment may, for instance, include a specific version of the Java Development Kit (JDK) and an integrated development environment, such as Eclipse. If you work with Python, you may package a specific Python version together with the specific versions of the dependencies you need for your project. The Python community maintains several tools that do just this, such as Virtualenv and Anaconda.
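As a minimal sketch of such a pinned Python environment (the requirements.txt file and the example pin are assumptions standing in for your project's real dependencies):

    # Create an isolated environment for the project.
    python3 -m venv .venv
    . .venv/bin/activate
    # Install the exact dependency versions the project was developed
    # against; requirements.txt pins them, e.g. requests==2.31.0.
    pip install -r requirements.txt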

Keep in mind that you essentially need two or more separately maintained environments. The preceding developer environment consists of all the development tools you need. These will not be installed on the test or production systems. Furthermore, the developers also need some way of deploying their code in a production-like way. This can be a virtual machine provisioned with Vagrant running on the developer’s machine, a cloud instance running on AWS, or a Docker container; there are many ways to solve this problem.
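A hedged sketch of two of these options, assuming a Red Hat-like production environment (hence a CentOS base image; the box and image names are illustrative):

    # Vagrant: a production-like VM on the developer's workstation.
    vagrant init centos/7      # box name assumed to match the production OS
    vagrant up
    vagrant ssh

    # Docker: a lighter-weight container with the same base system.
    docker run -it --rm centos:7 bash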

It is recommended that you use a development environment that is similar to the production environment. If the production servers run Red Hat Linux, for instance, the development machine may run CentOS Linux or Fedora Linux. This is convenient because you can use much of the same software that you run in production locally and with less hassle. The compromise of using CentOS or Fedora can be motivated by the license costs of Red Hat and also by the fact that enterprise distributions usually lag behind a bit with software versions. If you are running Windows servers in production, it may also be more convenient to use a Windows development machine.

The Revision Control System

The Revision Control System (RCS) is the heart of the development environment. The code that forms the organization’s software products is stored here. It is also common to store the configurations that form the infrastructure here. If you are working with hardware development, the designs may also be stored in the RCS.

The following diagram shows the systems dealing with code, CI, and artifact storage in the CD pipeline in greater detail:

[Figure: the code, CI, and artefact storage systems in the CD pipeline]

For such a vital part of the organization’s infrastructure, there is surprisingly little variation in the choice of product. These days, many use Git or are switching to it, especially those using proprietary systems that are reaching end-of-life.

Regardless of the RCS you use in your organization, the choice of product is only one aspect of the larger picture. You also need to decide on the directory structure conventions and the branching strategy to use. If you have many independent components, you may decide to use a separate repository for each of them.
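For example, a simple feature-branch convention in Git (with hypothetical branch names) might look like this:

    # Work on a feature in isolation, then integrate it.
    git checkout -b feature/new-login
    # ...edit and commit...
    git checkout main
    git merge --no-ff feature/new-login
    git push origin main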

The build server

The build server is conceptually simple. It may be seen as a glorified egg timer that builds your source code at regular intervals or on different triggers.

The most common usage pattern is to have the build server listen to changes in the RCS. When a change is noticed, the build server updates its local copy of the source from the RCS. Then, it builds the source and performs optional tests to verify the quality of the changes. This process is called Continuous Integration (CI). Unlike the situation for code repositories, there hasn’t yet emerged a clear winner in the build server field. Nevertheless, Jenkins is one of the most widely used open source solutions for build servers; it works right out of the box, giving you a simple and robust experience, and is fairly easy to install.
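To make the "egg timer" idea concrete, here is a deliberately naive polling loop. It is a sketch only; a real build server such as Jenkins replaces it with configurable triggers, build queues, and reporting, and the build.sh and run-tests.sh scripts are assumptions standing in for your project's actual build and test steps:

    # Naive CI loop: poll the RCS, rebuild and test when something changed.
    while true; do
      git fetch origin
      if [ "$(git rev-parse HEAD)" != "$(git rev-parse origin/main)" ]; then
        git merge --ff-only origin/main
        ./build.sh && ./run-tests.sh   # hypothetical project scripts
      fi
      sleep 60                          # the "egg timer"
    done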

The artefact repository

When the build server has verified the quality of the code and compiled it into deliverables, it is useful to store the compiled binary artefacts in a repository. This is normally not the same as the revision control system.

In essence, these binary code repositories are file systems that are accessible over the HTTP protocol. Normally, they provide features for searching, indexing, and storing metadata such as various type identifiers and version information about the artefacts.

In Java, a pretty common choice is Sonatype Nexus. Nexus is not limited to Java artefacts, such as JARs or EARs; it can also store operating system packages, such as RPMs, artefacts suitable for JavaScript development, and so on.
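Because such repositories are essentially file systems behind HTTP, uploading an artefact can be as simple as an HTTP PUT. A hedged sketch against a hypothetical Nexus host, repository, and set of credentials:

    # Upload a built JAR to a (hypothetical) Nexus release repository.
    curl -u deployer:secret --upload-file target/myapp-1.2.3.jar \
      https://nexus.example.com/repository/releases/com/example/myapp/1.2.3/myapp-1.2.3.jar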

Amazon S3 is a key-value datastore that can be used to store binary artefacts. Some build systems, such as Atlassian Bamboo, can use Amazon S3 to store artefacts. The S3 protocol is open, and there are open source implementations that can be deployed inside your own network. One such possibility is the Ceph-distributed file system, which provides an S3-compatible object store.
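With the AWS command-line tools (and a hypothetical bucket name), storing an artefact in S3, or in an S3-compatible store such as Ceph, looks like this:

    # Copy an artefact to S3 (credentials assumed to be configured).
    aws s3 cp target/myapp-1.2.3.jar s3://example-artifacts/myapp/1.2.3/

    # The same CLI can target an S3-compatible store inside your network.
    aws s3 cp target/myapp-1.2.3.jar s3://example-artifacts/myapp/1.2.3/ \
      --endpoint-url https://s3.internal.example.com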

Package managers, explored in the next section, are also artefact repositories at their core.

Package managers

Linux distributions usually employ package-based deployment systems that are similar in principle but differ somewhat in practice.

Red Hat-like systems use a package format called Red Hat Package Manager (RPM). Debian-like systems use the .deb format, which is a different package format with similar abilities. The deliverables can then be installed on servers with a command that fetches them from a binary repository. These commands are called package managers.

On Red Hat systems, the command is called yum or, more recently, dnf. On Debian-like systems, the high-level command is apt (or apt-get), which builds on the lower-level dpkg tool; aptitude is an alternative front end.

The greatest benefit of these package management systems is that installing and upgrading a package is fairly easy—dependencies are installed automatically.
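For a hypothetical package named myapp, the two families look almost identical in use, and in both cases the package manager resolves and installs dependencies automatically:

    # Red Hat-like systems:
    sudo dnf install myapp      # or: sudo yum install myapp

    # Debian-like systems:
    sudo apt install myapp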

If you don’t have a more advanced system in place, it would be feasible to log into each server remotely and then type yum upgrade on each one. The newest packages would then be fetched from the binary repository and installed. Of course, as you’ll see, there are indeed more advanced systems of deployment available, which is why you won’t need to perform manual upgrades.
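Such a manual round could even be scripted in a pinch, with hypothetical hostnames; it works for a handful of servers but clearly does not scale:

    # Manual fleet upgrade over SSH (illustration only).
    for host in web01 web02 web03; do
      ssh "$host" sudo yum upgrade -y
    done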

Test environments

After the build server has stored the artefacts in the binary repository, they can be installed from there into the test environments. Test environments should normally be as production-like as possible, which is why it is recommended that they be installed and configured with the same methods used for the production servers.
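In practice, this means installing the exact artefact version under test with the same package manager commands used in production; for instance (hypothetical host, package, and version):

    # Install the release candidate on a test server the same way
    # it will later be installed in production.
    ssh test01 sudo dnf install -y myapp-1.2.3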

Staging/production

Staging environments are the last line of test environments. They are interchangeable with production environments. You can install your new releases on the staging servers, check that everything works, and then replace your old production servers with the staging servers, which will then become the new production servers. This is sometimes called the blue-green deployment strategy.
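One minimal sketch of the switch-over, assuming an nginx front end whose active server pool is chosen through a symlinked configuration file (the paths and file names are assumptions):

    # Point the front end at the green (staging) pool...
    sudo ln -sfn /etc/nginx/upstreams/green.conf /etc/nginx/upstreams/active.conf
    sudo nginx -t && sudo systemctl reload nginx
    # ...and switch the symlink back to blue.conf if problems appear.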

The exact details of how to perform this style of deployment depend on the product being deployed. Sometimes, it's not possible to have several production systems running in parallel, usually because production systems are very expensive.

At the other end of the spectrum, you may have hundreds of production systems in a pool. You can then gradually roll out new releases in the pool. Logged-in users stay with the version that is running on the server they are logged into. New users log into servers running newer versions of the software.

Not all organizations have the resources to maintain production-quality staging servers, but when it’s possible, it is a nice and safe way to handle upgrades.

If this article piqued your interest in DevOps, you can refer to the book, Practical DevOps, Second Edition, by Joakim Verona. The book is a must-have for developers and system administrators looking to take on larger responsibilities and understand how the infrastructure used to build today’s enterprises works.
