
One look at the landscape for Continuous Integration/Continuous Delivery or Deployment (CI/CD) products and something quickly becomes obvious: with few exceptions (GitLab comes to mind), products that cover most of the CI/CD spectrum are hard to find. There are products that help manage, version, and store code, create builds, package the code for deployment, scan for security defects, and deploy to test, staging, and finally production environments running on bare metal, virtual machines, and containers. Rarely does any one product cover all the phases of CI/CD.

Instead, a CI/CD pipeline usually encompasses many products, integrated by individual IT departments, and configured with code developed in-house. In some complex CI/CD pipelines, as many as 20 products may be present, with integration points developed using multiple APIs.

The sheer complexity can be overwhelming.

The biggest problem comes when you try to manage projects based on such a complex system. It’s tough enough managing Agile development, where burndown rates and technical debt are often based on subjective information. Now add the incompatible data from a large number of CI/CD components, and it’s a wonder IT management doesn’t go mad just trying to describe where they are in a process, let alone understand what needs to change.


Establishing Metrics in CI/CD Pipeline Tools

Here lies the grand conundrum. Complex projects need solid metrics to keep on track, but the complexity of the system makes it hard to get those metrics. The good news is CI/CD products produce a lot of real data, removing much of the subjectivity of development metrics. That gives IT managers a place to start.

Some ways to make use of that data are:

  • Big Data. CI/CD systems can produce a lot of uncoordinated data. Different products with diverse data structures produce both an embarrassment of riches and a tough data management problem. This issue can be dealt with in the same manner as reams of incompatible marketing, finance or other business data: using a big data approach. The same data warehousing, Hadoop clusters and Apache Spark-based systems used to normalize and find insights in other business data can be deployed to understand multiple, concurrent and complex CI/CD pipelines.
  • Focus on specific but meaningful metrics. If a big data approach is out of reach resource-wise or just seems like overkill, it helps to identify the key choke points and the metrics that have the broadest effects on CI/CD pipelines. For example, it makes sense to track and classify failed builds and deployments, or to count how often manual intervention is needed.
  • Track historical data. While it makes sense to manage the CI/CD process while it’s happening, tracking historical data is equally important. Historical data will often reveal trends and common problems that point directly toward future process improvements and areas where automation will have an impact.
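As a rough illustration of the ideas above, the following Python sketch maps records from two different tools onto one common schema and then summarizes failures and manual interventions by week, so trends can be compared over time. Everything in it is an assumption made for illustration: the `PipelineEvent` schema, the two adapter functions, and the record fields (`result`, `finished`, `retried_by`, `status`, `date`, `approval`) are hypothetical and do not correspond to any particular product's API.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PipelineEvent:
    """Hypothetical common schema for events from different CI/CD tools."""
    tool: str
    stage: str       # "build" or "deploy"
    day: date
    succeeded: bool
    manual: bool     # True if a person had to step in


def from_build_tool(rec: dict) -> PipelineEvent:
    # Assumed record shape from a hypothetical build server:
    # {"result": "SUCCESS", "finished": "2024-03-04", "retried_by": None}
    return PipelineEvent(
        tool="build_tool",
        stage="build",
        day=date.fromisoformat(rec["finished"]),
        succeeded=rec["result"] == "SUCCESS",
        manual=rec.get("retried_by") is not None,
    )


def from_deploy_tool(rec: dict) -> PipelineEvent:
    # Assumed record shape from a hypothetical deployment tool:
    # {"status": 0, "date": "2024-03-06", "approval": "manual"}
    return PipelineEvent(
        tool="deploy_tool",
        stage="deploy",
        day=date.fromisoformat(rec["date"]),
        succeeded=rec["status"] == 0,
        manual=rec.get("approval") == "manual",
    )


def weekly_summary(events):
    """Summarize events per (ISO year, ISO week, stage).

    Returns a dict whose values hold the total event count, the number of
    failures, the number of manual interventions, and the failure rate.
    """
    buckets = defaultdict(lambda: {"total": 0, "failed": 0, "manual": 0})
    for event in events:
        iso = event.day.isocalendar()
        bucket = buckets[(iso[0], iso[1], event.stage)]
        bucket["total"] += 1
        bucket["failed"] += int(not event.succeeded)
        bucket["manual"] += int(event.manual)
    return {
        key: {**bucket, "failure_rate": bucket["failed"] / bucket["total"]}
        for key, bucket in sorted(buckets.items())
    }
```

Feeding the output of both adapters into `weekly_summary` gives one comparable table per week and stage. The same normalize-then-aggregate pattern is what a data warehouse or Spark job would do at larger scale; a small script like this may be enough when only a handful of metrics matter.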


Making Sense of Complex Systems

These are just a few ideas. As more CI/CD products begin to cover a wider swath of the pipeline, access to consistent data will make managing CI/CD processes much easier. The emergence of CI/CD analytics, including AI-infused systems, will also ease the burden of handling multiple pipelines simultaneously. Until then, IT can use the tools already at its disposal to make sense of these complex systems, just as it does for all the other systems it manages.