Continuous Integration Illustrated

CI illustrated.

Nov 15, 2022

When I tried to help my team members understand Continuous Integration (CI) 14 years ago, I created a set of diagrams for a quick overview of CI and its steps. I want to share them in this article.

Continuous Integration, simply put, means running a server that checks source control for new changes and then runs a series of build tasks. Some common CI build tasks are listed below in order of execution:

Table of Contents:
· 1. Source Control Update
· 2. Database Migration
· 3. Compile
· 4. Unit Testing
· 5. Code Coverage
· 6. Package
· 7. Deploy
· 8. End-to-End Testing
· 9. Source Control Tagging
· 10. Publish

Here is a diagram:

Though the above diagram was created over 14 years ago, I believe it is still valid today.

As you can see, a CI process consists of 10 steps. A failure in any one build step (the step’s command returns a non-zero exit code) stops the whole build process and marks the build as failed. In other words, a good build must pass all the specified build steps.
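The fail-fast rule can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular CI server implements it; the function name `run_build` and the example steps are hypothetical, and each example command is a stand-in that simply exits with a given code.

```python
import subprocess
import sys

def run_build(steps):
    """Run (name, command) steps in order; stop at the first non-zero exit code."""
    for name, command in steps:
        print(f"[CI] {name}")
        exit_code = subprocess.run(command).returncode
        if exit_code != 0:
            print(f"[CI] {name} failed with exit code {exit_code}; build is broken")
            return exit_code
    print("[CI] all steps passed; build is green")
    return 0

# Hypothetical steps: each command is a stand-in that just exits with a code.
EXAMPLE_STEPS = [
    ("Compile", [sys.executable, "-c", "raise SystemExit(0)"]),
    ("Unit Testing", [sys.executable, "-c", "raise SystemExit(1)"]),
    ("Package", [sys.executable, "-c", "raise SystemExit(0)"]),
]

if __name__ == "__main__":
    sys.exit(run_build(EXAMPLE_STEPS))
```

With these example steps, the build stops at "Unit Testing" and the "Package" step never runs.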

The steps I listed are just common CI steps, and many of them are optional. For example, the ‘Compile’ step is unnecessary for Ruby on Rails projects. A CI process may also include other build steps, such as performance testing or API testing.

I would highlight three difficult CI steps (in ascending order of difficulty): “Database Migration”, “Unit Testing” and “Functional UI Testing” (the End-to-End Testing step). I trust few people will deny the need for these steps, though most have probably never seen them done correctly.

1. Source Control Update

Every CI build shall be run against the latest source (code and tests).

Performing a source update (such as git pull) is a built-in CI feature, and you don’t usually need to do anything for this step except install the source control command-line (client) tools.

Git has become the de facto standard for version control. For people who are new to version control, check out my other article: 10-Minute Guide to Git Version Control for Testers.

Update:

Make sure git fetch or git pull works from the command line, without the need to authenticate.
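One way to check this on the build machine is a small script that runs a fetch with all prompts disabled, so a missing credential shows up as a failure instead of a hung build. This is a sketch under assumptions: the function names are hypothetical, and setting `GIT_SSH_COMMAND` here overrides any value you may already have configured.

```python
import os
import subprocess

def noninteractive_git_env(base_env=None):
    """Return an environment in which Git and SSH fail instead of prompting."""
    env = dict(os.environ if base_env is None else base_env)
    env["GIT_TERMINAL_PROMPT"] = "0"  # Git: do not prompt for credentials on the terminal
    env["GIT_SSH_COMMAND"] = "ssh -o BatchMode=yes"  # SSH: do not ask for a password or passphrase
    return env

def can_fetch_unattended(repo_dir, timeout_seconds=60):
    """Return True if `git fetch --dry-run` succeeds in repo_dir without any prompt."""
    try:
        result = subprocess.run(
            ["git", "fetch", "--dry-run"],
            cwd=repo_dir,
            env=noninteractive_git_env(),
            stdin=subprocess.DEVNULL,
            capture_output=True,
            timeout=timeout_seconds,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False  # git not installed, directory missing, or the command hung
    return result.returncode == 0
```

Calling `can_fetch_unattended` with the path to the CI working copy, under the same user account the CI server runs as, gives a quick yes/no answer.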

2. Database Migration

Database schema changes, such as adding a column to a database table, shall be systematic. The best way to verify them (yes, we need to do that) is to run database migrations as part of the CI process. Manual changes to the database, such as editing the schema directly in SQL Server Management Studio, are unprofessional and error-prone.
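To make the idea concrete, here is a minimal sketch of a migration runner using Python's built-in sqlite3 module. It is not a replacement for a real migration tool; the table name `schema_migrations`, the `users` table and the two example migrations are all made up for illustration.

```python
import sqlite3

# Hypothetical migrations: (version, SQL) pairs, applied in ascending version order.
MIGRATIONS = [
    (1, "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"),
    (2, "ALTER TABLE users ADD COLUMN email TEXT"),
]

def migrate(conn, migrations=MIGRATIONS):
    """Apply migrations not yet recorded in schema_migrations; return the versions applied."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_migrations (version INTEGER PRIMARY KEY)")
    done = {row[0] for row in conn.execute("SELECT version FROM schema_migrations")}
    applied = []
    for version, sql in sorted(migrations):
        if version in done:
            continue  # already applied on this database
        conn.execute(sql)
        conn.execute("INSERT INTO schema_migrations (version) VALUES (?)", (version,))
        conn.commit()
        applied.append(version)
    return applied

if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(migrate(connection))  # first run applies every pending migration
    print(migrate(connection))  # second run finds nothing left to apply
```

Because every schema change is a versioned script, running `migrate` in CI against a fresh database exercises each change on every build.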

3. Compile

For software written in a compiled language, such as Java or C#, compile the latest code.

4. Unit Testing

Unit testing, the key concept of Test Driven Development, helps programmers produce robust code and, more importantly, better-designed code. If it is hard to write a unit test for a piece of code, its design is most likely not optimal.
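As a small illustration, the sketch below tests a pure function with Python's built-in unittest module and turns the result into a CI-style exit code. The function `order_total` and its discount rule are hypothetical; the point is that code with plain inputs and outputs is easy to test.

```python
import unittest

def order_total(prices, discount_rate=0.0):
    """Sum the prices and apply a fractional discount (0.1 means 10% off)."""
    if not 0.0 <= discount_rate <= 1.0:
        raise ValueError("discount_rate must be between 0 and 1")
    return round(sum(prices) * (1.0 - discount_rate), 2)

class OrderTotalTest(unittest.TestCase):
    def test_sums_prices(self):
        self.assertEqual(order_total([10.0, 5.5]), 15.5)

    def test_applies_discount(self):
        self.assertEqual(order_total([100.0], discount_rate=0.25), 75.0)

    def test_rejects_invalid_discount(self):
        with self.assertRaises(ValueError):
            order_total([100.0], discount_rate=1.5)

def run_tests():
    """Run the suite and return a CI-style exit code: 0 if all tests pass, 1 otherwise."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(OrderTotalTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return 0 if result.wasSuccessful() else 1
```

If `order_total` also read prices from a database or printed an invoice, each test would need far more setup, which is usually a hint that the design should be split up.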

Programmers who claim to be ‘refactoring code’ without a suite of unit tests are really ‘cowboys making changes and hoping for good luck’. No unit tests, no code refactoring.

Check out my other article: Unit Testing Clarified.

5. Code Coverage

Code coverage measures how much of the code is exercised by unit tests. It is a good objective indicator to include in CI for software projects that enforce Test Driven Development (TDD).
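Including coverage in CI usually means failing the build when the figure drops below an agreed threshold. A minimal sketch of such a gate follows; the line counts would come from whatever coverage tool the project uses, and the 80% default is only an assumed example, not a recommendation.

```python
def coverage_percent(covered_lines, total_lines):
    """Return line coverage as a percentage; an empty code base counts as fully covered."""
    if total_lines == 0:
        return 100.0
    return 100.0 * covered_lines / total_lines

def coverage_gate(covered_lines, total_lines, minimum_percent=80.0):
    """Return a CI-style exit code: 0 if coverage meets the threshold, 1 otherwise."""
    return 0 if coverage_percent(covered_lines, total_lines) >= minimum_percent else 1
```

A build script would pass the returned code to the CI server, so a drop in coverage breaks the build like any other failed step.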

6. Package

A software release package typically contains compiled code, configurations, web pages (with CSS and JavaScript), file templates, etc. This step is mostly concerned with using build scripts to package files into a specific format. For example, a WAR file is a ZIP-based archive format used for web applications developed in Java.
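As a rough sketch of what a packaging script does, the function below zips a directory tree with Python's built-in zipfile module. It is not a full WAR builder (a real one also has layout rules to follow); the function name `package_directory` is hypothetical.

```python
import zipfile
from pathlib import Path

def package_directory(source_dir, archive_path):
    """Zip every file under source_dir into archive_path, using paths relative to source_dir."""
    source = Path(source_dir)
    # Collect the file list first so the archive itself is never picked up.
    files = sorted(p for p in source.rglob("*") if p.is_file())
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in files:
            archive.write(path, arcname=path.relative_to(source).as_posix())
    return [p.relative_to(source).as_posix() for p in files]
```

The returned list of archived paths is handy for logging in the build output, so a missing file is noticed before deployment.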

7. Deploy

The target of deployment in a CI process is the test server instance for running automated End-to-End tests.

Aim to invoke just one command that does the whole deployment. Some might doubt this is possible, as I did in 2005, but it is. Since then, I have been using just one command to deploy all my web apps.

Update:

Deployment needs to be simple, reliable, and quick. With the popularity of cloud deployment, new deployment technologies have emerged, such as Chef and Docker/Kubernetes containers. Unfortunately, many DevOps engineers don’t use them well. (By the way, I think that is the wrong title for a person who solely does deployment, as deployment accounts for only about 5% of DevOps work in my experience.) More often than not, they make deployment over-complicated and, as a result, fragile and slow.

In 2019, none of the projects I saw using Docker/Kubernetes containers did it well. One was particularly bad: in my IT career of over 20 years, I had never seen a deployment process so fragile (and slow), even worse than the dark days of EJB containers. The test servers (a batch of containers) could barely function properly for one day. But I did learn one new thing: the “run out of inodes” error, which I had learned about in the Operating Systems course at uni, could actually happen.

I am not against new deployment technologies, provided they actually increase productivity and simplify the work. If the end results are the opposite of your goal, stop and revert until you find the right person who can do it properly. The software industry has plenty of costly lessons from blindly following new hype.

Check out my other article: Deploy Wisely with StackScript, Say No to Azure/AWS.

8. End-to-End Testing

This step executes automated functional tests against the test server(s) with the new version of the software deployed (by the previous step). This is, essentially, Continuous Testing (CT).

Why do this step? It serves the ultimate purpose of CI: if a build passes all steps, we can push the new version to production.

Some might not agree with the difficulty rating (very hard) I gave this step.

I would say we might have different perspectives. Most UI testing in CI, if present, is no more than smoke and mirrors. For me, CT is the core of software development. If all automated functional tests pass in CI, the build will be released to production. (I have automated tests for nearly all user stories and customer-found defects.)

Let me illustrate with an example. Let’s say we have 200 user-story-level functional tests written in Selenium WebDriver. On average, one test case has 30 test steps (each step represents a user operation, such as entering text, clicking a link or verifying certain text), and the execution time of a single test is 30 seconds. That makes 30 x 200 = 6000 test steps, and a full regression run takes 30 x 200 = 6000 seconds (100 minutes). To get a green build (all tests pass), every one of the 6000 test steps needs to pass during nearly 2 hours of test execution. A single failure results in a broken build.
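The arithmetic can be checked with a few lines of Python. The last part goes one step beyond the example: it assumes, purely for illustration, that each test step passes 99.9% of the time and that steps fail independently. Neither figure comes from a real project.

```python
TESTS = 200             # functional tests in the suite (from the example)
STEPS_PER_TEST = 30     # average test steps per test
SECONDS_PER_TEST = 30   # average execution time of one test

total_steps = TESTS * STEPS_PER_TEST
total_seconds = TESTS * SECONDS_PER_TEST
total_minutes = total_seconds / 60

def green_build_probability(step_pass_rate, steps=total_steps):
    """Chance that every step passes, assuming steps fail independently."""
    return step_pass_rate ** steps

# ASSUMPTION (illustrative only): each step passes 99.9% of the time.
ASSUMED_STEP_PASS_RATE = 0.999

if __name__ == "__main__":
    print(f"{total_steps} steps, {total_seconds} s ({total_minutes:.0f} min) run serially")
    chance = green_build_probability(ASSUMED_STEP_PASS_RATE)
    print(f"chance of a green build at {ASSUMED_STEP_PASS_RATE:.1%} per step: {chance:.2%}")
```

Under that assumption, the chance of a green build is well under 1%, so even very reliable individual steps are not enough at this scale.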

Now, do you agree this is a very, very hard task?

9. Source Control Tagging

If you want the ability to revert to a specific build, enable source control tagging in CI.
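Most CI servers can do the tagging for you. If you need to script it yourself, a minimal sketch with Git might look like the following; the `build-0042` naming scheme and the function names are hypothetical, and the remote is assumed to be called `origin`.

```python
import subprocess

def build_tag_name(build_number, prefix="build"):
    """Return a tag name such as 'build-0042' (hypothetical naming scheme)."""
    return f"{prefix}-{build_number:04d}"

def tag_build(repo_dir, build_number, remote="origin"):
    """Create an annotated tag on the current commit and push it; raise if git fails."""
    tag = build_tag_name(build_number)
    subprocess.run(
        ["git", "tag", "-a", tag, "-m", f"CI build {build_number}"],
        cwd=repo_dir,
        check=True,
    )
    subprocess.run(["git", "push", remote, tag], cwd=repo_dir, check=True)
    return tag
```

Checking out a tag created this way (for example, `git checkout build-0042`) returns the working copy to the exact source of that build.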

10. Publish

Notify the team about the build outcome.

Update:

I don’t recommend posting build results to a team’s Slack or MS Teams channel (which CI servers make very easy to configure), as it causes distraction. If you have a comprehensive automated End-to-End suite, the chances of false alarms (not real defects, mostly caused by poorly written tests and unreliable infrastructure) are high.

DevOps engineers need to manually inspect build failures and confirm that they are real defects, not false alarms. Only then should they notify the team in a formal channel.

The lava lamp notification system is good: it is not intrusive, and it gives DevOps some time (while the wax warms up) to verify before the ‘lava’ becomes active.

The Agile Way
