My Continuous Testing Journey

Execute automated test suite from Testing IDE ⇒ Build Scripts from Command Line ⇒ CI Server ⇒ CT Server

Nov 15, 2022

Image Credit: http://getdrawings.com/journey-drawing#journey-drawing-35.jpg

My approaches to running all automated UI tests (as regression testing) have gone through the following stages:

2006–2007: From the IDE. (This refers to running all regression tests. For developing and debugging individual automated tests, I have used TestWise since 2006.)
~2007: From the command line (build script)
2007–2008: In a CI server: CruiseControl
2009–2012: Customized CruiseControl with parallel execution, dynamic ordering, …
2012–present: In my own Continuous Testing server, BuildWise (made available to the public in 2017)

1. Run tests from IDE (2006–2007)

As a programmer (at that time), I naturally tried to run functional UI tests in IDEs (including NetBeans and my own TestWise). However, it didn’t take long for me to realize that this was not the right approach. Functional (UI) tests take much longer to execute than unit tests. For example, a small suite of 20 tests, each averaging 30 seconds, takes 10 minutes to run. This makes it impractical to run functional tests frequently in an IDE.
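The arithmetic above can be written as a tiny helper. This is only an illustration: the function name is made up, and the 100-test figure is a hypothetical suite size, not a number from my projects.

```python
def serial_run_minutes(test_count, avg_seconds_per_test):
    """Total wall-clock minutes when tests run one after another."""
    return test_count * avg_seconds_per_test / 60

print(serial_run_minutes(20, 30))   # the example above: 10.0 minutes
print(serial_run_minutes(100, 30))  # hypothetical larger suite: 50.0 minutes
```

The run time grows linearly with the number of tests, which is why a suite that is tolerable at 20 tests quickly becomes unworkable inside an IDE.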

2. Run tests from the command line (2007)

To free my IDE (to develop code and tests), I started to use build scripts to run tests from the command line. With build scripts, I could easily add customization to test executions, such as updating tests from Subversion and excluding certain tests.
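To give a flavour of what such a build script does, here is a minimal sketch in Python. It is not my original build script: the function names, the file names, and the exclusion pattern are all hypothetical, and the test runner command is left as a parameter because it depends on the test framework. The only external command assumed is the standard `svn update`.

```python
import fnmatch
import subprocess

def update_working_copy(path):
    """Pull the latest tests from Subversion before running them."""
    subprocess.run(["svn", "update", path], check=True)

def select_tests(test_files, exclude_patterns):
    """Return the test files that match none of the exclusion patterns."""
    return [
        f for f in test_files
        if not any(fnmatch.fnmatch(f, pattern) for pattern in exclude_patterns)
    ]

def run_tests(test_files, runner):
    """Run each file with `runner` (a command as a list); return the failures."""
    failed = []
    for path in test_files:
        result = subprocess.run([*runner, path])
        if result.returncode != 0:
            failed.append(path)
    return failed

# Hypothetical file names; "wip_*" marks tests that are not ready yet.
candidates = ["login_spec.rb", "payment_spec.rb", "wip_report_spec.rb"]
print(select_tests(candidates, ["wip_*"]))
```

Keeping the selection step separate from the execution step makes it easy to add further customization, such as running only a tagged subset.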

A major drawback of running tests from the command line is “No feedback until it completes”. Therefore, I turned to a Continuous Integration Server.

3. Run in CI Server (2007–2008)

In 2006, there was only one CI server available: CruiseControl, developed by ThoughtWorks. I set up a CruiseControl server to run our Watir test suite. Initially, it worked well. The team could easily trigger a build and view the changelog and build results, all in CruiseControl’s web interface. We could act quickly based on feedback.

However, with a growing number of test cases, it was getting harder and harder to pass all tests (a green build). UI tests are, by nature, more fragile than unit tests. A single test step failure (perhaps due to an issue on the server or the build machine) would fail the whole build. At the same time, the project had come to depend on passing all tests as the gatekeeper for releasing to the production server. Furthermore, when a build failed, developers were not allowed to check in new features, because further changes would complicate the fixing process.

The team embraced automated regression testing as the benefits were obvious. However, we could not cope with the growing test suite, and as a result, development halted.

4. Customize CruiseControl with parallel execution, dynamic ordering, … (2009–2012)

At that time, I could not find an existing solution to reliably run a set of automated UI tests (in Watir) daily. So I decided to extend CruiseControl (thankfully, it was open-source) with features that might improve execution stability and shorten execution time.

The two most important features I had in mind were:

• distributing automated test scripts to multiple build machines to run them in parallel, which would greatly reduce the execution time,

• auto-retry of a failed test script on another build machine to reduce the fragility of overall test execution.

I came up with a design and customized CruiseControl to support parallel testing and auto-retry. The code was by no means of good quality, but it worked.
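The two features can be illustrated with a minimal sketch. This is not the CruiseControl customization itself, only a simplified model of the idea: `run_distributed`, `build_passed`, the `execute` hook, and the machine names are all hypothetical, and tests are assigned to machines in a fixed round-robin order rather than by any dynamic scheduling.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def run_distributed(tests, machines, execute, max_attempts=2):
    """Run tests across machines in parallel, retrying failures elsewhere.

    execute(machine, test) -> bool is a caller-supplied hook that runs one
    test script on one build machine and reports pass/fail.
    Returns {test: [(machine, passed), ...]} listing every attempt.
    """
    # One lock per machine, so each machine runs one test at a time.
    locks = {machine: threading.Lock() for machine in machines}

    def run_one(index, test):
        attempts = []
        for attempt in range(max_attempts):
            # Round-robin assignment; each retry moves to the next machine.
            machine = machines[(index + attempt) % len(machines)]
            with locks[machine]:
                passed = execute(machine, test)
            attempts.append((machine, passed))
            if passed:
                break
        return test, attempts

    with ThreadPoolExecutor(max_workers=len(machines)) as pool:
        futures = [pool.submit(run_one, i, t) for i, t in enumerate(tests)]
        return dict(f.result() for f in futures)

def build_passed(history):
    """The build is green if the last attempt of every test passed."""
    return all(attempts[-1][1] for attempts in history.values())

# Simulated run: "checkout" fails on agent-2 (say, a machine-specific glitch).
def fake_execute(machine, test):
    return not (test == "checkout" and machine == "agent-2")

history = run_distributed(["login", "checkout", "search"],
                          ["agent-1", "agent-2"], fake_execute)
print(history["checkout"])    # failed on agent-2, then passed on agent-1
print(build_passed(history))
```

In this simulation, a failure caused by one machine no longer fails the whole build, because the retry lands on a different machine. With only one machine, the retry would simply run on the same machine again.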

With auto-retry and parallel testing, test execution reliability in CruiseControl increased greatly, and overall test execution time dropped. Below is a screenshot of 6 builds over 3 days (March 2010), each running 300 automated UI tests (in Watir) on two machines in parallel. (At that time, machines without an assigned user were not easy to obtain in a government department.) Later, we increased to 4 build machines, cutting the build time to around 75 minutes.

2.1-hour build time for 300 Watir tests running on 2 machines in parallel

The project was a great success (200,000+ test case executions over 14 months).

The team was confident in pushing the latest green build to production. Looking back, we were more or less practicing DevOps in 2009.

5. Create my own CT server: BuildWise (2012–present)

CruiseControl was abandoned soon after ThoughtWorks started working on its commercial CI product, which I did not like. By then, there were already several CI servers on the market, such as Hudson (later renamed Jenkins) and Bamboo. However, those CI servers were suited only to executing unit tests and lacked the features needed for long-running, brittle functional tests. To date, I have not seen a single successful Continuous Testing implementation with those CI servers. (My definition of successful CT: at least Level 2 of the AgileWay Continuous Testing Grading, i.e., running 50+ automated UI tests reliably every day.)

Under these circumstances, I decided to create my own Continuous Testing server with built-in support for the features I had added to CruiseControl, and more. Long story short, I developed BuildWise and have used it for my own software development since 2012. As of 2021-01-04, the total number of user-story-level test cases (in Selenium WebDriver) for ClinicWise (one of my own web apps) was 608, with over 600,000 test executions over the preceding 7 years.

I repeated the process for another app: WhenWise.

A recent CT build for WhenWise (my other app)

Our CT process enabled us to respond to customers’ requests promptly: 95% of customers’ feature requests or reported defects were implemented overnight and deployed to the production server the next day. By the way, at AgileWay, we have never used a defect tracking system, and probably never will. (Note: I am not totally against defect tracking systems. We just never needed one, as we are efficient at replicating issues in automated tests and have solid regression testing with a good CT process.)

Some of our test automation consulting clients have used BuildWise to replace their failed CI servers, such as Jenkins and Bamboo. (I am not saying these CI servers definitely won’t work for executing UI tests, only that I haven’t seen a successful case yet. It is certainly possible, just as I customized CruiseControl back in 2009. Of course, I declined clients’ requests to implement this for Jenkins, as I had BuildWise, whose server is free and open-source.) In 2018, BuildWise won the runner-up prize of the prestigious Ruby International Award, judged by Matz, the software legend and creator of Ruby.

The Agile Way
