My Innovative Solution to Continuous Testing: Parallel Automated End-to-End Test Execution
A case study of how I implemented parallel test execution for Automated End-to-End tests (via UI) in 2008.
This is included in the “My Innovative Solution to Test Automation and Continuous Testing” series.
Table of Contents:
· Challenge: Automation Regression Testing’s Growing Execution Time
· The idea of Parallel Test Execution
· Customize CruiseControl
· Parallel Testing Tab
· Stats
· The Birth of BuildWise
On 2022–08–31, Apple announced Xcode Cloud, a continuous integration and delivery service. One of its highlighted features is “run automated tests in parallel”.
I came up with the idea of parallel execution of automated end-to-end tests and implemented it in a CI server back in 2008. At that time, to my knowledge, no one else was doing it.
14 years ago, only two CI servers existed: CruiseControl (by ThoughtWorks) and Hudson (the predecessor of Jenkins). CruiseControl was better known in Australia back then. In terms of test execution, both CI servers were designed for executing only unit and integration tests (in JUnit).
Challenge: Automation Regression Testing’s Growing Execution Time
I set up a CruiseControl server for the project to run our Watir test suite. It worked quite well for a month or two. The team could easily trigger a build and view the changelog and build results, all in CruiseControl’s web interface. We could then act quickly on the testing feedback.
The team members were doubtful at first, but later all embraced automated regression testing for its obvious benefits, the most valuable being that many regression errors were detected early. (These Java programmers even started to learn and love Ruby.) As a result, the number of test cases grew quickly, and with more tests came longer test execution time. At first, I didn’t pay much attention to this. After all, it was my first test automation attempt with a considerable number of automated end-to-end tests driving a real GUI (previously, I had used HTMLUnit).
The execution time of the regression suite kept growing: 20 mins, 30 mins, 40 mins, 50 mins, and so on. It did not take long for everyone in the team to realize that we could not cope with the growing test suite. One failed test execution meant another run of all tests (~1 hour). The whole team’s development was heavily dependent on automated regression testing (in CruiseControl), but we couldn’t afford to wait that long every day.
Initially, we tried ignoring some test failures and checking in the code anyway, in the hope of resolving the issues at the end of the day. It soon proved a bad idea. The longer feedback time (compared to before) made bug-fixing a much more challenging and time-consuming task. Subsequently, our development halted. As a tech lead and a software engineer, I decided to do something about it.
The idea of Parallel Test Execution
I did some research on the Internet but could not find an existing solution to reliably run a large set of automated UI tests (in Watir) daily. In a few blog posts, I found that some people ran automated end-to-end regression tests in CI, but no one mentioned the long-feedback challenge. I guessed they were probably still struggling with the first hump of test automation: tests becoming hard to maintain (once the number of tests reaches about 20). Our team had overcome that hump thanks to maintainable automated test design and the functional test refactoring support in TestWise.
Therefore, there was no previous work that I could refer to (there might have been, but I couldn’t find it then). One day, the idea of parallel execution came to me, thanks to my study and previous work experience:
Two years of study for a Master’s degree in Information Technology, specializing in distributed systems.
3.5 years of work at the Distributed System Technology Center.
One of the courses I studied was “Parallel Computing”, and we used the state’s super-computing lab to do some parallel computing tasks. Though I had forgotten most of the course content, I remembered one thing clearly: parallel computing saves time. By executing tests on three machines in parallel, we could in theory save two-thirds (about 67%) of the total time. (An exact two-thirds saving is not possible in practice because of overhead, which I realized later.)
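The arithmetic behind that estimate can be sketched in a few lines of Python. This is only an illustration: it assumes the suite splits evenly across machines and models overhead as a fixed number of minutes, and the 5-minute overhead figure is made up for the example, not measured on the project.

```python
def parallel_time(total_minutes, machines, overhead_minutes=0.0):
    """Estimated wall-clock time when a suite is split evenly across machines."""
    return total_minutes / machines + overhead_minutes


def saving(total_minutes, machines, overhead_minutes=0.0):
    """Fraction of the sequential run time saved by running in parallel."""
    return 1 - parallel_time(total_minutes, machines, overhead_minutes) / total_minutes


# A 60-minute suite on three machines.
ideal = saving(60, 3)             # no overhead: 1 - 1/3
with_overhead = saving(60, 3, 5)  # 5 minutes of overhead is a hypothetical figure

print(f"ideal saving: {ideal:.0%}")
print(f"saving with overhead: {with_overhead:.0%}")
```

Any overhead (distributing scripts, collecting results, an uneven split) pushes the real saving below the ideal 1 - 1/n.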
So, the direction was set.
Customize CruiseControl
The next correct decision I made was to work on the CI server, not the test scripts. Some readers might have heard of the built-in parallelism feature of some newer test automation frameworks, such as Playwright; I think putting parallelism in the framework is wrong. At that time, our business analysts and manual testers had already started using the test automation scripts. I wanted the test scripts to be as simple and clean as possible: parallel test execution is a concern of the CI server and should not be mixed into the test scripts themselves.
CruiseControl was open-source (in Java), and I had experience patching open-source Java libraries before. So, I read the CruiseControl source code and started adding the parallel execution part.
The main work was actually on the agent software, which would be installed on multiple machines. This meant that I needed to write a new application to communicate with the CruiseControl server. The agent would mainly do the following tasks:
Get test scripts from the CI server
Execute the test script
Send the test result back to the CI server
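The three agent tasks above can be sketched as a simple loop. This is a minimal illustration, not the original CruiseControl customization: the names `CIServer`, `agent_loop`, `run_test_script` and `run_build` are hypothetical, the pull-based queue is an assumption, and threads in one process stand in for agents that in reality ran on separate machines and talked to the server over the network.

```python
import queue
import threading


class CIServer:
    """Hypothetical in-memory stand-in for the CI server."""

    def __init__(self, test_scripts):
        self._pending = queue.Queue()
        for script in test_scripts:
            self._pending.put(script)
        self._lock = threading.Lock()
        self.results = {}

    def next_test_script(self):
        """Hand out the next pending script, or None when the build is done."""
        try:
            return self._pending.get_nowait()
        except queue.Empty:
            return None

    def report_result(self, agent_name, script, passed):
        """Record which agent ran the script and whether it passed."""
        with self._lock:
            self.results[script] = (agent_name, passed)


def run_test_script(script):
    """Placeholder for launching a real test script.

    For the demo, any script with 'broken' in its name fails.
    """
    return "broken" not in script


def agent_loop(server, agent_name):
    while True:
        script = server.next_test_script()                 # 1. get a test script
        if script is None:
            break
        passed = run_test_script(script)                   # 2. execute it
        server.report_result(agent_name, script, passed)   # 3. send the result back


def run_build(test_scripts, agent_names):
    """Run all scripts across the given agents and return the collected results."""
    server = CIServer(test_scripts)
    threads = [
        threading.Thread(target=agent_loop, args=(server, name))
        for name in agent_names
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return server.results


if __name__ == "__main__":
    scripts = ["login_spec.rb", "search_spec.rb", "broken_checkout_spec.rb"]
    for script, (agent, passed) in sorted(run_build(scripts, ["agent-1", "agent-2"]).items()):
        print(script, agent, "PASS" if passed else "FAIL")
```

Because each agent pulls the next script only when it is free, a faster machine naturally ends up running more tests, and the test scripts themselves need no knowledge of parallelism.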
Like many things in life, once the direction was right and the determination was set, the task was not as complex as I had previously thought. I got it working with only a few weeks of work at night.
Parallel Testing Tab
With the PM’s permission, I set up the customized CruiseControl server + my build agent software at work. We started with two agent machines, which cut down the execution time by about 40%. That was very encouraging!
Some readers would probably wonder: why didn’t you add more build machines immediately? We couldn’t. (Money was not the problem for purchasing extra machines, but the procurement process was; this was a government project.) Please remember that this was 2008. Amazon EC2 was officially released (out of beta) on October 23, 2008, and many of us in Australia only became aware of cloud computing about a year after that.
Back then, our build machines could only come from our existing machines, i.e., the machines that our team members were using. My solution was to install the agent software on everyone’s machine; we scheduled runs at lunchtime and after work (5:30 PM).
The term “build agent” is what I use now. Back in 2008, I hadn’t named the agent application. Instead, I asked the team for name suggestions. Eventually, we agreed on “Build Slave”, which reflected the slave-master relationship to the CruiseControl server. When a team member came back from holiday, someone would tell them: “Zhimin slaverized your machine”.
When a team member was leaving, our PM would try hard to find a reason to keep the machine. Towards the end of the year, we had a relatively stable parallel-testing lab:
1 Customized CruiseControl Server + 4 Build Agent Machines
Stats
Here are some stats from the project using parallel testing in 2008–2010. (The 2008 data were wiped due to infrastructure changes.)
200,000+ Test Case Executions in 14 months
Automation Regression Suite: 250 test cases in Watir
30 mins full build time on 4 build machines
5–7 developers/testers/BA
Lava Lamp integration for build status indication (more on this in a separate article)
The Birth of BuildWise
The above-mentioned project ended in 2011. Six months prior, I was “borrowed” to work on a ‘sister’ project, and I brought this software package with me, which worked on the new project with great success as well. I knew my former colleagues were still using this customized CruiseControl on a daily basis. The whole software package was discarded when we all left this government department.
When I look back now, I probably should have written an article about my solution, and it might have helped some other projects.
ThoughtWorks abandoned CruiseControl in 2011 (after releasing its commercial CI server, Go), which meant that my customized solution was (or was about to be) obsolete anyway.
Based on my past experience, I started designing a new Continuous Testing server from scratch using my favourite language: Ruby. I did this for the following reasons:
I wanted to polish my Ruby skills.
I believe Ruby is a good language for coding a CI/CD server, and I could create a more user-friendly (easier to use) one.
More importantly, I needed such a server myself.
I named it BuildWise. BuildWise server is free and open-source. I first started using it in 2012 for my own app development. Along the way, I have added many new features (which I will cover in separate ‘My Innovation ..’ articles).
In 2018, BuildWise won the second prize in the Fukuoka Ruby Award, reviewed by Matz, the software legend and the creator of Ruby. According to my interpreter's translation of the award speech, Matz said: “BuildWise is very innovative!”
FAQ
1. Is Test Script Maintenance the №1 Challenge in Test Automation?
Yes, it is true. But we have overcome that with Maintainable Automated Test Design and the use of TestWise, especially, its functional test refactoring support.
Check out my other article: “Is Your Test Automation on Track? Maintenance is the key”
2. In terms of test execution, I think stability is more of a concern than the overall execution time. Don’t you agree?
Yes and no. Of course, test execution needs to be highly reliable; otherwise, all other work will be in vain. There are various techniques for stabilizing automated tests.
For the project in 2008, our team had already overcome the stability issue. The long feedback loop was the last challenge. For more, check out my article: Test Automation Camel, a metaphor that explains why most test automation attempts failed.
Also, back in 2008, JavaScript and AJAX were not used heavily in web apps, so, generally speaking, test execution was quite stable. Even so, we handled it well with this approach: Test AJAX Properly and Efficiently with Selenium WebDriver, and Avoid ‘Automated Waiting’.
3. Have you done similar customization for Jenkins?
No. I did receive the request a few times, but I declined. BuildWise is already there: free, open-source, and international award-winning. Why bother?
Of course, I understand the logic behind the request: senior engineers are resistant to introducing a new CI/CD server. But BuildWise is really a CT server (for executing end-to-end UI tests) and can coexist with Jenkins or Bamboo (for executing unit/integration tests). If those engineers could not realize that, I don’t think they would really appreciate the features I invented anyway.
When declining, I often said: “For Jenkins experts, it should not be that hard to implement the CT features you found in BuildWise (open-source). I could help explain how they were designed”. Not a single software architect or principal software engineer gave it a go. Therefore, real automated end-to-end regression testing was not achieved in those companies, where a traditional CI server was used to execute automated end-to-end tests.