Why Visual Regression Testing is Wrong, Mostly
It is mostly a wasteful activity. QA engineers should focus on Automated E2E UI Functional Regression Testing instead.
A repost of my past article on Medium.
A few weeks ago, one of my mentees asked me, “What is Visual Testing?”.
I replied, “Did someone want to showcase Visual Testing at work, for regression testing? If so, let me guess: he or she is a developer.”
The mentee replied, surprised, “Yes”.
This article is a write-up of the explanation I gave.
Table of Contents:
· What is Visual Testing?
· Why Visual Regression Testing is Mostly Wrong?
∘ The Story
∘ Visual Regression Testing is Expensive and Not Practical
· Why do People waste time doing this kind of useless Test Automation?
What is Visual Testing?
“Visual testing is a software testing technique that evaluates the visual appearance and behavior of a software application’s user interface (UI) or graphical user interface (GUI). Visual testing aims to verify that the application’s visual elements like colors, images, fonts, and layouts, are displayed correctly and consistently across different devices, operating systems, and browsers.” — BrowserStack.
According to this definition, Visual Testing is mainly used to check visual appearance “across different devices”. Visual testing is also known as Snapshot Testing or Screenshot Testing.
I’m not suggesting that visual testing has absolutely no value, but it is seldom required. In my 26 years in the software industry, I never needed automation software to check the UI appearance across different devices; manual visual checking was fine.
Some will say, “With the advancement of testing tools, we can now do Visual Regression Testing.” Really? I doubt it.
Why Visual Regression Testing is Mostly Wrong?
I started telling the mentee a story that happened ~17 years ago, which I wrote down in “Regression Testing Clarified”.
The Story
Many years ago, I joined a large .NET project as a contract software engineer. Technically, the project was a mess, and many quality (functional) issues were reported. The star programmer Rob, who was a senior technical consultant at a well-known international software consulting firm, proposed automation, especially for regression. A few weeks later, an internal demonstration meeting was arranged.
The newly joined project manager, who had worked with me on a previous project, suggested that I show my approach, too. I accepted, which I later thought was a mistake (not technically, but it embarrassed my colleagues). For this reason, I remember this session quite well.
In the meeting, Rob demonstrated his approach to regression testing:
1. Invoke a test to send input to the application (a workflow product), and save the output to a file, e.g. case1–1209.txt
2. On the next build candidate, repeat the above to save to a different output file.
3. Run a diff tool to compare these two files; if they are the same, there are no regression issues.
I was quite shocked. This naive ‘regression testing’ demo was well received by other team members. The tech lead even said, “cool”.
Then I stepped up and showed my regression testing approach, which I had been using in my own work for a while. I picked a case similar to Rob’s. Unlike Rob’s approach, mine was simpler:
1. Verify every step inside the test scripts
2. If all steps pass, the regression run of the test case passes!
I also showed a report of the execution history (daily for ~3 weeks) of my test suite (~50 test cases).
There was a moment of silence after my demo. Clearly, everyone realized this was real regression testing, as it did real verification. The output-comparison approach was no good:
1. If there is timestamp info, the output will never be the same
2. Adding an extra print statement will cause a ‘regression failure’
3. We have no idea (from examining the output) which step failed
4. There were no assertions, i.e., no testing
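The contrast between the two approaches can be sketched in a few lines of Python. The function names and the sample workflow output below are hypothetical, invented just for illustration; the point is that a diff-based check fails on any harmless change, while step-level assertions verify actual behavior and pinpoint the broken step.

```python
# Diff-based "regression testing" (Rob's style): compare whole output blobs.
def diff_based_check(old_output: str, new_output: str) -> bool:
    # Any difference at all -- a timestamp, an extra print -- counts as "failure".
    return old_output == new_output

# Assertion-based regression testing: verify each step explicitly.
def assertion_based_check(output_lines: list) -> None:
    # Each assertion verifies one step and names it when it breaks.
    assert "order created" in output_lines[0], "step 1: order creation failed"
    assert "order approved" in output_lines[1], "step 2: approval failed"
    assert "order shipped" in output_lines[2], "step 3: shipping failed"

# A harmless change (a new timestamp) breaks the diff-based check...
build1 = "2024-01-01 10:00 order created\n"
build2 = "2024-01-02 10:00 order created\n"
print(diff_based_check(build1, build2))  # False: a false "regression"

# ...while the assertion-based check still passes on the same behavior.
assertion_based_check(
    ["10:02 order created", "10:03 order approved", "10:04 order shipped"]
)
print("all steps verified")
```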
After seeing my demo, the team members probably were embarrassed (by their reactions to Rob’s demo). The PM was wise to notice the awkwardness. He stood up and quickly wrapped up the meeting.
Visual Regression Testing is Expensive and Not Practical
In the story above, a senior developer with limited knowledge of test automation and regression testing attempted to impress others with a fancy demonstration. His grasp of regression testing was confined to simply comparing the output. In that regard, it is not much different from “take-screenshot-then-compare” style visual regression testing.
Expensive
Consider the effort to save screenshots of every page for every release, and then compare (at least two sets).
Extremely Low Value
From an end-user’s perspective, minor UI differences do not matter at all. Users care about functionality, not whether a button’s gradient is slightly different from the previous release. Manual testers and business analysts have already checked the app's UI, many times.
Brittle
As we know, UI designers constantly refine the app’s look and feel. A simple CSS change, e.g. margin-top: 2px, while visually fine (to end-users), would break screenshot-style regression testing.
Meaningless report
I will list two “visual regression testing metrics” here: “Difference Percentage” and “Height Mismatch”. I could easily check heights with normal E2E testing (see Automated Testing Charts in Selenium WebDriver). With “difference percentage”, shall the team act on 1% or 5%?
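To see why a “difference percentage” threshold is arbitrary, here is a minimal sketch in plain Python. Screenshots are represented as grayscale pixel grids (no real screenshot library is involved, and the rendering function is hypothetical). A one-pixel margin shift that no end-user would notice moves whole rows of pixels, producing a large difference percentage.

```python
def difference_percentage(img_a, img_b):
    """Percentage of pixels that differ between two equally sized pixel grids."""
    total = len(img_a) * len(img_a[0])
    diff = sum(
        1
        for row_a, row_b in zip(img_a, img_b)
        for px_a, px_b in zip(row_a, row_b)
        if px_a != px_b
    )
    return 100.0 * diff / total

# A 10x10 "screenshot": a dark, 2-pixel-high button on a white background.
def render(button_top):
    img = [[255] * 10 for _ in range(10)]  # 255 = white, 0 = dark
    for y in range(button_top, button_top + 2):
        for x in range(10):
            img[y][x] = 0
    return img

before = render(button_top=4)  # previous release
after = render(button_top=5)   # margin nudged down by one pixel
print(difference_percentage(before, after))  # 20.0 -- huge "diff", invisible change
```

Is a 20% difference a regression? No human could tell the two screenshots apart, yet a 1% or 5% threshold would flag it either way.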
If some readers still want to debate, let me ask a simple question that will probably silence them.
What is your automated End-to-End (via UI) regression testing report today (and yesterday, …)?
Surely, functional testing is the most important.

There are two possible answers to my question:
1. Doing Automated E2E (via UI) regression testing well.
Speaking from my experience (see Showcase a 500+ End-to-End (via UI) Test Suite: E2E Test Automation), as a test automation engineer, I spent ~70% of the total software development effort (including requirements, coding, testing, support, …) on E2E UI Functional Test Automation and Continuous Testing for all my apps, because it provides the most value, making the team much more productive and enabling “daily production releases”.
Good E2E regression testing encourages developers to make more changes (which is a good thing). This leads to more test runs and more maintenance work.
Frankly, I don’t think you have time for ‘visual regression testing’. Every software project has a limited budget and deadline. Use your time and resources wisely. For me, most of the effort goes into Automated End-to-End (via UI) Functional Regression Testing.
2. No or poor Automated E2E (via UI) regression testing.
If so, visual regression testing is mostly a waste of time, don’t you agree?
Why do People waste time doing this kind of useless Test Automation?
There are two reasons: showing off (by developers) and pretending to work on test automation to justify their existence (by fake automated testers).
Recently, I have noticed a trend of fake automated testers doing meaningless automated testing, such as:
1. Cypress Component Testing
This crosses the line of ‘Black-Box Testing’. How silly!
“QA Engineers, Stay Out of Cypress Component Testing, for Your Own Sake!”
2. Cypress API Testing
API Testing, of course, does have value. But API testing should be done using HTTP libraries, which every programming language has. The problem here is using a browser E2E test automation tool for it. A question: “Shouldn’t a Cypress tester get the E2E (browser-based) test automation solid before worrying about API Testing?”
“Cypress API Testing Makes No Sense”
3. Visual Regression Testing
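On the API-testing point above: an API test needs nothing more than a plain HTTP library, which every language's standard library provides. Here is a minimal sketch using only Python's standard library; the tiny “health” endpoint is hypothetical, stood up in-process just so the example is self-contained. No browser or browser automation tool is involved.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# A stand-in API endpoint so the example runs on its own.
class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps({"status": "ok", "version": "1.0"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0 = pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# The actual API test: one HTTP call, then assertions on status and payload.
url = f"http://127.0.0.1:{server.server_port}/health"
with urllib.request.urlopen(url) as resp:
    assert resp.status == 200
    payload = json.loads(resp.read())
assert payload["status"] == "ok"
print("API test passed")
server.shutdown()
```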
Back to the story of that mentee. After seeing a colleague's demonstration of ‘Visual Regression Testing’, she messaged me, “Pretty much as you said. It is meaningless; everyone forgot about it quickly.”
Further reading: