Why Do Most UI Test Automation Fail? (Part 2: Wrong Choice of Test Syntax Framework)
Chose the wrong Gherkin BDD frameworks. Should have used simple RSpec or xUnit (spec style)
What is a test syntax framework?
Automation (also known as Driver) frameworks drive the application’s UI; test syntax frameworks provide the test structure and assertion mechanism.
Have a look at the test script below.
The highlighted lines (in light yellow) are test steps (or fragments) of an automation framework, in this case, Selenium WebDriver. The rest is provided by the test syntax framework, in this case, RSpec. Line 15 is an assertion step.
The granddaddy of the test syntax frameworks is JUnit created by Kent Beck and Erich Gamma in 1997. As its name suggests, it is designed for unit testing. With JUnit-style frameworks being widely used, JUnit variants were created for integration and functional tests as well.
How could a test syntax framework ruin test automation?
The test syntax framework, as shown above, is very simple, as it should be. Some might wonder how a syntax framework could ruin test automation. The answer is Gherkin (so-called Behaviour Driven Development) frameworks.
First of all, BDD ≠ Gherkin. The first and most popular BDD framework is RSpec (2007), the “BDD framework for Ruby”. I think RSpec has just enhanced xUnit (in Ruby of course) to offer more non-programmer-friendly syntax. In my opinion, it was RSpec that made Behaviour Driven Development (BDD) popular. For example, you will find Mocha (in JS) is quite similar to RSpec.
Behaviour Driven Development (BDD) has been a hot term in recent years and has been introduced to software testing. The idea of testing with BDD is not bad. It helps engineers to write tests from a different perspective (vs unit testing). All was well until people started using Cucumber, another type of BDD framework with “Given-When-Then” syntax, for test automation.
It’s unfortunate that BDD has been gradually misinterpreted as writing automated test scripts in a Gherkin-based syntax framework, such as Cucumber, SpecFlow, and JBehave/Concordion. This is wrong, very wrong, and totally unnecessary.
“So really, what is Cucumber? As a test tool it sucks. There are far better automated test tools” — Aslak Hellesøy, the creator of Cucumber (source)
Gherkin BDD Framework is no good for test automation
The failures of test automation with Gherkin syntax often end with a big embarrassment. The idea was usually from a fake Agile Coach, and the management bought it. The painted picture sounds great: Business analysts can write “Given-When-Then” as executable specifications; then engineers develop automated tests based on the user story on JIRA.
It is too good to be true. To my knowledge, every test automation attempt that used a Gherkin-based framework has failed. The biggest failure I have heard of: “The project spent 3 times of development efforts (measured time and money) trying to maintain those cucumber tests, eventually, dumped them to the bin!”
Why is Gherkin bad for test automation? The short answer is “hard to maintain”. If you want to know more, please read my other article: “Why Gherkin (Cucumber, SpecFlow,…) Always Failed with UI Test Automation?”. (this article is featured in ‘Start it up’, the publication on Medium)
What is a good test syntax framework, then?
1. The top tier (the test script) is a scripting language.
Common sense: a test script is in a scripting language.
Good examples are RSpec and Mocha (though I don’t recommend Mocha, because JavaScript is not a good language for test scripts. That is a different topic, which I will cover in the next article. As a test syntax framework, Mocha is OK).
Bad examples: All Gherkin frameworks (Cucumber, SpecFlow, … ) and FitNesse.
I have seen many test ‘syntax frameworks’ (some created by developers in the team) in different text formats, such as HTML, XML, JSON, and Gherkin. They all share the same result: failure.
If the top-level scripts are not in the same language as the lower tiers, as with Cucumber (Given-When-Then vs Java/Ruby), an extra parser layer (such as Cucumber’s Step Definitions) is required. This greatly increases the maintenance effort.
For example, with a good RSpec syntax framework, I can do this.
person_page.enter_birth(17.years.ago.yesterday.strftime('%d/%m/%Y'))
In Gherkin tests,
Enter new person's birth date as one day before his/her 17th birthday, in day/month/year format.
Of course, the above ‘English test step’ will not execute just yet. Test Automation engineers need to write step definitions to interpret these Gherkin steps.
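To give a rough idea of what that extra layer involves, here is a minimal sketch of a hand-rolled step registry in plain Ruby. It is not Cucumber's API. PersonPage, STEP_DEFINITIONS, run_step and the simplified step wording are all hypothetical names made up for this illustration, and the page object only records the value instead of driving a browser.

```ruby
require 'date'

# Hypothetical page object (bottom tier). A real one would call Selenium WebDriver;
# this stand-in just remembers what was "typed" into the field.
class PersonPage
  attr_reader :birth_date

  def enter_birth(date_text)
    @birth_date = date_text
  end
end

# Hypothetical middle tier: each English step needs a pattern plus a block
# that translates the captured text into calls on the page object.
STEP_DEFINITIONS = {
  /\AEnter birth date as (\d+) years ago yesterday\z/ =>
    lambda { |page, years|
      date = Date.today.prev_year(years.to_i).prev_day
      page.enter_birth(date.strftime('%d/%m/%Y'))
    }
}.freeze

# Find the first pattern that matches the step text and run its block.
def run_step(page, text)
  STEP_DEFINITIONS.each do |pattern, action|
    match = pattern.match(text)
    return action.call(page, *match.captures) if match
  end
  raise ArgumentError, "Undefined step: #{text}"
end

page = PersonPage.new
run_step(page, 'Enter birth date as 17 years ago yesterday')
puts page.birth_date
```

Every rewording of the English step means touching the pattern as well, which is the maintenance cost in question. In the RSpec version, the test script calls enter_birth directly and this whole registry disappears.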
2. Maximum two Tiers
More abstractions (layers) in test scripts only mean more complexity.
With Gherkin, there are three tiers.
And this middle tier will require the most effort in terms of maintenance. If a team is serious about test automation (Level 2+ in AgileWay Continuous Testing Grading, i.e., 50 user story level tests as regression testing), the effort of maintaining this tier alone will kill the test automation.
3. Syntax Validation, Highlighting, and Reformatting
By using a mature scripting language, such as Ruby, you can have many other benefits such as
Syntax Validation
Syntax Highlighting
Easy to reformat (a.k.a pretty-print) test scripts
All of the above will be very helpful in spotting syntax errors.
4. Auto-Completion and Refactoring support
Good testing IDEs can increase test engineers’ productivity by
Auto-Completion (also known as IntelliSense)
The above is not possible with Gherkin frameworks.
5. Flexible for test data
By using a good scripting language in the top tier, test engineers can easily set dynamic test data, like the one below.
new_claim_page.enter_accident_date(2.days.ago)
The above is different from the English text in Gherkin. This is a valid Ruby script (no need for interpretation).
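A note on the snippet above: 2.days.ago comes from ActiveSupport's core extensions, so I am assuming that library is loaded in the test project. Where it is not, a rough plain-Ruby equivalent can be sketched with the standard library's Date class. The days_ago helper and the date format here are hypothetical, for illustration only.

```ruby
require 'date'

# Hypothetical helper: a plain-Ruby stand-in for ActiveSupport's n.days.ago,
# reduced to a calendar date. The today: keyword makes it easy to pin the date.
def days_ago(count, today: Date.today)
  today - count
end

# Format the value the way a date field might expect it (format is an assumption).
accident_date = days_ago(2).strftime('%d/%m/%Y')
puts accident_date
```

Either way, the test data is computed when the script runs, so the test does not go stale as a hard-coded date would.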
6. Easier to learn and master in minutes
The structure and assertions syntax shall be simple and intuitive to every team member, not just to developers. A good example is RSpec. Below is my under-one-minute tutorial of RSpec to participants who attended my one-day test automation training.
Sample assertion:
expect(driver.title).to eq("WhenWise")
expect(driver.page_source).not_to include("Complex")
Arguments
Some readers may argue that “While DHH clearly does not like Cucumber (the tweet I referenced in my article), he did not like RSpec, the one you are recommending, either.”
DHH likes the simplicity of the xUnit style.
If you read the related posts and discussions (as you can tell from his other tweet), DHH was against using some fancy RSpec features with integration tests. In fact, I have always used RSpec in a simple way, like test/unit or Minitest, for automated UI tests.
What do I like about RSpec for functional testing?
I develop functional test scripts in a simple form, as DHH suggested. I could use Minitest, but I prefer RSpec because of the following features:
- RSpec’s syntax for shared sections is more readable. after(:each) is better, in terms of readability, than tearDown, isn’t it?
- RSpec’s assertion syntax is slightly more readable, e.g. expect(driver.title).to eq('WhenWise')
- I like RSpec’s ability to execute an individual test by its line number, e.g. rspec login_spec.rb:10. This is far better than specifying the test name.
- Instead of using def test_login_successfully in the old test/unit, I can use a full sentence as the test case name in RSpec, such as it "login successfully". This was the first reason I switched to RSpec from test/unit. Minitest, which came later, supports the spec style (like RSpec).
- RSpec is classified as a BDD framework (well before Cucumber existed). This helps it compete against, and exclude, the bad Gherkin ones, and differentiates these tests from unit tests.
In summary, you can write RSpec BDD tests with the simplicity of xUnit and extra benefits.
Recommendation
RSpec. It is simple and widely used (206 million downloads for v3.8.0 alone).
Related Articles:
Why Gherkin (Cucumber, SpecFlow,…) Always Failed with UI Test Automation?
Series: Why Do Most UI Test Automation Fail? (Technical)
- Part 1: Wrong choice of automation framework
- Part 2: Wrong choice of test syntax framework
- Part 3: Wrong test scripting language