ChatGPT is Useless for Real Test Automation

Avoid yet another hype

Jan 01, 2023

Coding Jag (a software testing newsletter) #120 featured three articles on ChatGPT:

ChatGPT for testers
Using ChatGPT for Test Automation
ChatGPT for iMessage, Slack, user feedback, and beyond

I am aware of this because my daughter’s article, Playwright vs Selenium WebDriver Syntax Comparison by Example, is also featured in Coding Jag #120.

Some of you might have heard of “ChatGPT” and, more generally, AI in test automation. I will give my conclusion first: “ChatGPT is totally useless for Real Test Automation”. I will illustrate with examples from ChatGPT.

Table of Contents:
· A Simple User Login Test
∘ My Assessment
∘ How about a more complex test scenario?
· Argument 1: “It generates skeleton”. Yes, in the wrong way.
· A much better way to generate a test skeleton.
· Argument 2: “It generates test data”. Yes, again, there is a better way.
· Argument 3: “By using ChatGPT in more sessions, you can train it to do better”

A Simple User Login Test

A user login test is regarded in test automation as the “Hello World” of coding. Logically, we expect AI testing to score perfectly on this simplest and most widely practised test scenario. Let’s see how well ChatGPT performs.

1. My Request:

2. ChatGPT’s Answer (part 1: test design in steps)

3. ChatGPT’s Answer (part 2: test script)

My Assessment

I will focus only on the test script part (the test design is not worth discussing). Please note that I specified only “user login automated test”, without mentioning any framework or scripting language.

Pros:

The test script uses the Selenium WebDriver framework.
A good choice. Real automated testers take this for granted. However, considering how many fake automated testers promote Protractor (deprecated), TestCafe (hardly seen now), Cypress and Playwright, at least ChatGPT did not make a fundamental error: it stuck with the W3C WebDriver standard.
The test script is in Python, a scripting language.
Automated Test Scripts Shall be in the Syntax of a Scripting Language, such as Ruby or Python. We have seen many fake automated testers use C#, Java and JavaScript (JS is not a pure scripting language). Not long ago, I saw a survey showing that Java, a compiled language, is the most used language in the Selenium community. I will show further proof shortly, related to ChatGPT.
ChatGPT uses Python over Ruby for an obvious reason: its involvement with the Python language.

Cons:

It is not a real test case.
A real automated test case shall be in a test framework, such as JUnit for Java and RSpec for Ruby. pytest is a popular test framework for Python, but ChatGPT does not use it. Its assertions are done with the language keyword assert.
What if I want to add an alternative test scenario “User login failed” (a very common one, almost a must) later?
The default browser, Firefox, is wrong.
driver = webdriver.Firefox()
I don’t need to show the stats to state that Chrome is the dominant browser. It does not take much intelligence to figure that out.
The test script is invalid!
driver.find_element(By.ID, "login").click()
There is no “login” button on the login page; rather, there is a “Sign in” button: <input type='submit' name='commit' value='Sign in'>

driver.find_element(By.LINK_TEXT, 'Logout').click()
There is no “Logout” link; rather, there is a “Sign off” link.

It also wrongly assumes a “Dashboard”.

In other words, ChatGPT does not analyze https://travel.agileway.net and do some magic AI work to generate dedicated, working test scripts. To me, it is totally useless. I am saying this as a real test automation engineer who has been developing thousands of requested (non-generic) automated test scripts for over 17 years.

“The reason that you and your team are hired is to develop a custom software solution. A generic AI bot won’t help you” — Zhimin Zhan

How about a more complex test scenario?

The above is a user login test, a test automation script in its simplest form. Yet ChatGPT got it wrong!

Let me pretend to be a naive ‘automated tester’ who wishes AI testing could do the work for me: write an automated test case for a business workflow scenario.

The test script ChatGPT generated:

from selenium import webdriver
from selenium.webdriver.common.by import By

def test_make_booking():
  # Set up the WebDriver instance and open the website
  driver = webdriver.Firefox()
  driver.get("https://whenwise.agileway.net/login")

  # Navigate to the login page and verify that the login form is displayed
  assert driver.find_element(By.ID, "email").is_displayed()
  assert driver.find_element(By.ID, "password").is_displayed()
  assert driver.find_element(By.ID, "login").is_displayed()

  # Enter the email address and password and click the login button
  driver.find_element(By.ID, "email").send_keys("james@client.com")
  driver.find_element(By.ID, "password").send_keys("password")
  driver.find_element(By.ID, "login").click()

  # Verify that the user is successfully logged in and redirected to the dashboard
  assert "dashboard" in driver.current_url

  # Navigate to the page for booking a driving school
  driver.get("https://whenwise.agileway.net/bookings/driving-school")

  # Select the desired date and time for the booking and fill out any other necessary form fields
  driver.find_element(By.ID, "date").send_keys("2022-12-30")
  driver.find_element(By.ID, "time").send_keys("09:00")
  driver.find_element(By.ID, "instructor").send_keys("John Smith")

  # Click the submit button to make the booking
  driver.find_element(By.ID, "submit").click()

  # Verify that the booking is successful and the user is redirected to a confirmation page
  assert "booking-confirmation" in driver.current_url

  # Log out of the account
  driver.find_element(By.LINK_TEXT, "Logout").click()

  # Close the web browser
  driver.

Of course, it is all wrong.

There is no “/login” page; it should be /sign-in.
There is no “#login” button; it is #login-btn.
The booking part has nothing to do with WhenWise.

Please note the syntax error in the last step, which should be driver.close().

Is the AI bot on strike? Or hitting the limit?!

Please note that this is still quite a simple and common scenario. In reality, business workflow scenarios are much more complex, such as submitting an insurance claim, which may involve hundreds of steps.

Argument 1: “It generates skeleton”. Yes, in the wrong way.

Yes, ChatGPT does generate a test skeleton. Some think that is useful, but I don’t. I will show you a better way to get a better and more flexible test skeleton soon. Before that, I will let ChatGPT do another exercise.

This time, the language is Java (as with my previous request, I did not specify one), an unsuitable, non-scripting language.

As you can see, besides the language differences, what ChatGPT provides is a non-working test skeleton. This, in fact, is bad. It violates a rule in test automation: a test skeleton should still be valid.

A much better way to generate a test skeleton.

When creating a new test project in TestWise IDE, a tester specifies the framework and the target server URL.

You get a project structure and a sample test in the selected frameworks (automation and syntax).

A tester can immediately run this test case (right-click any line in the test case and select ‘Run …’).

Test passed, and the browser opened the target site.

A Chrome browser will launch and open the target website. The tester then works through the test step by step, efficiently, in TestWise debugging mode (attached to the browser).

Some might argue, “It provides the test skeleton, but not test steps”. True, but the so-called ChatGPT-generated test steps are completely wrong: a generic version that has nothing to do with your app. In my coding and scripting experience, bad or invalid code makes things worse. Remember the “Broken Windows Theory” (yes, it is related to software development; I first read about it in the classic book ‘The Pragmatic Programmer’).

If you really want to inject design steps in the test script somehow, there is one way in TestWise. You can import test design steps from Jira.

Type a user story ID in the test case name, then click the last button on the left panel and choose “Use requirement as test case”.

TestWise will change the test case name to match your story and insert the test steps into the test script as comments. Unlike ChatGPT's non-working test script, the commented test steps (added by TestWise) are not executable, but at least they are not wrong!

Argument 2: “It generates test data”. Yes, again, there is a better way.

Some might acknowledge that “Yes, ChatGPT is relatively new, and is not yet trained to provide custom, working test scripts. But we can use it to help with test automation, such as test data generation”.

Here is a request (typed by my daughter).

Yes, it did provide five fake emails.
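For comparison, throwaway email addresses can also be produced by a few lines of script inside the test project itself. This is a minimal sketch using only Python's standard library, given purely as an illustration; the function name fake_emails, the reserved example.com domain and the address format are assumptions. A data library such as Faker is a common alternative.

```python
import random
import string


def fake_emails(count, domain="example.com", seed=None):
    """Return `count` unique, made-up email addresses, sorted alphabetically."""
    rng = random.Random(seed)  # pass a seed for repeatable test data
    emails = set()
    while len(emails) < count:
        local_part = "".join(rng.choices(string.ascii_lowercase, k=8))
        emails.add(f"{local_part}.{rng.randint(1, 999)}@{domain}")
    return sorted(emails)


for email in fake_emails(5, seed=2023):
    print(email)
```

Generating the data inside the test script keeps it repeatable (through the seed) and avoids copying values out of a chat window.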
