Case Study: Wait for File Download to Complete Safely in Selenium

How to increase reliability in an automated test for a file download.

Courtney Zhan and Zhimin Zhan

Jan 28, 2023

This article will show you how to verify a file download completes successfully in Selenium WebDriver.

Table of Contents:
∘ Test Design
∘ Fixed Wait
∘ Check the Downloaded File’s Size
∘ Wait for the Browser Download to Complete
∘ Complete Test Script
∘ Zhimin’s Notes

The test site for this article is http://zhimin.com/books/pwta. There is a sample PDF download that I will use for the tests.

Test Design

The test design is quite straightforward.

1. Click the download button
2. Wait…
3. Verify the file contents

Step 2 has an unknown factor: how long should the test script wait, and how do we know when the file has finished downloading?

There are three approaches:

1. Fixed wait
2. Check the downloaded file’s size
3. Wait for the browser download to complete ✅

Fixed Wait

This approach is very easy to understand. Just hardcode the wait time.

it "Fixed wait time" do
  driver.get("http://zhimin.com/books/pwta")
  driver.find_element(:link_text, "Download").click
  sleep 10 # wait 10 seconds
  expect(File.exist?("/Users/me/Downloads/practical-web-test-automation-sample.pdf")).to be true
end

The script above is not safe as written. We also need to:
- set a custom file download path (for Chrome)
- delete the file if it already exists, before clicking ‘Download’.
This has been covered in another article, Automated Testing PDF Download in Selenium WebDriver.
This article focuses only on the waiting part.
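
The two preparation steps above can be sketched roughly as follows. This is not taken from the referenced article; it is a minimal illustration. The helper names `chrome_download_prefs` and `remove_stale_download` and the `/tmp/downloads` directory are made up for this example, while `download.default_directory` and `download.prompt_for_download` are Chrome preference keys.

```ruby
require "fileutils"

# Hypothetical helper: Chrome preferences that send downloads to `dir`
# without showing a "Save as" prompt.
def chrome_download_prefs(dir)
  {
    "download.default_directory" => File.expand_path(dir),
    "download.prompt_for_download" => false
  }
end

# Hypothetical helper: remove a leftover file from a previous run, so that a
# later File.exist? check can only pass if this run produced the file.
# FileUtils.rm_f does nothing when the file is absent.
def remove_stale_download(path)
  FileUtils.rm_f(path)
end

# Possible usage with selenium-webdriver (not executed here):
#   options = Selenium::WebDriver::Chrome::Options.new
#   chrome_download_prefs("/tmp/downloads").each { |key, value| options.add_preference(key, value) }
#   driver = Selenium::WebDriver.for(:chrome, options: options)
#   remove_stale_download("/tmp/downloads/practical-web-test-automation-sample.pdf")
```

Keeping the download directory in one place means the assertion path and the browser setting cannot drift apart.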

The obvious drawback of this simple approach is that download speeds are inconsistent. Tuning the wait time is difficult: too short and the test fails intermittently, too long and every run wastes time.

Zhimin: Getting an automated test script to work 80% or 90% of the time is quite easy. However, the goal of test automation is Continuous Testing, running all tests daily. This means test automation engineers need to make the test scripts 99+% reliable.
Check out this article, Working Automated Test ≠ Good Reliable Test.

Check the Downloaded File’s Size

An improvement on the fixed wait is to ensure the file is fully downloaded by checking its size with File.size(file_path). The test can check the file periodically against the expected file size (known in advance).
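
A minimal sketch of such a periodic check, using only Ruby's standard library. The helper name `wait_for_file_size` and its `timeout` and `interval` defaults are assumptions made for illustration.

```ruby
# Hypothetical helper: poll until the file at `path` is exactly
# `expected_size` bytes. Returns true on success, or false if that has not
# happened within `timeout` seconds.
def wait_for_file_size(path, expected_size, timeout: 30, interval: 0.5)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if File.exist?(path) && File.size(path) == expected_size
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep interval
  end
end

# Possible usage inside a test (the size value here is a placeholder):
#   expect(wait_for_file_size(file_path, expected_size_in_bytes, timeout: 60)).to be true
```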

However, with Chrome (I am not sure since which version), the download-in-progress is saved as a randomly named temporary file that Chrome creates during the download, so the test does not know which file to measure.

This approach may work for other browsers, such as Firefox. Either way, I don’t recommend it, because there is a better way.

Wait for the Browser Download to Complete

In this method, we use the browser’s Downloads page to check the download's progress.

On the Downloads page, the downloaded items (downloadsList) are all under the <downloads-manager>’s shadow root.

For automating elements in Shadow DOM (i.e. Shadow root), check out my other article, Automating Shadow DOM with Selenium WebDriver.

To get inside the shadow root, use JavaScript’s .shadowRoot.

// return list of all downloaded items from the shadow root's downloadsList
return document.querySelector('downloads-manager')
               .shadowRoot
               .getElementById('downloadsList')
               .items;

The above block can be executed with Selenium’s driver.execute_script. With this, we can start to write our test script:

driver.get("http://zhimin.com/books/pwta")
driver.find_element(:link_text, "Download").click
sleep 0.2 # give time for download to begin

driver.get("chrome://downloads")
# get download list under shadowRoot using JS (see above)
items = driver.execute_script("return document.querySelector('downloads-manager').shadowRoot.getElementById('downloadsList').items;")
expect(items[0]["fileName"]).to eq("practical-web-test-automation-sample.pdf")

Keen readers may notice that I still added a fixed wait after clicking the ‘Download’ button. It takes time for the download to begin, for Chrome to do a security scan, and so on, so I added a small wait.

The downloaded items show up in reverse chronological order, so the latest download comes first, i.e. items[0] will be the sample book’s PDF.

Next, we want to know the download status (IN_PROGRESS, COMPLETE or CANCELLED?). Each downloaded item has an attribute, state, that we can use. To poll if the download is finished, simply loop the JavaScript block until the state is not IN_PROGRESS.

while items[0]["state"] == "IN_PROGRESS"
  sleep 1
  items = driver.execute_script("return document.querySelector('downloads-manager').shadowRoot.getElementById('downloadsList').items;")
end

Finally, assert that the download was completed successfully and the file exists!

expect(items[0]["state"]).to eq("COMPLETE")
expected_download_file_path = File.expand_path "~/Downloads/practical-web-test-automation-sample.pdf"
expect(File.exist?(expected_download_file_path)).to be true
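
Step 3 of the test design is to verify the file contents, and an existence check alone does not do that. As a lightweight first check (a sketch only, not a full PDF validation), the test could confirm that the file starts with the PDF signature `%PDF-`. The helper name `looks_like_pdf?` is made up for this example.

```ruby
# Hypothetical helper: true if `path` is a regular file whose first five
# bytes are the PDF signature "%PDF-". An empty or missing file returns false.
def looks_like_pdf?(path)
  return false unless File.file?(path)
  File.binread(path, 5) == "%PDF-"
end

# Possible usage inside the test:
#   expect(looks_like_pdf?(expected_download_file_path)).to be true
```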

Complete Test Script

it "Wait Browser Download To Complete" do
  expected_download_file_path = File.expand_path "~/Downloads/practical-web-test-automation-sample.pdf"
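  # File.exists? was removed in Ruby 3.2, so use File.exist?
  File.delete(expected_download_file_path) if File.exist?(expected_download_file_path)

  # open the test site, then click the Download link
  driver.get("http://zhimin.com/books/pwta")
  driver.find_element(:link_text, "Download").click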
  sleep 0.2 # give time for download to begin

  driver.get("chrome://downloads")
  items = driver.execute_script("return document.querySelector('downloads-manager').shadowRoot.getElementById('downloadsList').items;")
  expect(items[0]["fileName"]).to eq("practical-web-test-automation-sample.pdf")

  while items[0]["state"] == "IN_PROGRESS"
    sleep 1
    items = driver.execute_script("return document.querySelector('downloads-manager').shadowRoot.getElementById('downloadsList').items;")
  end

  expect(items[0]["state"]).to eq("COMPLETE")
  expect(File.exist?(expected_download_file_path)).to be true
end

Zhimin’s Notes

The above version is still not Continuous-Testing-level ready. Courtney did not handle a stuck or extremely slow download. When that happens, the test may run for several minutes. This is a rare condition, but we much prefer that the test fails early. (In case you wonder, the auto-rerun and manual-rerun features of a CT server, such as BuildWise, can handle this.)
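
One possible way to make the polling loop fail early is to give it a deadline. The sketch below is an illustration under assumptions, not the authors' solution: the helper name `wait_for_download` and its `timeout` and `interval` defaults are hypothetical. It takes a block that fetches the downloads list, so the same `driver.execute_script` call from the test script can be passed in.

```ruby
# Hypothetical helper: call the block repeatedly to fetch the downloads list
# and return the newest item once it has left the IN_PROGRESS state.
# Raises if that has not happened within `timeout` seconds. An empty list
# (download not yet started) is treated the same as "still in progress".
def wait_for_download(timeout: 60, interval: 1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    items = yield
    newest = items && items.first
    return newest if newest && newest["state"] != "IN_PROGRESS"
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise "Download did not finish within #{timeout} seconds"
    end
    sleep interval
  end
end

# Possible usage inside the test, after driver.get("chrome://downloads"):
#   item = wait_for_download(timeout: 60) do
#     driver.execute_script("return document.querySelector('downloads-manager').shadowRoot.getElementById('downloadsList').items;")
#   end
#   expect(item["state"]).to eq("COMPLETE")
```

With a deadline in place, a stuck download fails the test after a bounded time instead of looping until someone notices.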
