Automated Testing PDF Download in Selenium WebDriver
How to test downloading PDFs in Selenium WebDriver
This is also included in the “How to in Selenium WebDriver” series. You can find more Selenium examples like this one in this eBook: Selenium WebDriver Recipes in Ruby.
Many websites feature links that download a PDF document. These PDF files might be static (e.g. a restaurant menu) or dynamically generated (e.g. a receipt).
This tutorial will show you how to download a PDF document and verify its contents in a Selenium WebDriver automated test script.
Table of Contents:
· Test Design
∘ Saving the download file to a specific location
∘ PDF verification library
· Open browser with specified download folder
· Download and Verify the downloaded file
· Verify the PDF
∘ Verify PDF page count
∘ Verify PDF contents
· Completed Test Script
Test Design
1. Navigate to a web page and download the PDF
For my example, I’m downloading a book sample PDF (static) at http://zhimin.com/books/pwta.
2. Verify the downloaded PDF exists
Once the browser has downloaded the file, check that it was saved successfully on the machine.
3. Read and verify the PDF’s contents
It’s good practice to verify the PDF’s content, just to make sure.
Saving the download file to a specific location
By default, the browser saves downloaded files into a standard folder, such as /Users/Me/Downloads on macOS. That location is different on Windows, and the account running the test automation may not have permission to write there. To test safely and avoid conflicts, we should specify the download folder in the automated test scripts.
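A dedicated download folder is easiest to work with when it exists and is empty before each run. Below is a minimal sketch of that idea; prepare_download_dir is a hypothetical helper name (not part of Selenium), and it assumes the folder is used only for test downloads, because it removes everything inside it.

```ruby
require "fileutils"

# Hypothetical helper: make sure the download folder exists and is empty.
# Assumption: `path` is dedicated to test downloads, since its contents are deleted.
def prepare_download_dir(path)
  full_path = File.expand_path(path)
  FileUtils.mkdir_p(full_path)
  # Remove files left over from earlier test runs.
  Dir.children(full_path).each do |name|
    FileUtils.rm_rf(File.join(full_path, name))
  end
  full_path
end
```

The returned absolute path can then be used as the download folder in the browser options.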
PDF verification library
I use the ‘pdf-reader’ gem, a PDF parser library, to verify the PDF. Install it from the command line:
gem install pdf-reader
Open a browser with the specified download folder
To set a download location in Selenium WebDriver, we can set it in the browser (Chrome) options.
before(:all) do
  # set up download settings
  @download_path = "/Users/zhimin/tmp"
  options = Selenium::WebDriver::Chrome::Options.new
  options.add_preference("download.prompt_for_download", false)
  options.add_preference("download.default_directory", @download_path)
  @driver = Selenium::WebDriver.for(:chrome, :capabilities => options)
  driver.get(site_url)
end
Setting prompt_for_download to false means that you won’t receive a pop-up asking what to name the file and where to save it. The default_directory is the download location we want Chrome to save into.
Run the test script to start a Chrome browser, then do a quick check for downloaded files.
Download and Verify the downloaded file
To download the PDF, click the web app's download link. In the test script, I added some delays to allow time for the file download to complete.
driver.find_element(:link_text, "Download").click
sleep 10
The 10 seconds is an upper limit, chosen to allow for a slow connection.
I don’t recommend long fixed waits; poll and retry instead. See this article: Test AJAX Properly and Efficiently with Selenium WebDriver, and Avoid ‘Automated Waiting’.
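As a rough illustration of polling instead of a fixed sleep, here is a minimal sketch. wait_for_download is a hypothetical helper name, and it assumes that Chrome keeps an in-progress download in a temporary file ending in .crdownload in the same folder until the download finishes.

```ruby
# Hypothetical helper: poll until `path` exists and no partial Chrome
# downloads (*.crdownload, an assumption about Chrome's temp files) remain.
# Returns `path` on success, raises after `timeout` seconds.
def wait_for_download(path, timeout: 10, interval: 0.5)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    partials = Dir.glob("*.crdownload", base: File.dirname(path))
    return path if File.exist?(path) && partials.empty?
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise "Download not finished after #{timeout}s: #{path}"
    end
    sleep interval
  end
end
```

In a test, a call such as wait_for_download(saved_file, timeout: 10) could take the place of the fixed sleep: it returns as soon as the file is ready and only waits the full 10 seconds when something is wrong.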
After that, we want to verify that the PDF is there. This is easily done with Ruby’s File.exist?(file_path) method. (The older File.exists? alias was removed in Ruby 3.2.)
expect(File.exist?("#{@download_path}/sample.pdf")).to be_truthy
Note that here expect(...).to be_truthy behaves the same as expect(...).to eq(true), because the file check returns either true or false. I find be_truthy more readable than eq(true).
Run the test script; it should pass.
it "Download PWTA sample" do
  visit("/books/pwta")
  saved_file = "#{@download_path}/practical-web-test-automation-sample.pdf"
  # remove any copy left over from a previous run
  FileUtils.rm(saved_file) if File.exist?(saved_file)
  driver.find_element(:link_text, "Download").click
  sleep 10
  expect(File.exist?(saved_file)).to be_truthy
end
Note that we delete any existing copy of the destination file before downloading, so the check cannot pass because of a file left over from an earlier run. This is a common pattern.
FileUtils.rm(saved_file) if File.exist?(saved_file)
Verify PDF
We aren’t entirely done yet. How can we be sure the PDF we downloaded is valid (openable) and correct (contents-wise)?
Here I will use the PDF reader gem pdf-reader to extract the text contents and verify them.
First, let’s load the PDF file.
reader = PDF::Reader.new("#{@download_path}/practical-web-test-automation-sample.pdf")
Verify PDF page count
Use pdf-reader’s page_count.
expect(reader.page_count).to eq(62)
Verify PDF contents
We can extract the text version of the PDF document using pdf-reader, which works by treating each page separately. This means that you will need to loop through all the pages to read the whole PDF or use indexing to go to a particular page.
For this sample PDF, the first page is the cover image. I will verify the second page.
second_page_text = reader.pages[1].text
expect(second_page_text).to include("Test web applications wisely with Selenium WebDriver")
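When the page that holds a phrase is not known in advance, looping through all pages is an option. The sketch below uses a hypothetical helper, pages_containing, and assumes reader is a PDF::Reader instance (anything whose pages respond to text will work). Extracted text may contain line breaks in unexpected places, so short phrases tend to match more reliably than long ones.

```ruby
# Hypothetical helper: return the 1-based page numbers whose extracted
# text contains `phrase`. `reader` is assumed to be a PDF::Reader instance.
def pages_containing(reader, phrase)
  matches = []
  reader.pages.each_with_index do |page, index|
    matches << index + 1 if page.text.include?(phrase)
  end
  matches
end
```

In the test script, this could be used as expect(pages_containing(reader, "Selenium WebDriver")).not_to be_empty.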
Apart from the text content, pdf-reader can also extract PDF metadata, page orientation and raw-content streams, which may be helpful in assertions.
Completed Test Script
load File.dirname(__FILE__) + "/../test_helper.rb"
require "fileutils"
require "pdf-reader"

describe "PDF Download and Verification" do
  include TestHelper

  before(:all) do
    # tell Chrome to save downloads into a known folder without prompting
    @download_path = "/Users/me/tmp"
    options = Selenium::WebDriver::Chrome::Options.new
    options.add_preference("download.prompt_for_download", false)
    options.add_preference("download.default_directory", @download_path)
    @driver = Selenium::WebDriver.for(:chrome, :capabilities => options)
    driver.get(site_url)
  end

  after(:all) do
    driver.quit unless debugging?
  end

  it "Download Practical Web Test Automation sample" do
    visit("/books/pwta")
    saved_file = "#{@download_path}/practical-web-test-automation-sample.pdf"
    # remove any copy left over from a previous run
    FileUtils.rm(saved_file) if File.exist?(saved_file)
    driver.find_element(:link_text, "Download").click
    sleep 5
    expect(File.exist?(saved_file)).to be_truthy

    # verify the page count and the text of the second page
    reader = PDF::Reader.new(saved_file)
    puts reader.info
    expect(reader.page_count).to eq(62)
    second_page_text = reader.pages[1].text
    puts second_page_text
    expect(second_page_text).to include("Test web applications wisely with Selenium WebDriver")
  end
end
Notes
I recommend making the download directory relative to the test script, as in the snippet below. For simplicity, I used an absolute path above, but that can cause problems when the test is shared or run on a different machine.
@download_path = File.expand_path File.join(File.dirname(__FILE__), "..", "tmp", "download")
How about Firefox?
The above Selenium settings are specific to Chrome. The same concept applies to other browsers as well. Below is the equivalent for Firefox.
download_path = "/Users/me/tmp"
profile = Selenium::WebDriver::Firefox::Profile.new
profile["browser.download.folderList"] = 2
profile["browser.download.dir"] = download_path
profile["browser.helperApps.neverAsk.saveToDisk"] = 'application/pdf'
# disable Firefox's built-in PDF viewer
profile["pdfjs.disabled"] = true
options = Selenium::WebDriver::Firefox::Options.new
options.profile = profile
driver = Selenium::WebDriver.for(:firefox, :options => options)