Case Study: Use Selenium To Extract A List of Published Medium Article Titles
A helpful utility for high-volume Medium bloggers
Medium authors know that only 20 to 25 stories (under ‘Drafts’ and ‘Published’) are displayed on the page at first. When you scroll down to the bottom, another 20 or so stories will load. This is known as lazy loading (or progressive loading).
For high-volume Medium bloggers, who have hundreds of stories under ‘Published’ and ‘Drafts’, finding a specific article is quite time-consuming. My father asked me to write a simple Selenium utility to help.
I already wrote an article, “Automated Testing Elements on a Lazy Load Page with Selenium WebDriver”, to show how to automate lazy loading pages in Selenium WebDriver. Let’s put it into practice.
Test Design
1. Log in to the Medium account, passing authentication manually.
It is not a good idea to store user names and passwords in your scripts. For this utility, I can use TestWise’s attaching-session feature and handle authentication manually.
2. Keep scrolling down until reaching the end.
For this, scroll a big “enough” number of times. It isn’t necessary to guarantee the end is reached. This will take a while, so treat yourself to a cup of coffee while it is running.
3. Extract story titles.
Easy with Selenium WebDriver.
4. Save all story titles into a text file.
Easy with Ruby.
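As a rough illustration of the scroll, extract, and save parts of this design, here is a minimal sketch in Ruby. It is not the author's actual script: the helper names, the scroll count, the pause length, and the "h2" CSS selector are all assumptions made for illustration, and the real selector has to be found by inspecting the Medium page. The `driver` argument is expected to be a Selenium WebDriver session that is already logged in.

```ruby
# Hypothetical helpers (names and defaults are illustrative assumptions).
# `driver` is a Selenium::WebDriver session that is already logged in.

# Scroll to the bottom a fixed number of times so that lazy loading can
# fetch more stories. The count and pause are guesses, not measured values.
def scroll_down_repeatedly(driver, times: 30, pause_seconds: 2)
  times.times do
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    sleep pause_seconds
  end
end

# Collect the visible text of every element matching the CSS selector,
# dropping blank entries. The default "h2" is an assumption.
def extract_story_titles(driver, css: "h2")
  driver.find_elements(:css, css).map { |elem| elem.text.strip }.reject(&:empty?)
end

# Write one title per line to a plain-text file.
def save_titles(titles, path)
  File.write(path, titles.join("\n") + "\n")
end
```

With a live session, the three helpers would be called in order: `scroll_down_repeatedly(driver)`, then `save_titles(extract_story_titles(driver), "published_titles.txt")`, where the file name is again just an example. A fixed scroll count matches the "big enough number" idea above and avoids having to detect the true end of the list.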
Steps
1. Preparation
Create a Selenium-RSpec project in TestWise.
2. Run the empty test case
The purpose of doing this is to get a Chrome browser session that TestWise can attach to.
Please note: right-click a line within the test case and select the first Run “…” option. This is called “individual test execution mode” in TestWise.
You should see a Chrome browser session start, opening the Medium home page.
3. Run the empty test case
Type driver.find_element(:link_text, "Sign in").click in the test case, select that line, then right-click and choose the Run Selected Script Against Current Browser option.
This enters TestWise’s “debugging mode”.