Use Advanced User Interactions in Selenium WebDriver to Drive Keyboard and Mouse

Perform Complex Mouse and Keyboard Operations in Selenium WebDriver

Nov 17, 2022

You can find more Selenium examples like this one in my eBook: Selenium WebDriver Recipes in Ruby. This is also included in my “How to in Selenium WebDriver” series.

The ActionBuilder in Selenium WebDriver provides a way to set up and perform complex user interactions: it groups a series of keyboard and mouse operations and sends them to the browser.

Mouse interactions

click
click_and_hold
context_click
double_click
drag_and_drop
drag_and_drop_by
move_by
move_to
release

Keyboard interactions

key_down
key_up
send_keys

These functions in Advanced User Interactions are self-explanatory. This article will show some examples of how to use them.

The usage

driver.action. + one or more above operations + .perform

See the ActionBuilder API for Ruby for more. The syntax in other languages is similar.
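As a minimal sketch of that pattern, the helper below queues two operations and ends with perform. The name hover_then_click and its parameters are hypothetical; it assumes driver is an existing Selenium::WebDriver session and that both elements were already located with find_element.

```ruby
# Hypothetical helper: hover over one element, then click another,
# as a single action chain.
# Assumes `driver` is a Selenium::WebDriver session created elsewhere.
def hover_then_click(driver, hover_elem, click_elem)
  driver.action                # start a new ActionBuilder
        .move_to(hover_elem)   # queue: move the mouse over the first element
        .click(click_elem)     # queue: move to the second element and click it
        .perform               # send the queued operations to the browser
end
```

Nothing reaches the browser until perform is called, so a chain that is missing it does nothing.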

Mouse Over

When the mouse moves over the email field, a tooltip (“must contains @”) shows up.

The HTML fragment

<input id="email" name="email" type="email" style="height:30px; width: 280px;" data-toggle="tooltip" data-placement="right" title="must contains @">

Test Script

driver.navigate.to(site_url + "/html5.html")
elem = driver.find_element(:id, "email")
driver.action.move_to(elem).perform
sleep 1
expect(page_text).to include("must contains")
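The fixed sleep 1 always costs a full second and may still be too short on a slow machine. One alternative, sketched here under the assumption that the tooltip text eventually appears in the page text, is to poll for the condition. The wait_until helper is hypothetical and written in plain Ruby; the Ruby binding also ships Selenium::WebDriver::Wait, which serves the same purpose.

```ruby
# Hypothetical polling helper: retry the block until it returns a truthy
# value, or raise once `timeout` seconds have passed.
def wait_until(timeout: 5, interval: 0.1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) > deadline
      raise "condition not met within #{timeout} seconds"
    end
    sleep interval
  end
end

# Possible usage in place of `sleep 1`
# (page_text is the helper already used in the test script):
#   wait_until { page_text.include?("must contains") }
```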

Double Click

When you double-click the text “Quick Fill”, the password field is filled automatically (by JavaScript).

The HTML fragment

<input type="password" name="password" id="pass">
<span id="quickfill" ondblclick="quick_fill()">Quick Fill(double click)</span>

<script>
function quick_fill() {
  $("#pass").val("ABC");
}
</script>

Test Script

driver.navigate.to(site_url + "/text_field.html")
quick_fill_elem = driver.find_element(:id, "quickfill")

# double click to fill
driver.action.double_click(quick_fill_elem).perform
sleep 0.2
expect(driver.find_element(:id, "pass")["value"]).to eq("ABC")

Click and Hold

Select the item boxes 6 to 8 in the following grid.

Screenshot selecting item boxes 6, 7 and 8 in one mouse hold. Screenshot from https://jqueryui.com/selectable/#display-grid website.

The HTML fragment

<ol id="selectable" class="ui-selectable">
 <li class="ui-state-default ui-selectee">1</li>
 <li class="ui-state-default ui-selectee">2</li>
 <li class="ui-state-default ui-selectee">3</li>
 ...
 <li class="ui-state-default ui-selectee">12</li>
</ol>

Test Script

driver.navigate.to("http://jqueryui.com/selectable")
sleep 1
driver.find_element(:link_text, "Display as grid").click
sleep 1
driver.switch_to.frame(0)
list_items = driver.find_elements(:xpath, "//ol[@id='selectable']/li")
sleep 0.5
driver.action.click_and_hold(list_items[5])
             .move_to(list_items[7]).release.perform
driver.switch_to.default_content
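Boxes 6 to 8 map to list_items[5] and list_items[7] because Ruby arrays are 0-based. A small wrapper can make that mapping explicit. This is only a sketch; the name drag_select and its parameters are hypothetical, and driver is assumed to be the session from the script above.

```ruby
# Hypothetical helper: drag-select a run of boxes by their visible numbers.
# `items` is the array returned by find_elements; arrays are 0-based,
# so box N lives at index N - 1.
def drag_select(driver, items, first_box, last_box)
  driver.action
        .click_and_hold(items[first_box - 1]) # press on the first box
        .move_to(items[last_box - 1])         # drag to the last box
        .release                              # let go of the mouse button
        .perform
end

# Possible usage for the grid above:
#   drag_select(driver, list_items, 6, 8)
```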

Drag and Drop

Drag Item 1 to the Trash block.

The HTML fragment

<div id="trash" class="ui-droppable over">
    <span>Trash</span>
</div>
<div id="items">
    <div class="item ui-draggable ui-draggable-handle" id="item_1" style="position: relative;">
        <span>Item 1</span>
    </div>
    <div class="item ui-draggable ui-draggable-handle" id="item_2" style="position: relative;">
        <span>Item 2</span>
    </div>
    <div class="item ui-draggable ui-draggable-handle" id="item_3" style="position: relative;">
        <span>Item 3</span>
    </div>
</div>

Test Script

driver.navigate.to(site_url + "/drag_n_drop.html")
drag_from = driver.find_element(:id, "item_1")
target = driver.find_element(:id, "trash")
driver.action.drag_and_drop(drag_from, target).perform

In some cases, click_and_hold + move_to can be an alternative to drag_and_drop. The following also works for the HTML above.

driver.action.click_and_hold(drag_from).move_to(target).perform
driver.action.release.perform

For some complex pages (with JS), the above drag_and_drop might not work. One issue I encountered was dragging to the current mouse position. There is a workaround: run a JavaScript fragment, such as drag-mock, to do the “drag and drop” via driver.execute_script in the Selenium test script.

jsDragDropSnippet = "!function t(e,r,n){....}"
driver.execute_script("eval(arguments[0]);", jsDragDropSnippet)
dragMockExists = driver.execute_script("return !!window.dragMock;")
raise "Unable to add the drag mock" unless dragMockExists
dragElement = '"' + "//div[.='Item 1']" + '"'
dropElement = '"' + "//div[.='Item 3']" + '"'
js_script = "var startEle = document.evaluate(" + dragElement + ", document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue; var endEle = document.evaluate(" + dropElement + ", document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;var wait = 150; window.dragMock.dragStart(startEle).delay(wait).dragEnter(endEle).delay(wait).dragOver(endEle).delay(wait).drop(endEle).then(arguments[arguments.length - 1]);"
driver.execute_script(js_script)
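Wrapping the XPath in quotes by hand breaks as soon as the expression itself contains a double quote. One way around this, sketched below, is to let String#to_json produce the JavaScript string literal. The helper name js_node_for is hypothetical, and the sketch covers only the element lookup, not the drag-mock calls.

```ruby
require "json"

# Hypothetical helper: return a JavaScript expression that looks up the
# first node matching `xpath`. String#to_json yields a double-quoted,
# escaped literal, so quotes inside the XPath cannot break the script.
def js_node_for(xpath)
  "document.evaluate(#{xpath.to_json}, document, null, " \
    "XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue"
end

start_js = js_node_for("//div[.='Item 1']")
end_js   = js_node_for("//div[.='Item 3']")

# The two lookups can then be spliced into the rest of the script.
js_lookup = "var startEle = #{start_js}; var endEle = #{end_js};"
```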

Right-Click

Right-clicking will bring up the context menu, where additional actions can be performed.

The HTML fragment

<input type="password" name="password" id="pass">

Test Script

elem = driver.find_element(:id, "pass")
driver.action.context_click(elem).perform

This will show the popup context menu.

As you can see, the context menu is rendered by the native OS, i.e., not by the browser. In early versions of GeckoDriver, I could use send_keys(:down) to navigate to a menu item in the Firefox context menu, but this no longer works.

Move Mouse by Offset

You can move the mouse to a specific coordinate relative to a web element by providing an offset coordinate.

Below is an image with the linked area defined.

Clicking within the WhenWise area of the image goes to the WhenWise site.

The HTML fragment

<img src="images/agileway_software.png" border="0" width="400" id="agileway_software" usemap="#agileway_software_map">
<map name="agileway_software_map" id="agileway_software_map">
    <area shape="rect" coords="13,16,120,42" href="https://agileway.com.au/testwise" alt="testwise" title="">
    <area shape="rect" coords="13,73,127,100" href="https://agileway.com.au/buildwise" alt="buildwise" title="">
    <area shape="circle" coords="220,30,27" href="https://whenwise.com" alt="whenwise">
</map>

Test Script

elem = driver.find_element(:id, "agileway_software")
driver.action.move_to(elem, 210, 30).click.perform
expect(driver.title).to eq("WhenWise - Booking Made Easy")

The coordinates are relative to the element.
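Rather than hard-coding a point, the target can be derived from the area's coords attribute. The sketch below uses two hypothetical helpers, area_center and offset_from_center. Whether move_to measures its offset from the element's top-left corner or from its center has differed between Selenium versions, so treat the second helper as an assumption to check against the documentation of the binding you use. The height of 100 in the usage comment is made up for illustration; the real size could be read from elem.size.

```ruby
# Hypothetical helper: center point of an <area> from its shape and coords,
# in pixels measured from the image's top-left corner.
def area_center(shape, coords)
  nums = coords.split(",").map { |n| Integer(n.strip) }
  case shape
  when "rect"   # coords are left, top, right, bottom
    left, top, right, bottom = nums
    [(left + right) / 2, (top + bottom) / 2]
  when "circle" # coords are center_x, center_y, radius
    nums.first(2)
  else
    raise ArgumentError, "unsupported shape: #{shape}"
  end
end

# Hypothetical helper: convert a top-left-based point into an offset from
# the element's center, for bindings whose offsets start at the center.
def offset_from_center(point, width, height)
  [point[0] - width / 2, point[1] - height / 2]
end

# Possible usage for the WhenWise circle (the height of 100 is an assumption):
#   x, y = area_center("circle", "220,30,27")
#   dx, dy = offset_from_center([x, y], 400, 100)
```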

Send Key Sequences

HTML Fragment

<textarea id="comments" name="comments"></textarea>

Test Script

Select the text (all) in a text area and delete it.

driver.find_element(:id, "comments").send_keys("Multi\r\n Line\r\n Comment")
elem = driver.find_element(:id, "comments")

# using :command on macOS
ctrl_key = RUBY_PLATFORM.include?("darwin") ? :command : :control
driver.action.click(elem)
      .key_down(ctrl_key)
      .send_keys("a")
      .key_up(ctrl_key)
      .send_keys(:backspace)
      .perform

The above performs Command + A on macOS (Control + A otherwise), followed by Backspace. It is equivalent to:

elem.send_keys([ctrl_key, "a"])
elem.send_keys(:backspace)

As you can see, Selenium’s Advanced User Interaction API provides more control.

Note that click(elem) is the first action. This is because actions are sent directly to the browser, not to a web element (unlike elem.send_keys(...)), so you must focus the element first. The easiest way to do this is to click it.
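The focus-then-type sequence can be wrapped up for reuse. The two helpers below are hypothetical names, offered as a sketch; driver is assumed to be an existing session. Note that RUBY_PLATFORM describes the machine running the script, which may differ from the browser's machine when tests run against a remote grid, so the modifier can also be passed in explicitly.

```ruby
# Hypothetical helper: pick the "select all" modifier for a platform string.
def select_all_modifier(platform = RUBY_PLATFORM)
  platform.include?("darwin") ? :command : :control
end

# Hypothetical helper: focus the element, select all of its text, delete it.
def clear_with_keyboard(driver, elem, modifier: select_all_modifier)
  driver.action
        .click(elem)            # focus first: actions target the browser, not the element
        .key_down(modifier)     # hold Command or Control
        .send_keys("a")         # select all
        .key_up(modifier)       # release the modifier
        .send_keys(:backspace)  # delete the selection
        .perform
end

# Possible usage:
#   clear_with_keyboard(driver, driver.find_element(:id, "comments"))
```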

Related reading:

My eBooks:
- “Practical Web Test Automation with Selenium WebDriver”
- “Practical Continuous Testing: make Agile/DevOps real”

A guest post by

Courtney Zhan

Software Engineer at Amazon. I'm interested in test automation.

The Agile Way
