Using Watir to Automate Web Browsers with Ruby

As a developer or QA engineer, you likely find yourself needing to perform repetitive actions in a web browser as part of your testing or data collection processes. Manually clicking through the same pages and entering the same data over and over is tedious and inefficient. That's where browser automation comes in. By writing scripts to automatically control a web browser, you can offload this manual work and focus on more important tasks.

One of the premier tools for automating browsers using the Ruby programming language is Watir. Watir, which stands for "Web Application Testing in Ruby", is a set of open source Ruby libraries that enable you to programmatically drive browsers like a real user would – clicking links, filling out forms, extracting data, and more. It provides a clean, easy-to-use API for automating browser interactions.

In this guide, we'll walk through how to set up Watir and use it to automate common web browsing tasks. By the end, you'll be able to supercharge your workflow by letting bots do your browsing for you! Let's get started.

Getting Started with Watir

Before we can start automating browsers, we need to make sure we have all the necessary tools installed. Watir has two key dependencies:

  1. Ruby – Watir is a Ruby library, so you'll need to have Ruby installed on your system. Check if it's already installed by running ruby -v in your terminal. If you don't have it, you can download an installer from the official Ruby website.

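  2. WebDriver – Watir uses the WebDriver interface to communicate with browsers. Each browser has its own driver executable (e.g. chromedriver for Chrome, geckodriver for Firefox). The simplest way to install these is via the webdrivers gem. Recent Selenium releases (4.11 and later) can also fetch drivers on their own through Selenium Manager, so the gem matters mainly on older setups.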

Assuming you have Ruby set up, install Watir and webdrivers with:

gem install watir
gem install webdrivers

With those gems installed, you're ready to start automating! Watir works with all the major browsers including:

  • Chrome
  • Firefox
  • Internet Explorer
  • Safari
  • Edge (Chromium-based)

The examples in this tutorial will use Chrome, but the code will work with any of the supported browsers.
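
Before moving on, it can be worth confirming that both gems are visible to the Ruby you are actually running. The sketch below uses only the RubyGems API that ships with Ruby; the helper name installed_gem_versions is made up for this example.

```ruby
# Report the newest installed version of each gem, or nil if it is missing.
# `installed_gem_versions` is a hypothetical helper name, not part of Watir.
def installed_gem_versions(names)
  names.to_h do |name|
    newest = Gem::Specification.find_all_by_name(name).max_by(&:version)
    [name, newest&.version&.to_s]
  end
end

installed_gem_versions(%w[watir webdrivers]).each do |name, version|
  puts version ? "#{name} #{version} is installed" : "#{name} is missing"
end
```

If either gem is reported as missing, re-run the gem install commands with the same Ruby version you use to run your scripts.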

Watir Fundamentals

Watir mirrors the way a human interacts with a browser. Before performing actions on a webpage, you first need a browser instance to work with. In Watir, this is handled by the Browser class:

require 'watir'
browser = Watir::Browser.new

By default, this launches a new Chrome browser instance. To use a different browser, pass it as an argument when instantiating the Browser:

browser = Watir::Browser.new :firefox
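
If the same script should run against different browsers, one option is to read the browser name from the environment instead of hard-coding it. This is a sketch, not part of Watir: browser_symbol and the BROWSER variable are names chosen for illustration, and the symbol list is an assumption based on the browsers listed earlier.

```ruby
# Symbols assumed to be accepted by Watir::Browser.new for the browsers
# named in this guide.
SUPPORTED_BROWSERS = %i[chrome firefox ie safari edge].freeze

# Turn a user-supplied name (for example ENV['BROWSER']) into the symbol
# passed to Watir::Browser.new, falling back to Chrome when nothing is set.
def browser_symbol(name, default: :chrome)
  return default if name.nil? || name.strip.empty?

  symbol = name.strip.downcase.to_sym
  return symbol if SUPPORTED_BROWSERS.include?(symbol)

  raise ArgumentError, "Unsupported browser: #{name.inspect}"
end

# Usage (requires the watir gem and a matching driver):
#   require 'watir'
#   browser = Watir::Browser.new(browser_symbol(ENV['BROWSER']))
```

Running the script as BROWSER=firefox ruby script.rb would then switch browsers without touching the code.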

Now that we have a browser, we need to navigate to a webpage to do something useful. You can point the browser to a URL using the goto method:

browser.goto 'http://google.com'

The real power of Watir lies in its ability to locate and interact with elements on the page. Watir provides a variety of methods for finding elements based on their HTML attributes and text content. The most commonly used are:

  • text_field – locate a text input field
  • button – locate a button
  • link – locate a hyperlink
  • select_list – locate a dropdown list

For example, to find the search box on the Google homepage, we could use:

search_bar = browser.text_field(name: 'q')

This finds the first text field with a "name" attribute equal to 'q'. We can then interact with the element like we would if controlling the browser by hand. To enter a search query, use the set method:

search_bar.set 'watir automation'

To submit the search, we need to locate the form submission button and click it:

submit_button = browser.button(name: 'btnK')
submit_button.click

After the search results load, we could extract data from the page using Watir‘s methods for accessing element properties:

results = browser.divs(class: 'g')
results.each do |result|
  puts result.link.text
  puts result.link.href
end

This code finds all the div elements with a class of 'g' (which Google uses for search result blocks), and prints the text and URL of the link within each.
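
Printing is fine for a quick look, but for data collection you will usually want the results in a structure you can save. The sketch below gathers the same link text and URL into an array of hashes; collect_links is a hypothetical helper, and it accepts anything whose items respond to link, such as the collection returned by browser.divs.

```ruby
require 'json'

# Build an array of { title:, url: } hashes from a collection of elements
# that each contain a link. Entries whose link text is blank are skipped.
def collect_links(containers)
  rows = containers.map do |container|
    link = container.link
    { title: link.text.strip, url: link.href }
  end
  rows.reject { |row| row[:title].empty? }
end

# Usage with a live browser (not run here):
#   rows = collect_links(browser.divs(class: 'g'))
#   File.write('results.json', JSON.pretty_generate(rows))
```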

An important concept to understand when writing Watir scripts is synchronization – making sure elements are ready to be interacted with before your script attempts to access them. Watir has built-in waiting behavior and will automatically wait for an element to be present before interacting with it, but sometimes you need more granular control. Watir provides explicit wait methods like wait_until to handle this:

browser.link(text: 'Slow Loading Link').wait_until(timeout: 10, &:present?).click

This code will wait up to 10 seconds for a link with the text 'Slow Loading Link' to be present on the page before attempting to click it. If the element doesn't appear within that time, a Watir::Wait::TimeoutError will be raised.
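
Conceptually, an explicit wait is a polling loop: check a condition, pause briefly, and give up once a deadline passes. The sketch below illustrates that idea in plain Ruby. It is not Watir's implementation, and poll_until and WaitTimeout are names invented for this example.

```ruby
# A simplified stand-in for what an explicit wait does: re-check a condition
# until it is truthy or the timeout elapses. Illustration only.
class WaitTimeout < StandardError; end

def poll_until(timeout: 10, interval: 0.1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result

    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise WaitTimeout, "condition not met within #{timeout} seconds"
    end

    sleep interval
  end
end

# Example: the condition becomes true on the third check.
checks = 0
poll_until(timeout: 2, interval: 0.01) { (checks += 1) >= 3 }
puts "condition met after #{checks} checks"
```

Because the loop returns as soon as the condition holds, a generous timeout costs nothing when the page is fast, which is the main advantage over a fixed sleep.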

JavaScript dialog boxes like alerts, confirms, and prompts are common in modern web apps. Watir allows you to handle these via the Alert class:

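browser.alert.ok                # accept the dialog (click OK)
browser.alert.close             # dismiss the dialog (click Cancel)
browser.alert.text              # read the dialog's message
browser.alert.set 'some text'   # type a reply into a prompt dialog

The ok method accepts a dialog, while close dismisses it. The text method returns the message shown, and set enters a value into a prompt before you accept it.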
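
These calls are often combined into one small helper so a script can deal with a dialog only when one is actually open. A minimal sketch, assuming browser is a Watir::Browser; resolve_dialog is a hypothetical name, and exists? is used to check whether a dialog is showing.

```ruby
# Accept or dismiss a JavaScript dialog if one is open and return its text.
# Returns nil when no dialog is showing.
def resolve_dialog(browser, accept: true, reply: nil)
  dialog = browser.alert
  return nil unless dialog.exists?

  message = dialog.text
  dialog.set(reply) if reply # only meaningful for prompt dialogs
  accept ? dialog.ok : dialog.close
  message
end

# Usage with a live browser (not run here):
#   message = resolve_dialog(browser, accept: false)
#   puts "Dismissed dialog: #{message}" if message
```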

Example Automation Script

Let's bring together the concepts we've covered so far into an example script that searches Wikipedia and captures some data from the results. We'll use the English version of Wikipedia and search for articles related to "Browser Automation". The code below demonstrates the following steps:

  1. Launch a browser and navigate to wikipedia.org
  2. Click the "English" link to go to the English version
  3. Find the search box and enter "Browser Automation"
  4. Click the search button to submit the query
  5. Wait for results to load and print the number of results
  6. Loop through search results and print titles and URLs
  7. Take a screenshot of the results page
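  8. Close the browser

require 'watir'

# Steps 1-2: open Wikipedia and switch to the English edition
browser = Watir::Browser.new
browser.goto 'https://www.wikipedia.org'

browser.link(text: 'English').click

# Steps 3-4: enter the query and submit it
search_bar = browser.text_field(name: 'search')
search_bar.set 'Browser Automation'

browser.button(name: 'go').click

# Step 5: wait for the results container, then print its first paragraph
browser.div(id: 'mw-content-text').wait_until(&:present?)
puts "Results found: #{browser.div(id: 'mw-content-text').p.text}"

# Step 6: print the title and URL of each search result
browser.divs(class: 'mw-search-result-heading').each do |result|
  puts result.link.text
  puts result.link.href
end

# Steps 7-8: capture the page and shut the browser down
browser.screenshot.save 'wikipedia-results.png'

browser.quit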

This script demonstrates the core Watir APIs and techniques you'll use for any kind of browser automation. It can easily be adapted for data collection, application monitoring, or testing workflows.
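
One weakness of a straight-line script is that an error partway through skips browser.quit and leaves a browser window (and driver process) running. A common remedy is to wrap the session in a method that quits in an ensure clause. The sketch below shows one way to do that; with_browser is a hypothetical helper, not a Watir API, and it takes a lambda so the browser is only created when the method runs.

```ruby
# Yield a browser built by `factory` and always quit it afterwards, even if
# the block raises.
def with_browser(factory)
  browser = factory.call
  yield browser
ensure
  browser&.quit
end

# Usage with a real browser (needs the watir gem and a driver):
#   require 'watir'
#   with_browser(-> { Watir::Browser.new :chrome }) do |browser|
#     browser.goto 'https://www.wikipedia.org'
#     browser.screenshot.save 'wikipedia-home.png'
#   end
```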

Tips for Robust Browser Automation

Browser automation is a powerful tool, but scripts can be brittle. Here are some tips to keep in mind as you write your own automation code.

Use Explicit Waits
We touched on using wait methods like wait_until above. Use these judiciously in your script rather than hard-coded sleeps or pauses. Well-placed explicit waits make your scripts faster and more resilient to changing page load times.

Handle Exceptions
Web pages are dynamic and don't always load the same way every time. Use begin/rescue blocks to catch exceptions that may be thrown if an element doesn't exist or an action fails. Having proper exception handling will keep your script from hanging or exiting prematurely.
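
For failures that are likely to be transient, a begin/rescue block can be paired with retry. The helper below is a sketch in plain Ruby; with_retries and its parameters are invented for this example. In a Watir script you would typically pass the exception classes you expect, such as Watir::Wait::TimeoutError.

```ruby
# Run a block, retrying when one of the listed error classes is raised.
# Re-raises the last error once the attempts are used up.
def with_retries(attempts: 3, wait: 0.5, on: [StandardError])
  tries = 0
  begin
    tries += 1
    yield tries
  rescue *on => e
    raise if tries >= attempts

    warn "Attempt #{tries} failed (#{e.class}: #{e.message}); retrying"
    sleep wait
    retry
  end
end

# Usage with a live browser (not run here):
#   with_retries(on: [Watir::Wait::TimeoutError]) do
#     browser.link(text: 'Slow Loading Link').click
#   end
```

Keep the list of rescued classes narrow. Retrying on every StandardError can hide real bugs in the script itself.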

Modularize Scripts
As you start to write more complex automation workflows, your scripts can become long and difficult to maintain. Break up your code into smaller, reusable functions. Consider using a page object model to encapsulate page-specific logic.
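
As a sketch of what a page object might look like, the class below wraps the Wikipedia search flow from the example script. The class and method names are made up for illustration, and the locators are copied from that script, so they are assumptions about Wikipedia's markup that may need adjusting.

```ruby
# A minimal page object: callers say what they want to do, and the class
# keeps the locators in one place.
class WikipediaSearchPage
  URL = 'https://www.wikipedia.org'

  def initialize(browser)
    @browser = browser
  end

  # Open the portal and switch to the English edition.
  def visit
    @browser.goto URL
    @browser.link(text: 'English').click
    self
  end

  # Type a query into the search box and submit it.
  def search(query)
    @browser.text_field(name: 'search').set query
    @browser.button(name: 'go').click
    self
  end

  # Titles of the results currently shown.
  def result_titles
    @browser.divs(class: 'mw-search-result-heading').map { |div| div.link.text }
  end
end

# Usage with a live browser (not run here):
#   page = WikipediaSearchPage.new(Watir::Browser.new)
#   puts page.visit.search('Browser Automation').result_titles
```

If the site changes a locator, only this class needs updating rather than every script that performs a search.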

Add Logging
When running automation headlessly (i.e. without the browser UI), it can be difficult to tell what's going on. Use puts statements or a logging library to output useful debug information as your script runs. This will make it much easier to diagnose problems.
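
Ruby's standard Logger is usually enough for this. The sketch below wraps each step of a script so that its start, duration, and any failure are recorded; log_step is a hypothetical helper name.

```ruby
require 'logger'

# Wrap a step of the script so its start, duration, and any failure are
# logged. The error is re-raised so the caller can still handle it.
def log_step(logger, name)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  logger.info("START #{name}")
  result = yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  logger.info(format('DONE  %s (%.2fs)', name, elapsed))
  result
rescue StandardError => e
  logger.error("FAIL  #{name}: #{e.class}: #{e.message}")
  raise
end

logger = Logger.new($stdout)
log_step(logger, 'example step') { 1 + 1 }

# Usage with a live browser (not run here):
#   log_step(logger, 'open Wikipedia') { browser.goto 'https://www.wikipedia.org' }
```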

Conclusion

Watir is a powerful tool for automating browser interactions with Ruby. In this guide, we've covered the fundamentals of driving a browser with Watir and walked through an example script to search Wikipedia. The Watir API provides a robust set of methods to find, interact with, and extract data from elements on a page.

As you've seen, browser automation opens up a lot of possibilities for streamlining repetitive web-based tasks. With a little bit of Ruby knowledge, you can write scripts to scrape data, run tests, or monitor web applications.

This tutorial only scratches the surface of what you can do with Watir. To learn more, check out the Watir project documentation.

Happy automating! The possibilities are endless.
