What is Playwright? Complete guide with code examples

Playwright is an open-source browser automation library created by Microsoft. It drives Chromium, Firefox and WebKit through a single API; the original implementation runs on Node.js, and official bindings exist for other languages. In just a few years, it has become one of the most popular frameworks for web automation, testing and scraping.

In this complete guide, we’ll explore what makes Playwright such an awesome tool and how you can use it for everything from end-to-end testing to data extraction.

Why use Playwright?

Playwright allows you to control a browser programmatically with just a few lines of code. This makes it ideal for:

Automating interactions with web pages
End-to-end testing web apps
Web scraping and data mining
Taking screenshots of web pages
Testing browser compatibility of web apps

Compared to similar tools like Puppeteer, Selenium and Cypress, Playwright has some unique advantages:

Support for multiple browsers – Playwright can automate Chromium, Firefox and WebKit.
Multiple language support – You can use Playwright with JavaScript, Python, C# and Java.
Auto-wait mechanism – Playwright automatically waits for elements to be ready before interacting with them, removing most of the need for custom wait logic.
Tracing and debugging – Playwright can record traces, screenshots and videos for debugging.
Mobile emulation – Playwright can emulate mobile devices (viewport, user agent, touch support) to approximate Safari on iOS and Chrome on Android.
Cross-platform – Playwright works on Windows, macOS and Linux.

With its speed, reliability and ease of use, Playwright helps you focus on writing tests and automation scripts rather than dealing with waits, sleeps and flaky selectors.

Is Playwright a headless browser?

Playwright is not a headless browser itself, but it can run Chromium, Firefox and WebKit in headless mode.

By default, Playwright launches browsers in headless mode, without a GUI. This means you won't see a browser window open: the scripts execute behind the scenes.

Headless execution makes tests faster and ideal for CI/CD pipelines. But during development, you can launch browsers in headful mode with:

const browser = await chromium.launch({ headless: false });

This allows you to see the browser and actions as scripts run.
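One way to avoid editing the script each time you switch modes is to read the setting from the environment. The sketch below assumes two environment variables, HEADED and SLOW_MO, which are names invented for this example rather than anything Playwright defines; headless and slowMo themselves are launch options.

```javascript
// Hypothetical helper: derive launch options from environment variables.
// HEADED and SLOW_MO are names made up for this sketch.
function launchOptionsFromEnv(env = process.env) {
  const headed = env.HEADED === '1';
  return {
    headless: !headed,
    // slowMo delays each Playwright operation by this many milliseconds,
    // which is only useful when you are watching the browser
    slowMo: headed ? Number(env.SLOW_MO) || 0 : 0,
  };
}

async function main() {
  const { chromium } = require('playwright');
  const browser = await chromium.launch(launchOptionsFromEnv());
  const page = await browser.newPage();
  await page.goto('https://playwright.dev/');
  console.log(await page.title());
  await browser.close();
}

// Uncomment to run against a real browser:
// main().catch(console.error);
```

With the last line uncommented, running the file with HEADED=1 SLOW_MO=250 set would open a visible window and slow each step down, while a plain run stays headless.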

How does Playwright compare to Puppeteer and Selenium?

Puppeteer is built primarily around Chromium, with Firefox support added later. It is limited to JavaScript/TypeScript.

Selenium supports multiple languages and drives all major browsers (Chrome, Edge, Firefox and Safari) through the WebDriver protocol, using a separate driver for each browser.

Playwright supports Chromium, Firefox and WebKit with a single API. You can use it with JavaScript/TypeScript, Python, C# and Java.

A major practical advantage over Selenium is Playwright's auto-wait mechanism. Playwright automatically waits for elements to be ready before interacting with them, removing most of the need for explicit waits.

What languages can you use with Playwright?

Some of the most popular programming languages are supported by Playwright:

JavaScript/TypeScript – The original Playwright library is written in TypeScript and runs on Node.js.
Python – Playwright for Python enables Python test automation.
C# – Playwright for .NET allows you to write automated tests in C#.
Java – Playwright for Java supports Java test automation.

This cross-language support makes Playwright flexible for polyglot engineering teams.

What platforms does Playwright support?

Playwright works across three platforms:

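Windows – Current releases target Windows 10 and later, plus recent Windows Server versions; Windows 7 and 8 are no longer supported. WSL is also supported.
macOS – Recent macOS versions are supported. The minimum version rises over time, so check the system requirements in the documentation.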
Linux – Most Linux distros work but you may need additional dependencies.

So you can run Playwright scripts on your local development machine as well as CI/CD servers.

Getting started with Playwright

The best place to get started is Playwright's official documentation.

You'll find comprehensive guides on installing, setting up and using Playwright for your programming language. We'll summarize the key steps below.

Install Playwright

Using NPM

For Node.js, install Playwright via NPM:

npm init playwright@latest

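The initializer asks a few questions (language, test folder, whether to install browsers), then downloads browser binaries for Chromium, Firefox and WebKit. It also creates a tests directory with a sample script and a Playwright config file so you can start writing tests.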

Using other package managers

You can also install Playwright using:

Pip for Python
NuGet for .NET
Maven for Java

Documentation contains instructions for all package managers.

Install dependencies

Some additional dependencies like .NET Core or JDK may be needed depending on your environment. Refer to the docs for OS-specific requirements.

Install VS Code extension

Playwright's VS Code extension provides editor IntelliSense, debugger integration and code snippets. Search "Playwright" in the extensions marketplace to install.

Write your first test script

The NPM installation creates a sample tests/example.spec.js script you can execute:

const { test, expect } = require('@playwright/test');

test('basic test', async ({ page }) => {
  await page.goto('https://playwright.dev/');
  const title = page.locator('.navbar__inner .navbar__title');
  await expect(title).toHaveText('Playwright');
});

Run the script:

npx playwright test

This will launch browsers in headless mode and run the test.
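Which browsers the runner launches is controlled by the projects list in the Playwright config file. A minimal sketch of such a file might look like the following; the project names and the tests directory are assumptions for illustration, and the file the initializer generates may differ.

```javascript
// playwright.config.js (sketch)
// A plain object is used here; the generated file typically wraps it in defineConfig().
const config = {
  testDir: './tests',
  // Each project runs the whole suite in one browser engine
  projects: [
    { name: 'chromium', use: { browserName: 'chromium' } },
    { name: 'firefox', use: { browserName: 'firefox' } },
    { name: 'webkit', use: { browserName: 'webkit' } },
  ],
};

module.exports = config;
```

With a config like this, npx playwright test --project=firefox would restrict a run to a single engine.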

Debugging Playwright scripts

To debug tests and see browser execution, use:

npx playwright test --debug

You can also debug individual test files:

npx playwright test tests/example.spec.js --debug

The Playwright Inspector will launch, allowing you to view browser execution and debug tests.

Why use Playwright for web automation and testing?

Let's look at some of the top reasons to use Playwright for test automation and interacting with web pages.

1. Fast execution with DevTools protocol

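Selenium-style tools drive browsers through the WebDriver protocol, which sends each command as an HTTP request to a separate driver process.

Playwright instead keeps a persistent connection to the browser and talks to it over the engine's own debugging protocol (the Chrome DevTools Protocol in the case of Chromium). Avoiding a per-command round trip through a driver generally makes browser actions faster.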

2. Hassle-free auto-wait mechanism

Playwright's auto-wait mechanism is a game-changer. You rarely have to add custom waits in your script: before clicking or typing, Playwright automatically waits for the target element to be ready (for example visible and enabled).

For example, the following code waits for the button to become actionable and then clicks it, without any explicit wait statement:

// Wait and click automatically
await page.click('button#submit');

This makes tests flow smoothly without unnecessary waits bogging down execution.
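The same behaviour applies to locators, and each action accepts a timeout that caps how long the automatic wait may last. The sketch below is a hypothetical helper: the selectors and the five-second limit are placeholders chosen for illustration.

```javascript
// Hypothetical helper; the selectors are placeholders for this sketch.
async function submitForm(page, email) {
  // fill() waits until the input is visible, enabled and editable
  await page.locator('input[name="email"]').fill(email);

  // click() waits until the button is visible, stable, enabled and
  // able to receive events, and gives up after 5 seconds
  await page.locator('button#submit').click({ timeout: 5000 });
}
```

If the button never becomes actionable, the click rejects with a timeout error instead of hanging, so a failing test reports the step that got stuck.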

3. Generate scripts effortlessly with Codegen

Playwright's code generator records your interactions with a site and produces an automation script:

npx playwright codegen wikipedia.org

You can replay the script as-is or customize it further, which lets you draft tests quickly without writing every step by hand.

4. Built-in tracing and debugging

Playwright supports step-by-step debugging of scripts via the Playwright Inspector. With tracing enabled, you can also open the recorded trace of a failed run in the Trace Viewer and step through it to diagnose issues.

Rich trace logs, screenshots and videos give you full visibility into test execution.
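Outside the test runner, a trace can be recorded through the tracing object on a browser context. The sketch below wraps that in a hypothetical helper named withTrace; the helper name and the trace.zip path are assumptions for this example.

```javascript
// Hypothetical helper: record a trace around any sequence of page actions.
async function withTrace(context, tracePath, action) {
  await context.tracing.start({ screenshots: true, snapshots: true });
  try {
    return await action();
  } finally {
    // Always write the trace file, even if the action throws
    await context.tracing.stop({ path: tracePath });
  }
}

async function main() {
  const { chromium } = require('playwright');
  const browser = await chromium.launch();
  const context = await browser.newContext();
  const page = await context.newPage();
  await withTrace(context, 'trace.zip', async () => {
    await page.goto('https://playwright.dev/');
  });
  await browser.close();
}

// Uncomment to run against a real browser:
// main().catch(console.error);
```

The resulting file can then be opened with npx playwright show-trace trace.zip.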

5. Emulate mobile browsers and devices

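Test your responsive web app on multiple mobile devices with Playwright's device emulation:

// Emulate iPhone X
const { devices } = require('playwright');

// Device settings are applied when the browser context is created
const context = await browser.newContext({ ...devices['iPhone X'] });
const page = await context.newPage();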

You can preconfigure mobile and desktop emulation in your tests.

6. Detailed test reports

Playwright offers test reports in HTML, JSON and JUnit XML formats.

The HTML report provides a web dashboard with stats, test durations, failures, errors and screenshots.

JSON is great for integrating with other tools. JUnit XML allows importing results into CI/CD systems.
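Reporters are selected in the Playwright config file, and several can run at once. The sketch below shows only the reporter section; the output paths are assumptions for illustration and would be merged into whatever config your project already has.

```javascript
// playwright.config.js (sketch): reporter settings only
const config = {
  reporter: [
    // HTML dashboard, written to disk without opening a browser afterwards
    ['html', { open: 'never' }],
    // Machine-readable results for other tools
    ['json', { outputFile: 'results/report.json' }],
    // JUnit XML for CI systems that understand that format
    ['junit', { outputFile: 'results/junit.xml' }],
  ],
};

module.exports = config;
```

After a run, npx playwright show-report opens the HTML report locally.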

Using Playwright for web scraping

In addition to testing, Playwright is invaluable for web scraping and data mining tasks where rendered JavaScript is needed.

Here are some key benefits for using Playwright for scraping:

Load dynamic pages

Many sites use JavaScript to load content. Unlike request-based scrapers, Playwright can execute page JavaScript to render fully dynamic pages.

This example waits until dynamic content loads before scraping:

// Wait for #products to load
await page.waitForSelector('#products');

// Extract product data
const products = await page.$$eval('#products .product', nodes => {
  // Collect data from nodes
  return nodes.map(node => {
    return {
      name: node.querySelector('.name').textContent,
      price: node.querySelector('.price').textContent
    }
  })
})

Much more accurate than scraping the initial empty HTML!
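Text pulled out of the DOM usually carries stray whitespace and currency symbols, so a small clean-up step is often worthwhile. The sketch below is one possible shape for it: normalizeProduct and scrapeProducts are hypothetical helpers, the selectors are the ones from the example above, and the price parsing assumes a format like $1,299.00.

```javascript
// Hypothetical helper: tidy one scraped row.
// Assumes prices use "," for thousands and "." for decimals.
function normalizeProduct(raw) {
  const name = (raw.name || '').trim();
  const digits = (raw.price || '').replace(/[^0-9.]/g, '');
  return { name, price: digits ? Number(digits) : null };
}

async function scrapeProducts(page) {
  // Wait until the dynamically rendered container exists
  await page.waitForSelector('#products');

  // Runs in the browser; optional chaining avoids a crash on missing nodes
  const rows = await page.$$eval('#products .product', nodes =>
    nodes.map(node => ({
      name: node.querySelector('.name')?.textContent,
      price: node.querySelector('.price')?.textContent,
    }))
  );

  // Runs in Node.js on the plain objects returned from the page
  return rows.map(normalizeProduct);
}
```

Keeping the clean-up in Node.js rather than inside the page callback makes it easy to unit test without a browser.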

Bypass anti-bot solutions

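Because Playwright drives real browser engines, pages see a genuine browser environment: JavaScript runs, cookies persist, and rendering behaves as it would for a visitor. That gets past simple checks that only block clients unable to run JavaScript. It does not defeat dedicated anti-bot systems on its own, since automated and headless browsers can still be detected.

This matters most when scraping JavaScript-heavy sites that serve nothing useful to plain HTTP clients.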

Built-in stealth options

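Playwright has no dedicated stealth mode, and an option like acceptDownloads: true only controls how downloads are handled. What you can do is set context options such as userAgent, viewport, locale and timezoneId so the session resembles an ordinary desktop browser. These settings reduce obvious mismatches but do not guarantee that detection systems will be fooled.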
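Context options such as viewport, locale and timezoneId are passed when the browser context is created. A minimal sketch follows, assuming a hypothetical buildContextOptions helper and default values picked purely for illustration.

```javascript
// Hypothetical defaults for this sketch; pick values that match your use case.
const DEFAULT_PROFILE = {
  viewport: { width: 1366, height: 768 },
  locale: 'en-US',
  timezoneId: 'America/New_York',
};

// Merge caller overrides on top of the defaults
function buildContextOptions(overrides = {}) {
  return { ...DEFAULT_PROFILE, ...overrides };
}

async function main() {
  const { chromium } = require('playwright');
  const browser = await chromium.launch();
  const context = await browser.newContext(
    buildContextOptions({ locale: 'de-DE', timezoneId: 'Europe/Berlin' })
  );
  const page = await context.newPage();
  await page.goto('https://playwright.dev/');
  // Print the language the page sees for this context
  console.log(await page.evaluate(() => navigator.language));
  await browser.close();
}

// Uncomment to run against a real browser:
// main().catch(console.error);
```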

Scrape across browsers

Test scraping scripts across Chromium, Firefox and WebKit to ensure they work reliably cross-browser. Playwright makes this easy by using a single API for all browsers.

Scrape at scale

Playwright scripts can be scaled across multiple machines thanks to its headless execution and cross-platform support.

Just containerize the scripts and deploy across a Kubernetes cluster to distribute load.
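Within each container, it usually also helps to cap how many pages are open at once. The sketch below shows one way to do that with a small worker pool; mapWithConcurrency and scrapeTitles are hypothetical helpers written for this example, and the default limit of 3 is an arbitrary choice.

```javascript
// Hypothetical helper: run worker over items with at most `limit` in flight.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each runner keeps pulling the next unclaimed index until none remain
  async function run() {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }

  const runners = Array.from({ length: Math.min(limit, items.length) }, () => run());
  await Promise.all(runners);
  return results;
}

// Hypothetical usage: fetch page titles with a shared browser instance
async function scrapeTitles(browser, urls, limit = 3) {
  return mapWithConcurrency(urls, limit, async (url) => {
    // A fresh context gives every URL isolated cookies and storage
    const context = await browser.newContext();
    try {
      const page = await context.newPage();
      await page.goto(url);
      return { url, title: await page.title() };
    } finally {
      await context.close();
    }
  });
}
```

Results come back in the same order as the input URLs, regardless of which page finished first.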

Scraping websites using Playwright

Let's walk through a web scraping example with Playwright in Node.js.

We'll extract data about programming topics from GitHub's topic page.

Install Playwright

npm init playwright@latest

Launch browser

Launch Chromium and navigate to the URL:

const { chromium } = require('playwright');

(async () => {

  const browser = await chromium.launch();
  const page = await browser.newPage();

  await page.goto('https://github.com/topics');

})();

Extract topics

Use page functions to extract the topic elements into an array:

// Extract topics (place this inside the async function, after page.goto)
const topics = await page.$$eval('.topic-box', nodes => {
  return nodes.map(node => {
    return {
      name: node.querySelector('.f3').textContent,
      repoCount: node.querySelector('.f6').textContent
    }
  })
})

console.log(topics);

await browser.close();

This will output all the topic names and repository counts!
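Put together, the steps above might look like the sketch below. The selectors are the ones used in this walkthrough and may no longer match GitHub's markup; the scrapeTopics function name, the whitespace trimming and the try/finally around the browser are additions made for this example.

```javascript
// Hypothetical helper: load the topics page and return cleaned-up rows.
async function scrapeTopics(page) {
  await page.goto('https://github.com/topics');

  // Runs in the browser; missing nodes yield empty strings instead of throwing
  const raw = await page.$$eval('.topic-box', nodes =>
    nodes.map(node => ({
      name: node.querySelector('.f3')?.textContent ?? '',
      repoCount: node.querySelector('.f6')?.textContent ?? '',
    }))
  );

  // Trim the whitespace that surrounds text in rendered markup
  return raw.map(topic => ({
    name: topic.name.trim(),
    repoCount: topic.repoCount.trim(),
  }));
}

async function main() {
  const { chromium } = require('playwright');
  const browser = await chromium.launch();
  try {
    const page = await browser.newPage();
    console.log(await scrapeTopics(page));
  } finally {
    // Close the browser even if scraping fails
    await browser.close();
  }
}

// Uncomment to run against a real browser:
// main().catch(console.error);
```

If the script prints an empty array, the most likely cause is that the page's class names have changed and the selectors need updating.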

The full code is available on GitHub.

Conclusion

With its speed, reliability and language support, it's easy to see why Playwright is becoming the tool of choice for web automation, testing and scraping.

Its auto-wait mechanism, built-in mobile emulation, cross-browser support and concise API make Playwright a delight to work with.

If you found this guide useful, check out our Playwright tutorials on scraping, automation and end-to-end testing to unlock the full power of Playwright!
