How to Load Local Files in Playwright
Playwright is a popular browser automation library for web scraping and end-to-end testing. One of the handy features of Playwright is the ability to load local files on your filesystem instead of making requests to remote servers. This allows you to test and debug your Playwright scripts offline using local test pages.
In this comprehensive guide, we'll cover everything you need to know about loading local files in Playwright, including:
- What are the benefits of loading local files?
- How to load HTML, JSON, images, and other files.
- Tips for handling relative file paths.
- Example code snippets for Python, JavaScript, and TypeScript.
- Common pitfalls and troubleshooting advice.
- Best practices for using local files in a CI/CD pipeline.
After reading, you'll know how to work with local files in Playwright for mocking responses, building scrapers, and more.
Benefits of Loading Local Files
Here are some of the main benefits of loading local files in Playwright:
- Work offline: Test and develop scripts without an internet connection. No need to rely on remote servers being available.
- Faster performance: Fetching from the local disk is faster than making network requests.
- Control test data: Have full control over test file contents instead of relying on unpredictable live sites.
- Mock responses: Stub remote API responses with local JSON files.
- Privacy: Avoid sending requests to third-party sites during development.
- Prototype scrapers: Build scrapers against a local HTML copy before targeting live sites.
- Consistent tests: Local files behave the same every time, giving reliable automated tests.
For these reasons, loading local files can boost productivity and test stability when working with Playwright.
Loading HTML Files
To load an HTML file in Playwright, use a file:// URL and provide the absolute file path.
Here is an example in Python:
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    # Navigate to a local file through an absolute file:// URL
    page.goto("file:///home/user/local-test.html")
    print(page.content())
    browser.close()
And in JavaScript:
const { chromium } = require('playwright');

(async () => {
  const browser = await chromium.launch();
  const page = await browser.newPage();
  await page.goto('file:///home/user/local-test.html');
  console.log(await page.content());
  await browser.close();
})();
The key things to note are:
- Use a file:// URL. Followed by an absolute path, this gives three slashes in a row, as in file:///home/user/local-test.html.
- Specify the absolute file path, not a relative path.
- The path should point to a static HTML file, not a directory.
- Playwright will load and process the HTML just like a normal web page.
Once loaded, you can interact with the DOM, call page methods like page.click(), assert contents, and more.
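Hand-writing file:// URLs is error-prone, especially when a path contains spaces or the script runs on several operating systems. Here is a minimal sketch that builds the URL with Python's standard pathlib instead. The helper names to_file_url and open_local are hypothetical, not part of Playwright, and page is assumed to be an existing Playwright page.

```python
from pathlib import Path


def to_file_url(path: str) -> str:
    """Return an absolute file:// URL for a local path.

    to_file_url is a hypothetical helper, not a Playwright API.
    """
    # resolve() makes the path absolute; as_uri() adds the file:// scheme
    # and percent-encodes characters such as spaces.
    return Path(path).resolve().as_uri()


def open_local(page, path: str) -> None:
    """Navigate an existing Playwright page to a local file."""
    page.goto(to_file_url(path))
```

With this in place, open_local(page, "local-test.html") should navigate to the file relative to the current working directory, without hardcoding /home/user.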
Accessing Relative File Paths
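When you load a local HTML file through a file:// URL, relative paths inside the page resolve against the location of that file. For example:
<!-- local-test.html -->
<img src="images/logo.png">
If the page was opened from file:///home/user/local-test.html, the browser requests file:///home/user/images/logo.png. The image fails to load if that file is missing, or if the HTML was injected with page.set_content(), in which case the page has no file location to resolve against.
Separately, you can set base_url (baseURL in JavaScript) when creating the browser context, so that relative URLs passed to methods such as page.goto() resolve against a folder:
context = browser.new_context(base_url="file:///home/user/")
This option applies to the URLs you pass to Playwright; it does not change how the page resolves its own resources.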
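To check which absolute URL a relative reference maps to for a given base, you can compute it outside the browser. The sketch below uses urljoin from Python's standard library, which applies the usual URL resolution rules. It is an illustration only, not Playwright code, and the base URL is the example path used above.

```python
from urllib.parse import urljoin

# The URL the page was loaded from (example path from this guide)
page_url = "file:///home/user/local-test.html"

# Resolve a relative reference the same way a src or href would be resolved
logo_url = urljoin(page_url, "images/logo.png")
print(logo_url)  # → file:///home/user/images/logo.png
```

If the computed URL does not point at a file that exists on disk, the resource will fail to load in the browser as well.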
Loading JSON Files
Local JSON files can be useful for stubbing API responses during development.
To answer a request with mock JSON in Playwright, use page.route() to intercept the network request:
import json

page.route("**/data.json", lambda route: route.fulfill(
    content_type="application/json",
    body=json.dumps({
        "mock_key": "mock_response"
    }),
))
Now any request whose URL ends in /data.json will be fulfilled with the mock JSON data.
The same approach works for mocking images, PDFs, or any other file type: pass the file's bytes as body, or pass the path argument of route.fulfill() so that Playwright reads the file from disk.
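The mock above is defined inline. To keep it in a file on disk instead, one option is a small handler factory like the sketch below. make_json_handler and the file name in the usage comment are hypothetical; only route.fulfill() is Playwright API.

```python
import json
from pathlib import Path


def make_json_handler(json_path: str):
    """Build a page.route() handler that answers with a local JSON file.

    make_json_handler is a hypothetical helper, not a Playwright API.
    """
    # Parse once up front so that a malformed fixture fails early,
    # before any request is intercepted.
    payload = json.loads(Path(json_path).read_text(encoding="utf-8"))

    def handler(route):
        route.fulfill(
            status=200,
            content_type="application/json",
            body=json.dumps(payload),
        )

    return handler


# Usage with an existing Playwright page (file name is an assumption):
# page.route("**/data.json", make_json_handler("mock-api-response.json"))
```

Parsing the file before registering the route is a design choice: a broken fixture then surfaces as a clear JSON error rather than as a confusing failure inside the page.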
Loading Images and Other Files
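To load a local image or file, set it as the src of an HTML element in a page that was itself loaded from a file:// URL:
<!-- local-test.html -->
<img src="file:///home/user/image.png">
Or embed the file's bytes in the page content as a base64 data URL, since page.set_content() accepts an HTML string rather than raw bytes:
import base64

with open("image.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("ascii")
page.set_content(f'<img src="data:image/png;base64,{image_data}">')
This will render the image.
For files like PDFs that you serve through page.route(), you may need to set the appropriate Content-Type header, which is the content_type argument of route.fulfill().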
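When serving arbitrary local files through page.route(), the content type can be guessed from the file extension. This is a rough sketch under that assumption: make_file_handler is a hypothetical helper, and the guess comes from Python's mimetypes module rather than from Playwright.

```python
import mimetypes
from pathlib import Path


def make_file_handler(file_path: str):
    """Build a page.route() handler that serves a local file as bytes.

    make_file_handler is a hypothetical helper, not a Playwright API.
    """
    # Guess the Content-Type from the extension, e.g. .pdf -> application/pdf
    content_type, _ = mimetypes.guess_type(file_path)
    if content_type is None:
        content_type = "application/octet-stream"  # generic binary fallback

    # Read in binary mode so the bytes are passed through unchanged
    body = Path(file_path).read_bytes()

    def handler(route):
        route.fulfill(status=200, content_type=content_type, body=body)

    return handler


# Usage with an existing Playwright page (pattern and file are assumptions):
# page.route("**/doc.pdf", make_file_handler("doc.pdf"))
```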
Playwright Code Examples
Here are short snippets for loading different file types in Playwright. Each one assumes an existing page object.
HTML
Python
page.goto("file:///home/user/local-test.html")
JavaScript
await page.goto('file:///home/user/local-test.html');
TypeScript
await page.goto('file:///home/user/local-test.html');
JSON
Python
import json

page.route("**/data.json", lambda route: route.fulfill(
    content_type="application/json",
    body=json.dumps({"mock_key": "mock_value"}),
))
JavaScript
await page.route('**/data.json', async route => {
  await route.fulfill({
    contentType: 'application/json',
    body: JSON.stringify({ mock_key: 'mock_value' }),
  });
});
TypeScript
await page.route('**/data.json', async route => {
  await route.fulfill({
    contentType: 'application/json',
    body: JSON.stringify({ mock_key: 'mock_value' }),
  });
});
Images
Python
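import base64

# set_content() expects an HTML string, so embed the image as a data URL
encoded = base64.b64encode(open("image.png", "rb").read()).decode("ascii")
page.set_content(f'<img src="data:image/png;base64,{encoded}">')
JavaScript
const fs = require('fs');

const imgBase64 = fs.readFileSync('image.png').toString('base64');
await page.setContent(`<img src="data:image/png;base64,${imgBase64}">`);
TypeScript
import * as fs from 'fs';

const imgBase64 = fs.readFileSync('image.png').toString('base64');
await page.setContent(`<img src="data:image/png;base64,${imgBase64}">`);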
PDFs
Python
# Serve a local PDF whenever the page requests a URL ending in /doc.pdf
page.route("**/doc.pdf", lambda route: route.fulfill(
    path="doc.pdf",
    content_type="application/pdf",
))
JavaScript
// Serve a local PDF whenever the page requests a URL ending in /doc.pdf
await page.route('**/doc.pdf', route => route.fulfill({
  path: 'doc.pdf',
  contentType: 'application/pdf',
}));
TypeScript
// Serve a local PDF whenever the page requests a URL ending in /doc.pdf
await page.route('**/doc.pdf', route => route.fulfill({
  path: 'doc.pdf',
  contentType: 'application/pdf',
}));
As you can see, the approach is very similar across languages – the main differences are in how you read the file data.
Troubleshooting Local File Loading
Here are some common issues and solutions when loading local files with Playwright:
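File Not Found
- A file:// URL has no HTTP status code, so a missing file shows up as a navigation error (net::ERR_FILE_NOT_FOUND in Chromium) rather than a 404.
- Double check the file path is absolute, not relative.
- Verify the file exists at that location on disk.
- Check filename case; paths are case-sensitive on Linux and can be on macOS.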
Cross-Origin Request Blocked
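- This occurs if your test page makes fetch or XHR requests from a file:// page, whether to a remote server or to another local file, because the browser treats them as cross-origin. Start by loading only local resources through tags such as img and script, or serve the files over HTTP.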
Mixed Content Warnings
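- Mixed content warnings apply to pages served over https:// that load http:// resources. A page opened through a file:// URL is not served over HTTPS, so loading the test page from file:// instead of https:// should avoid them.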
Allow File Access in Chrome
- Chrome may block local file access unless you start it with the --allow-file-access-from-files flag.
Sandbox Issues
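- Some environments like Docker restrict what the browser can do, and the file must also exist inside the container (mount or copy it in). You may need to launch Chrome with --no-sandbox, keeping in mind that this disables the browser sandbox.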
Relative Paths Not Working
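- Load the page with page.goto() and a file:// URL so that relative paths resolve against the file's folder, and check that the referenced files exist there. A base URL on the browser context only affects the URLs you pass to Playwright methods.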
Encoding Issues
- Binary file contents can be corrupted when read as text. Open such files in binary mode and handle them as byte buffers.
Checking for these common problems will help resolve most local file loading issues.
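Several of the path checks above can be automated before calling page.goto(). The following is a minimal sketch of such a pre-flight check. diagnose_local_path is a hypothetical helper, and the messages it returns are illustrative rather than Playwright error text.

```python
from pathlib import Path
from typing import List


def diagnose_local_path(path: str) -> List[str]:
    """Return a list of likely problems with a local file path.

    An empty list means none of the checks below found an issue.
    """
    problems: List[str] = []
    candidate = Path(path)

    if not candidate.is_absolute():
        problems.append(f"path is relative: {path}")

    if candidate.is_dir():
        problems.append(f"path is a directory, not a file: {path}")
    elif not candidate.exists():
        problems.append(f"file does not exist: {path}")
        # On case-sensitive filesystems, Page.html and page.html differ,
        # so look for a sibling whose name matches apart from case.
        parent = candidate.parent
        if parent.is_dir():
            for sibling in parent.iterdir():
                if sibling.name.lower() == candidate.name.lower():
                    problems.append(
                        f"similar name with different case: {sibling.name}"
                    )

    return problems
```

Calling diagnose_local_path("/home/user/local-test.html") before navigation and printing any returned messages can shorten the debugging loop for path mistakes.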
Local File Best Practices
Here are some best practices to follow when using local files in Playwright:
- Keep production and test code separate – Don't use local files in your main codebase. Only use them in tests.
- Commit local files to source control – Add your local test files to Git/GitHub to share with other developers.
- Use descriptive filenames – Like mock-api-response.json instead of file1.json.
- Load once, reuse everywhere – Load local files in setup hooks, such as before() in Mocha or beforeAll() in Playwright Test, and reuse them across tests.
- Use variables for paths – Avoid hardcoding file paths; use variables like LOCAL_HTML_PATH instead.
- Serve files locally – For full end-to-end tests, run a local dev server to serve test files.
- Clean up when done – Delete temporary local files after your test run finishes.
By following these tips, you can robustly incorporate local files into your Playwright test suites.
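The "clean up when done" tip can be handled automatically with a context manager. The sketch below writes a temporary HTML file, yields its file:// URL, and removes the file when the block exits. temporary_page is a hypothetical helper built only on Python's standard library.

```python
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def temporary_page(html: str, name: str = "page.html"):
    """Write html to a temporary file and yield its file:// URL.

    The temporary directory and the file are deleted when the
    with block exits, even if the test body raises.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        page_path = Path(tmp_dir) / name
        page_path.write_text(html, encoding="utf-8")
        yield page_path.resolve().as_uri()


# Usage with an existing Playwright page:
# with temporary_page("<h1>Hello</h1>") as url:
#     page.goto(url)
```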
Using Local Files in CI/CD
For CI/CD environments like GitHub Actions, there are several useful techniques for dealing with local test files:
- Commit files directly – Add test files directly to your repo. Then GitHub Actions can access them.
- Bundle files in workflow – Upload test files as workflow artifacts that get passed between jobs.
- Generate files dynamically – Have the CI workflow generate files on the fly to avoid committing them.
- Use file server – Run a local file server within CI and access files over HTTP.
- Cache files – Cache local test files between CI runs for faster performance.
Overall, it's best to avoid relying on files that happen to exist on the CI server's local filesystem. Committing files directly keeps things simple in most cases.
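For the file server option, Python's standard library is enough to serve a folder of test files over HTTP during a test run. This is one possible sketch rather than a recommended setup. serve_directory is a hypothetical helper, and it binds to a free port on 127.0.0.1 chosen by the operating system.

```python
import threading
from contextlib import contextmanager
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


@contextmanager
def serve_directory(directory: str):
    """Serve a directory over HTTP in a background thread.

    Yields the base URL, e.g. http://127.0.0.1:<port>, and shuts
    the server down when the with block exits.
    """
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    # Port 0 asks the operating system for any free port
    server = ThreadingHTTPServer(("127.0.0.1", 0), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        yield f"http://127.0.0.1:{server.server_address[1]}"
    finally:
        server.shutdown()
        server.server_close()
        thread.join()


# Usage with an existing Playwright page (folder name is an assumption):
# with serve_directory("tests/fixtures") as base_url:
#     page.goto(f"{base_url}/local-test.html")
```

Serving over HTTP gives the page a regular origin, which sidesteps the cross-origin restrictions that apply to file:// pages.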
Conclusion
Loading local files is a handy trick for creating faster, more reliable tests with Playwright. Mocking responses, prototyping scrapers, and working offline are just a few benefits.
With file:// URLs, Playwright's routing features, and its content manipulation APIs, you have all the tools needed to incorporate local files into your browser automation scripts. Just be sure to use absolute file paths and handle binary data with care.
Following the examples and best practices in this guide will give you expert-level familiarity with local file loading in Playwright. So ditch those remote servers while developing your next web scraping or testing tool!