How to wait for page to load in Playwright?

When scraping dynamic web pages with Playwright and Python, it's crucial to wait until the content you need has loaded before trying to extract data. Here are some effective techniques for waiting for a page to load in Playwright:

Use page.wait_for_load_state()

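The simplest way is to use the page.wait_for_load_state() API. Called with no arguments, it waits until the page reaches the "load" state, meaning the page has fired its load event. Note that page.goto() already waits for this event by default, so the explicit call matters most after an action that triggers a new load: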

await page.goto("https://example.com")
await page.wait_for_load_state() 

You can also wait for the "networkidle" state, which waits until there are no network connections for at least 500 ms:

await page.goto("https://example.com")
await page.wait_for_load_state("networkidle")
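As a fuller sketch, the two calls above can be wrapped in a small helper. The helper name open_and_settle, its default values, and the demo in main() are illustrative assumptions rather than anything Playwright provides; only goto(), wait_for_load_state(), and title() are Playwright page methods.

```python
import asyncio


async def open_and_settle(page, url, state="networkidle", timeout_ms=15000):
    """Navigate to `url`, wait until the page reaches `state`, return its title.

    `page` is expected to be a Playwright async Page. The helper name and
    the default values are illustrative choices, not Playwright defaults.
    """
    # goto() itself waits for the "load" event by default.
    await page.goto(url)
    # Then wait for the load state that was actually requested.
    await page.wait_for_load_state(state, timeout=timeout_ms)
    return await page.title()


async def main():
    # Imported here so the helper above stays usable without this demo.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        print(await open_and_settle(page, "https://example.com"))
        await browser.close()
```

To try the demo, call asyncio.run(main()) in an environment where Playwright and its browsers are installed.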

Wait for a specific selector

Another common technique is to wait for a specific selector to appear on the page. This indicates that a certain part of the page has loaded:

await page.goto("https://example.com")
await page.wait_for_selector("div.loaded")

You may need to inspect the page and find a selector that only appears when the page has fully loaded.
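A minimal sketch of combining the wait with the extraction step is shown here. The helper name text_when_ready and its default timeout are assumptions made for illustration; wait_for_selector() with state="visible" and inner_text() are the Playwright calls involved.

```python
async def text_when_ready(page, selector, timeout_ms=10000):
    """Wait until `selector` is visible, then return its stripped text.

    `page` is expected to be a Playwright async Page; the helper name and
    default timeout are illustrative.
    """
    # state="visible" waits for the element to be attached and displayed.
    element = await page.wait_for_selector(
        selector, state="visible", timeout=timeout_ms
    )
    text = await element.inner_text()
    return text.strip()
```

Inside an async function this could be used as text = await text_when_ready(page, "div.loaded").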

Wait for navigation to finish

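You can also wait for a navigation triggered by an action such as a click. In the Python API this is done with the page.expect_navigation() context manager, which has to be entered before the action so that the navigation cannot be missed:

async with page.expect_navigation():
    await page.click("a.dynamic-page")

Execution resumes once the navigation started by the click has completed.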
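When the destination URL is known in advance, waiting for that URL with page.wait_for_url() is a possible alternative. The sketch below assumes a hypothetical helper name, click_and_wait_for_url, and an illustrative default timeout; the URL pattern is supplied by the caller.

```python
async def click_and_wait_for_url(page, selector, url_pattern, timeout_ms=10000):
    """Click `selector`, then wait until the page URL matches `url_pattern`.

    `page` is expected to be a Playwright async Page; the helper name and
    default timeout are illustrative.
    """
    await page.click(selector)
    # wait_for_url() accepts a glob pattern, a compiled regex, or a predicate.
    await page.wait_for_url(url_pattern, timeout=timeout_ms)
    # In the Python API, `url` is a property rather than a method.
    return page.url
```

For example, await click_and_wait_for_url(page, "a.dynamic-page", "**/dynamic") would wait for a URL ending in /dynamic, where that pattern is a made-up placeholder.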

Set a timeout

It's a good idea to set an explicit timeout when waiting, so your script doesn't stall longer than necessary if the condition is never met (the default timeout is 30 seconds):

await page.wait_for_selector("div.loaded", timeout=10000)

The timeout is given in milliseconds, so this waits up to 10 seconds and raises a TimeoutError if the selector doesn't appear.
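One way to handle the timeout instead of letting it end the script is sketched below. The helper name selector_appeared is hypothetical, and the ImportError fallback is an assumption added only so the function can be imported where Playwright is not installed; with Playwright present, the exception caught is playwright.async_api.TimeoutError.

```python
try:
    from playwright.async_api import TimeoutError as PlaywrightTimeoutError
except ImportError:
    # Fallback so the helper can be imported without Playwright installed.
    PlaywrightTimeoutError = TimeoutError


async def selector_appeared(page, selector, timeout_ms=10000):
    """Return True if `selector` appears within `timeout_ms`, else False."""
    try:
        await page.wait_for_selector(selector, timeout=timeout_ms)
        return True
    except PlaywrightTimeoutError:
        # The wait expired; the caller decides whether to retry or skip.
        return False
```

Returning a boolean keeps the scraping loop in control: a page that never shows the selector can be logged and skipped rather than aborting the whole run.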

Wait between interactions

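To reduce the risk of rate-limiting errors, add a delay between interactions. In async code, use page.wait_for_timeout() or asyncio.sleep() rather than time.sleep(), which would block the event loop:

async with page.expect_navigation():
    await page.click("#submit")
await page.wait_for_timeout(5000)  # wait 5 seconds

This gives the previous action time to finish and spaces out your requests before the next action.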
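A fixed delay can also be swapped for a randomized one. The sketch below is an assumption layered on the idea above, not something Playwright provides: polite_pause is a hypothetical helper that uses asyncio.sleep() with a random duration so the pause does not block the event loop.

```python
import asyncio
import random


async def polite_pause(min_seconds=2.0, max_seconds=5.0):
    """Sleep for a random duration between the two bounds and return it.

    The bounds are illustrative defaults; tune them to the target site.
    """
    delay = random.uniform(min_seconds, max_seconds)
    # asyncio.sleep() yields control, so other tasks keep running.
    await asyncio.sleep(delay)
    return delay
```

Inside a scraping loop this would be called as await polite_pause() between page interactions.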

The key with Playwright is using the right events and selectors to wait for the page state you need. With the above methods, you can reliably wait for a page to load before scraping or interacting with the page.
