Are you looking for a way to bypass Cloudflare detection when coding your bot in Python? Then you are on the right page. The article below provides a guide on how to bypass Cloudflare using Python and Selenium.
Web bots, including web scrapers, have advanced a lot over the years. And let's face it: websites are also becoming smarter at detecting bot traffic. One of the game changers that makes bot developers sweat is Cloudflare's anti-bot system.
It acts as a middleware, or proxy, between web servers and client software. When you send a web request, Cloudflare checks that it is legitimate and not spam before letting it through to your target website.
Regular Internet users experience a little delay and get the “checking your browser before accessing ….” message on the screen. But this will eventually pass. However, if you are using a bot, you will most likely not be allowed access. Some developers think using a browser automator like Selenium would do the magic for them too.
Unfortunately, Cloudflare is built to detect that too. So what do you do, and how do you bypass Cloudflare detection as a bot developer using Python and Selenium? This article shows you how.
An Overview of Selenium
Selenium WebDriver is a browser automator. What you do with it is up to you. Some use it for site testing, others for botting and scraping. It is a versatile tool, as you can use it from several popular programming languages such as Python, Java, and JavaScript (Node.js).
However, Selenium is only effective against websites with basic anti-spam systems. With the help of proxies, clearing cookies, setting random delays, and a few other methods, you are able to evade detection and block.
But when a website is protected by anti-spam systems like Cloudflare and Akamai, Selenium becomes ineffective. This is because a default Selenium session exposes telltale signals that anti-spam systems look for when detecting bots.
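The random delays mentioned above can be sketched as a small helper. This is only an illustration, not part of Selenium: the function name `random_delay`, its default bounds, and the injectable `sleep` parameter are all assumptions made for this example.

```python
import random
import time


def random_delay(min_seconds=1.0, max_seconds=4.0, sleep=time.sleep):
    """Pause for a random duration in [min_seconds, max_seconds] and return it.

    The name, the default bounds, and the injectable sleep function are
    illustrative choices, not part of any library.
    """
    if min_seconds < 0 or max_seconds < min_seconds:
        raise ValueError("expected 0 <= min_seconds <= max_seconds")
    delay = random.uniform(min_seconds, max_seconds)
    sleep(delay)
    return delay
```

Calling `random_delay()` between page loads spaces requests out unevenly, so they do not arrive at a fixed, machine-like interval.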
How to Bypass Cloudflare Using Selenium and Python
With the right steps, Selenium can bypass Cloudflare easily. Before going into that, let's take a look at how well Cloudflare works in detecting bots coded with Python and Selenium. To do this, we will code a simple bot that tries to access rayobyte.com.
Rayobyte is a proxy provider protected by Cloudflare. If you try accessing it with a browser, your browser needs to be checked before you will be granted access. We’ll use this to test out how effective Cloudflare is and then code another bot that will incorporate measures to bypass it.
Step 1: Install Necessary Tools
For you to code a bot in Python, you need to have Python installed. For this project, you also need to have Selenium installed.
Many systems ship with Python already installed. On older systems, however, the preinstalled version may be Python 2, which is kept only for legacy reasons. You need Python 3 to use current Selenium releases. Visit the official Python download page to install it; it is available for Windows, macOS, and Linux. To verify that the installation succeeded, run “python --version” in your command prompt (on some systems the command is “python3 --version”).
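As an alternative check from inside the interpreter itself, here is a minimal sketch that reports the running version. The helper name `is_python3` is made up for this example.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True when the interpreter's major version is 3 or newer."""
    return version_info[0] >= 3


# Show the full version string and whether it is recent enough for this guide.
print(sys.version)
print("Python 3 available:", is_python3())
```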
Selenium is a complete botting tool. It automates the browser, allowing you to access pages, click buttons, scroll, and even fill forms and carry out any action you can carry out manually. Selenium is a third-party tool.
For it to work, you need to install it, then download the driver for the browser you want to automate and place it on your system PATH.
For this guide, we will be using Chrome since it is the most popular browser out there.
To install Selenium, run the
“pip install selenium”
command in the command prompt. Once the installation is complete, visit the download page for ChromeDriver. Check the version of Chrome you have and download the driver built for that version. A driver for a different version will not work.
Once downloaded, unzip the contents into a folder. That folder will be the working folder for this project.
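The version check described above can be sketched as a simple comparison of version strings. This assumes that matching the major version number is the rule that matters, which is a simplification. The helper names and the version strings in the usage note are illustrative only.

```python
def major_version(version):
    """Return the leading (major) number of a dotted version string."""
    return int(version.strip().split(".")[0])


def driver_matches_browser(browser_version, driver_version):
    """Return True when the browser and driver share a major version.

    Assumption for this sketch: a major-version match is sufficient.
    """
    return major_version(browser_version) == major_version(driver_version)
```

For example, `driver_matches_browser("114.0.5735.199", "114.0.5735.90")` is true, while pairing a 115.x browser with a 114.x driver is not.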
Step 2: Send Request to Website without Bypass Trick
Our target website is rayobyte.com. We’ll code a script that will send a request to this website and see the response we get.
Below is the code. It is quite simple. The one that requires more lines of code is actually the one with the code to bypass Cloudflare.
Use the code below in your favourite Python IDE. In my case, I am using PyCharm, which I consider the best IDE on the market, though that is debatable. Its main drawback is that its full feature set requires a paid license.
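# Import the Chrome driver class from Selenium
from selenium.webdriver import Chrome

# Launch Chrome and request the Cloudflare-protected homepage
browser = Chrome()
browser.get("https://rayobyte.com")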
The code above will launch the Chrome browser on your system and will try to access the Rayobyte website homepage.
However, instead of loading the page, the browser keeps looping and stays on the Cloudflare verification page. Below is a screenshot of what the page looks like.
If you check the code closely, you will see that I didn’t close the browser. This was to let me see whether the page would eventually load. That will never happen, so you should close the automated browser yourself, either manually or by calling browser.quit().
As you have already seen, Selenium on its own can’t be used to bypass Cloudflare. If you want to bypass Cloudflare, you must make use of some tricks. The next step will show you how to use the same Selenium and Python to bypass Cloudflare.
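One way to make sure the automated browser always closes is to wrap the visit in `try`/`finally`. The sketch below is a hedged illustration: `visit_and_close` and its `driver_factory` parameter are names invented here, and the Selenium import sits inside `main()` so the helper can be read and tested without launching a browser.

```python
def visit_and_close(driver_factory, url):
    """Open url with a freshly created driver and always quit it afterwards."""
    driver = driver_factory()
    try:
        driver.get(url)
        return driver.title
    finally:
        # quit() ends the WebDriver session and closes the windows it opened.
        driver.quit()


def main():
    # Imported here so the helper above does not require Selenium to load.
    from selenium.webdriver import Chrome

    print(visit_and_close(Chrome, "https://rayobyte.com"))
```

Running `main()` opens Chrome, prints whatever title the page reports, and closes the browser even if the request raises an error.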
Step 3: Using Plugin to Bypass Cloudflare
As you can see from the above, Cloudflare detects Selenium scripts as bots. Using proxies will not help you in this case. You need other methods.
The best way to bypass Cloudflare with Selenium and Python is to use a library known as Undetected ChromeDriver. You can install it by running the
“pip install undetected-chromedriver”
command in the command prompt. This library only works if you want to drive/automate Chrome. Currently, there are no options for automating other browsers. Once you have the library installed, all you have to do is replace the default browser class in Selenium with the one it provides, and you are good to go.
Below is a code snippet showing you how to correctly use the Undetected ChromeDriver to bypass Cloudflare.
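import undetected_chromedriver as uc
from selenium.webdriver.support.ui import WebDriverWait

# Drop-in replacement for Selenium's Chrome class
driver = uc.Chrome(use_subprocess=True)
# Explicit wait (up to 20 seconds) for later element lookups
wait = WebDriverWait(driver, 20)
driver.get("https://rayobyte.com")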
By just using the undetected-chromedriver library, you will see that you are able to evade detection by Cloudflare. This will enable you to automate your tasks or even scrape the web for data with no issues.
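To confirm from code that the verification page has actually cleared, one option is to wait until the page no longer looks like a challenge. The sketch below is a heuristic under stated assumptions: the marker strings are guesses based on the "checking your browser" message mentioned earlier and on a commonly seen interstitial title, and the names `CHALLENGE_MARKERS` and `looks_like_challenge` are invented for this example. Cloudflare may change its wording at any time.

```python
# Assumed marker strings for the verification page; they are not guaranteed
# to match what Cloudflare currently serves.
CHALLENGE_MARKERS = ("just a moment", "checking your browser")


def looks_like_challenge(title, page_source=""):
    """Heuristically decide whether a page is still the verification screen."""
    text = f"{title}\n{page_source}".lower()
    return any(marker in text for marker in CHALLENGE_MARKERS)


def main():
    # Imported here so the helper above can be used without a browser.
    import undetected_chromedriver as uc
    from selenium.webdriver.support.ui import WebDriverWait

    driver = uc.Chrome(use_subprocess=True)
    try:
        driver.get("https://rayobyte.com")
        # Poll for up to 20 seconds; raises TimeoutException if it never clears.
        WebDriverWait(driver, 20).until(
            lambda d: not looks_like_challenge(d.title, d.page_source)
        )
        print(driver.title)
    finally:
        driver.quit()
```

Because the check is plain substring matching, a real page that happens to contain one of the marker phrases would be misread as a challenge, so treat the result as a hint rather than proof.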
Place of Proxies for Bypassing Cloudflare
If you look at the code above, you will see that proxies were not used anywhere in the process. You may now be wondering whether you need proxies to bypass Cloudflare at all. We did not need one here for a simple reason: we only sent one request. If you only need to send a few requests, you do not need a proxy.
However, if you will be sending many requests as most bots do, then you need to make use of proxies.
This is because, as with most anti-bot systems, IP tracking is still one of the major parts of the Cloudflare service. When it gets too many requests from the same IP, it does not matter whether there is a bot footprint or not; that IP will be regarded as suspicious, and further requests from it will be blocked.
We recommend you make use of residential proxies to bypass Cloudflare. Some of the best proxy providers for these include Bright Data, Smartproxy, and Soax. If you don’t need to maintain sessions, using rotating proxies from these providers is the best for bypassing Cloudflare.
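As a rough sketch of how proxies could be combined with the undetected-chromedriver approach, the example below cycles through a list of proxies on the client side and passes each one to Chrome through its `--proxy-server` command-line switch. The proxy addresses are hypothetical placeholders, `proxy_arguments` is a name made up for this example, and a provider's own rotating gateway may make client-side rotation unnecessary.

```python
from itertools import cycle

# Hypothetical proxy endpoints; replace with addresses from your provider.
PROXIES = [
    "http://proxy1.example.com:8000",
    "http://proxy2.example.com:8000",
]


def proxy_arguments(proxies):
    """Return an endless round-robin iterator of --proxy-server arguments."""
    if not proxies:
        raise ValueError("at least one proxy is required")
    return (f"--proxy-server={proxy}" for proxy in cycle(proxies))


def main():
    # Imported here so the helper above can be used without a browser.
    import undetected_chromedriver as uc

    arguments = proxy_arguments(PROXIES)
    for _ in range(3):
        # A fresh options object is built for every browser instance.
        options = uc.ChromeOptions()
        options.add_argument(next(arguments))
        driver = uc.Chrome(options=options, use_subprocess=True)
        try:
            driver.get("https://rayobyte.com")
            print(driver.title)
        finally:
            driver.quit()
```

Each browser session started in `main()` goes out through the next proxy in the list, so consecutive requests do not all share one IP address.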
Q. Do Proxies Protect Against Cloudflare Blockage?
No, they do not. You might see some proxies market themselves as proxies to bypass Cloudflare. The reality on the ground is the contrary. Proxies alone will not protect you against Cloudflare blockage. You need to make use of tools that can mimic regular user browsers, and that is where using the undetected ChromeDriver comes in.
Proxies are required when you need to send many requests through Cloudflare, because not using them risks tripping its IP tracking and blocking system. Proxies are needed in many cases, but they are not the only tool you need to evade Cloudflare detection.
Q. How Effective is Cloudflare at Preventing Bots?
If your target website has Cloudflare protecting it, then you really need to be worried. This is because Cloudflare has some interesting numbers that could scare you away before you even try to bypass it.
According to available data, websites protected by Cloudflare see a 65 percent reduction in requests compared with before they set up Cloudflare. That is some great work Cloudflare is doing right there. It also helps websites load faster and use less bandwidth. However, with the right techniques and tools, you can still bypass it and make it less effective.
Q. Is It Legal to Bypass Cloudflare Detection?
Anti-bot systems like Cloudflare protect websites from DDoS attacks and other forms of spam. Bypassing them is not in itself illegal, even though websites configure them to protect their systems and databases.
However, what you do after bypassing them could put you in legal trouble. If you only automate your tasks and do not harm the website by overwhelming it with requests, then you are still within the law.
We are not competent legal advisers, and as such, we recommend you seek legal advice from competent practitioners. Nothing you read here should be seen or taken as legal advice.
As a bot developer, anti-bot systems like Cloudflare are some of the nightmares you are going to be dealing with, as they can frustrate you. This is especially true if you do not have experience bypassing them.
As a beginner, you might think using Selenium is an easy way out for you since Selenium automates browsers and renders JS.
However, Cloudflare seems to have caught on to Selenium-based bots. With the help of the Undetected ChromeDriver library described in this article, you should be able to bypass it with no issues.