├── .gitignore
├── README.md
├── index.css
├── index.html
├── index.js
└── package.json

/.gitignore:
--------------------------------------------------------------------------------
1 | node_modules
2 | yarn.lock
3 | yarn-error.log*
4 | .DS_Store
5 | .idea
6 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # 🕵️ Modern tests to detect automated browser behavior
2 |
3 | The goal of this repo is to provide relevant, up-to-date tests that you can run against your automation software to realistically estimate your chances of success on the modern web.
4 |
5 | There are many pages by different people containing various tests to detect bots. Some of these pages are 5+ years old and target techniques that are not relevant anymore. Some people think that using `puppeteer-extra-plugin-stealth` with all the options on is enough, but unfortunately, many of those options are no longer relevant to the current state of automation and could even hurt your fingerprint and success rate.
6 |
7 | This repo contains tests that detect some really basic signals which are quite easy to check on any website. It's guaranteed that all these tests are used by major anti-bot companies in their products. Moreover, each of these companies has its own proprietary algorithms and ideas on how to test your browser for automation. But 90% of the time, when you get blocked or see a CAPTCHA, it's because of the tests below.
8 |
9 | If you do any kind of browser automation, you should make sure that your setup passes these tests. If it doesn't, you are unlikely to achieve high success rates.
10 |
11 | ⚠️ The recommendation is to take care of all of these tests before you try to find high-quality proxies, adjust your automated behavior, or make any other optimizations to your pipeline. **Passing these tests is crucial.**
12 |
13 | ➡️ You can try all the tests on this page: [https://bot-detector.rebrowser.net/](https://bot-detector.rebrowser.net/)
14 |
15 | *These tests mainly focus on Chromium automated by Puppeteer and Playwright but could also be useful for testing other automation tools.*
16 |
17 | ## How to pass all the tests?
18 | Just follow the tips on the page. Some require extra settings, some require patching your Puppeteer or Playwright with [`rebrowser-patches`](https://github.com/rebrowser/rebrowser-patches).
19 |
20 | ## What are the tests?
21 | Our goal is to keep this list up to date. If you would like to suggest new tests or adjustments, please open a new issue. Any feedback is appreciated.
22 |
23 | ### runtimeEnableLeak
24 | By default, Puppeteer, Playwright, and other automation tools rely on the `Runtime.enable` CDP method to work with execution contexts. Any website can detect it with just a few lines of code.
25 |
26 | You can read more about it in this post: [How to fix Runtime.Enable CDP detection of Puppeteer, Playwright and other automation libraries?](https://rebrowser.net/blog/how-to-fix-runtime-enable-cdp-detection-of-puppeteer-playwright-and-other-automation-libraries-61740)
27 |
28 | Fix: use `rebrowser-patches` to disable `Runtime.enable`.
29 |
30 | ### sourceUrlLeak
31 | Puppeteer automatically adds a unique source URL to every script you run through it. This can be detected by analyzing the error stack.
32 |
33 | Fix: use [`rebrowser-patches`](https://github.com/rebrowser/rebrowser-patches) to set a custom source URL.
34 |
35 | ### mainWorldExecution
36 | Your target website could alter some really popular functions such as `document.querySelector` and track every time you use them in your scripts. It's quite dangerous and will quickly raise a red flag against your browser.
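To make the idea above more concrete, here is a minimal sketch of how a page could wrap a method and record every call to it. The names `trackCalls` and `fakeDocument` are hypothetical and exist only for this illustration; a real page would wrap `document.querySelector` on the actual `document`. This is not code from this repo or from any anti-bot product.

```javascript
// Hypothetical helper: replaces obj[name] with a wrapper that records
// each call in `log` and then delegates to the original function.
function trackCalls(obj, name, log) {
  const original = obj[name]
  obj[name] = function (...args) {
    log.push({ name, args })
    return original.apply(this, args)
  }
}

// Stand-in for the page's `document` so the sketch runs outside a browser.
const fakeDocument = {
  querySelector(selector) {
    return `element for ${selector}`
  },
}

const calls = []
trackCalls(fakeDocument, 'querySelector', calls)

// An automation script calling the wrapped function is now recorded.
const result = fakeDocument.querySelector('#login')
console.log(calls.length, result)
```

The wrapper stays transparent to the caller because it returns whatever the original returns, which is why a script running in the main world would not notice it.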
37 |
38 | Fix: use [`rebrowser-patches`](https://github.com/rebrowser/rebrowser-patches) to run all of your scripts in isolated contexts instead of the main context.
39 |
40 | ### navigatorWebdriver
41 | Good old `navigator.webdriver`. It's Chrome's way to indicate that the browser is being driven by automation software.
42 |
43 | Fix: just use the `--disable-blink-features=AutomationControlled` switch when you launch Chrome.
44 |
45 | ### bypassCsp
46 | Sometimes developers use `page.setBypassCSP(true)` to run their scripts in specific edge cases and avoid Content Security Policy (CSP) limitations. This behavior is unacceptable in any real browser as it's a high security risk.
47 |
48 | Fix: change your code so that you don't need to call this method; basically, avoid breaking CSP.
49 |
50 | ### viewport
51 | By default, Puppeteer uses an 800x600 viewport, and Playwright uses 1280x720.
52 |
53 | It's quite noticeable and easy to detect. Normal users with normal browsers won't have such viewports.
54 |
55 | Fix: use `defaultViewport: null` (Puppeteer) and `viewport: null` (Playwright).
56 |
57 | ### window.dummyFn
58 | The goal is to test that you can access main world objects. If you apply [`rebrowser-patches`](https://github.com/rebrowser/rebrowser-patches), then you cannot easily access the main world as all of your `page.evaluate()` scripts will be executed in an isolated world. To be able to do that, you need to use a special technique (read [How to Access Main Context Objects from Isolated Context in Puppeteer & Playwright](https://rebrowser.net/blog/how-to-access-main-context-objects-from-isolated-context-in-puppeteer-and-playwright-23741) or see the rebrowser-patches repo for details). This test will help you debug it.
59 |
60 | ### useragent
61 | Puppeteer and Playwright use Google Chrome for Testing out of the box. It's a red flag for any anti-bot system.
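As a rough sketch of how such a check can work, the brand list reported by `navigator.userAgentData` can be inspected for a `Chromium` entry that has no matching `Google Chrome` entry, which the test page treats as a hint of Chrome for Testing. The function name `classifyBrands`, the returned labels, and the sample version strings are assumptions made for illustration; in a browser the input would come from `navigator.userAgentData.getHighEntropyValues(['fullVersionList'])`.

```javascript
// Hypothetical helper: classifies a fullVersionList-style array of
// { brand, version } objects. The labels are illustrative only.
function classifyBrands(fullVersionList) {
  const brands = (fullVersionList || []).map(item => item.brand)
  const hasChromium = brands.includes('Chromium')
  const hasChrome = brands.includes('Google Chrome')

  if (!hasChromium && !hasChrome) {
    return 'unknown' // no data, or not a Chromium-based browser
  }
  if (hasChromium && !hasChrome) {
    return 'chromium-only' // possibly a Chrome for Testing build
  }
  return 'google-chrome'
}

// Sample inputs with made-up version numbers.
const stable = classifyBrands([
  { brand: 'Chromium', version: '130.0.0.0' },
  { brand: 'Google Chrome', version: '130.0.0.0' },
  { brand: 'Not?A_Brand', version: '99.0.0.0' },
])
const testing = classifyBrands([
  { brand: 'Chromium', version: '130.0.0.0' },
  { brand: 'Not?A_Brand', version: '99.0.0.0' },
])
const none = classifyBrands(undefined)
console.log(stable, testing, none)
```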
62 |
63 | ### pwInitScripts
64 | Playwright injects `__pwInitScripts` into the global scope of every page by default.
65 |
66 | ### exposeFunctionLeak
67 | It's quite popular to use `page.exposeFunction()` in Puppeteer and Playwright to pass a function from Node.js to the browser environment. However, this method is full of leaks in both of these libraries.
68 |
69 | ## What is Rebrowser?
70 | This package is sponsored and maintained by [Rebrowser](https://rebrowser.net). We allow you to scale your browser automation and web scraping in the cloud with hundreds of unique fingerprints.
71 |
72 | Our cloud browsers have great success rates and come with nice features such as notifications if your library uses `Runtime.enable` during execution or has other red flags that could be improved. [Create an account](https://rebrowser.net) today to get invited to test our bleeding-edge platform and take your automation business to the next level.
73 |
74 | ### Special thanks
75 |
76 | [kaliiiiiiiiii/brotector](https://github.com/kaliiiiiiiiii/brotector)
77 |
--------------------------------------------------------------------------------
/index.css:
--------------------------------------------------------------------------------
1 | :root,::backdrop {
2 | --sans-font: -apple-system,BlinkMacSystemFont,"Avenir Next",Avenir,"Nimbus Sans L",Roboto,"Noto Sans","Segoe UI",Arial,Helvetica,"Helvetica Neue",sans-serif;
3 | --mono-font: Consolas,Menlo,Monaco,"Andale Mono","Ubuntu Mono",monospace;
4 | --standard-border-radius: 5px;
5 | --bg: #fff;
6 | --accent-bg: #f5f7ff;
7 | --text: #212121;
8 | --text-light: #585858;
9 | --border: #898ea4;
10 | --accent: #0d47a1;
11 | --accent-hover: #1266e2;
12 | --accent-text: var(--bg);
13 | --code: #d81b60;
14 | --preformatted: #444;
15 | --marked: #fd3;
16 | --disabled: #efefef
17 | }
18 |
19 | @media (prefers-color-scheme: dark) {
20 | :root,::backdrop {
21 | color-scheme:dark;
22 | --bg: #212121;
23 | --accent-bg: #2b2b2b;
24 | --text: #dcdcdc;
25 | --text-light: #ababab;
26 | --accent: #ffb300;
27 | --accent-hover: #ffe099;
28 | --accent-text: var(--bg);
29 | --code: #f06292;
30 | --preformatted: #ccc;
31 | --disabled: #111
32 | }
33 |
34 | img,video {
35 | opacity: .8
36 | }
37 | }
38 |
39 | html {
40 | font-family: var(--sans-font);
41 | }
42 |
43 | body {
44 | color: var(--text);
45 | background-color: var(--bg);
46 | font-size: 1.15rem;
47 | line-height: 1.5;
48 | margin: 15px 50px;
49 | }
50 |
51 | .text-secondary {
52 | color: var(--text-light);
53 | }
54 |
55 | .text-secondary a {
56 | color: var(--text-light) !important;
57 | }
58 |
59 | a,a:visited {
60 | color: var(--accent)
61 | }
62 |
63 | a:hover {
64 | text-decoration: none
65 | }
66 |
67 | h1 {
68 | font-size: 3rem;
69 | margin-top: 0px;
70 | margin-bottom: 0px;
71 | }
72 |
73 | table {
74 | border-collapse: collapse;
75 | margin: 1.5rem 0;
76 | width: 100%;
77 | }
78 |
79 | td,th {
80 | border: 1px solid var(--border);
81 | text-align: start;
82 | padding: .5rem
83 | }
84 |
85 | th {
86 | background-color: var(--accent-bg);
87 | font-weight: 700;
88 | white-space: nowrap;
89 | }
90 |
91 | pre {
92 | font-family: var(--mono-font);
93 | margin: 10px 0 0 0;
94 | font-size: 80%;
95 | }
96 |
97 | code {
98 | font-family: var(--mono-font);
99 | color: var(--code);
100 | }
101 |
102 | .text-nowrap {
103 | white-space: nowrap;
104 | }
105 |
106 | #detections-json {
107 | width: 100%;
108 | height: 450px;
109 | }
110 |
111 | .dashed {
112 | border-bottom: 1px dashed;
113 | }
114 |
115 | .cursor-pointer {
116 | cursor: pointer;
117 | }
118 |
119 | .d-none {
120 | display: none;
121 | }
122 |
123 | .codeblock {
124 | border-left: 3px solid var(--code);
125 | white-space: pre-line;
126 | padding: 0 1.5em;
127 | }
128 |
129 | p {
130 | margin: .5em 0;
131 | }
132 |
133 | footer {
134 | margin-top: 1em;
135 | font-size: 80%;
136 | color: var(--text-light);
137 | }
138 |
139 | footer a {
140 | color: var(--text-light);
141 | }
142 |
-------------------------------------------------------------------------------- /index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 |
4 | 5 | 6 |These tests are designed for Chromium based browsers only.
16 |To properly trigger all the tests you need to add the code below to your automation script and open this page again.
17 |
18 | /* puppeteer & playwright */
19 | // dummyFn - must be called in the main context
20 | await page.evaluate(() => window.dummyFn())
21 |
22 | // exposeFunctionLeak
23 | await page.exposeFunction('exposedFn', () => { console.log('exposedFn call') })
24 |
25 | // sourceUrlLeak
26 | await page.evaluate(() => document.getElementById('detections-json'))
27 |
28 | // mainWorldExecution - must be called in an isolated context
29 | /* puppeteer */
30 | await page.mainFrame().isolatedRealm().evaluate(() => document.getElementsByClassName('div'))
31 |
32 | /*
33 | playwright - there is no way to explicitly evaluate script in an isolated context
34 | follow rebrowser-patches on github for the fix
35 | */
36 | await page.evaluate(() => document.getElementsByClassName('div'))
37 |
38 |
Test name | 44 |Time since load | 45 |Notes | 46 |
---|
window.dummyFn()
to test if you can access main world objects.',
9 | })
10 |
11 | window.dummyFn = () => {
12 | addDetection({
13 | type: 'dummyFn',
14 | rating: -1,
15 | note: 'window.dummyFn() was called! It means you can interact with main world objects.',
16 | })
17 | return true
18 | }
19 | }
20 |
21 | function runtimeEnableLeakInit() {
22 | const testRuntimeEnableLeak = async () => {
23 | if (window.runtimeEnableLeakVars.stackLookupCount > 0) {
24 | addDetection({
25 | type: 'runtimeEnableLeak',
26 | rating: 1,
27 | note: `
28 | Runtime.enable
. ${usePatchesTip}window.exposedFn
. Use page.exposeFunction
to trigger this test.'
75 | } else if (window.exposedFn.toString()?.includes('This is the Puppeteer binding')) {
76 | detection.rating = 1
77 | detection.note = `
78 | page.exposeFunction
.page.exposeFunction
from your code to avoid this leak.page.exposeFunction
.page.exposeFunction
from your code to avoid this leak.page.exposeFunction
.page.exposeFunction
from your code to avoid this leak.page.exposeFunction
as it creates window.__playwright__binding__
object.page.exposeFunction
from your code to avoid this leak.page.exposeFunction
. It's detected because the exposed function has a property __installed = true
.page.exposeFunction
from your code to avoid this leak.window.__pwInitScripts
object.window.__pwInitScripts
detected.',
166 | })
167 |
168 | testPwInitScripts()
169 | }
170 |
171 | function testNavigatorWebdriver() {
172 | let note
173 | let debug
174 | if (navigator.webdriver === true) {
175 | note = 'navigator.webdriver = true indicates that the browser is automated. Use the --disable-blink-features=AutomationControlled switch for Chrome.'
176 | debug = `typeof navigator.webdriver = ${typeof navigator.webdriver}; navigator.webdriver = ${navigator.webdriver}`
177 | } else if (typeof navigator.webdriver === 'undefined') {
178 | note = 'This property shouldn\'t be undefined. You might have deleted it manually.'
179 | debug = `typeof navigator.webdriver = ${typeof navigator.webdriver}`
180 | } else if (Object.getOwnPropertyNames(navigator).length !== 0) {
181 | note = 'Object.getOwnPropertyNames(navigator) should return an empty array.'
182 | debug = `Object.getOwnPropertyNames(navigator) = ${JSON.stringify(Object.getOwnPropertyNames(navigator))}`
183 | } else if (Object.getOwnPropertyDescriptor(navigator, 'webdriver') !== undefined) {
184 | note = 'Object.getOwnPropertyDescriptor(navigator, \'webdriver\') should return undefined.'
185 | debug = `Object.getOwnPropertyDescriptor(navigator, 'webdriver') = ${JSON.stringify(Object.getOwnPropertyDescriptor(navigator, 'webdriver'))}`
186 | }
187 |
188 | if (note) {
189 | addDetection({
190 | type: 'navigatorWebdriver',
191 | rating: 1,
192 | debug,
193 | note,
194 | })
195 | } else {
196 | addDetection({
197 | type: 'navigatorWebdriver',
198 | rating: -1,
199 | note: 'No webdriver detected.',
200 | })
201 | }
202 | }
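
The checks above can be condensed into a pure function that is easier to reason about outside the browser. This is a sketch, not the page's actual code: `checkWebdriver`, `FakeNavigator`, `AutomatedNavigator`, and the returned reason strings are hypothetical. The fake classes only mimic the fact that, in Chrome, `webdriver` is a getter on the prototype rather than an own property of `navigator`.

```javascript
// Hypothetical helper mirroring the order of the checks above.
// Returns a reason string when something looks off, otherwise null.
function checkWebdriver(nav) {
  if (nav.webdriver === true) return 'webdriver-true'
  if (typeof nav.webdriver === 'undefined') return 'webdriver-missing'
  if (Object.getOwnPropertyNames(nav).length !== 0) return 'own-properties'
  return null
}

// Stand-ins for navigator: the getter lives on the prototype.
class FakeNavigator {
  get webdriver() { return false }
}
class AutomatedNavigator {
  get webdriver() { return true }
}

const clean = checkWebdriver(new FakeNavigator())
const automated = checkWebdriver(new AutomatedNavigator())

// Naive spoofing: redefining the property on the instance hides the
// `true` value but leaves an own property behind, which is detectable.
const spoofed = new AutomatedNavigator()
Object.defineProperty(spoofed, 'webdriver', { get: () => false })
const spoofedResult = checkWebdriver(spoofed)

console.log(clean, automated, spoofedResult)
```

The last case shows why overriding `navigator.webdriver` directly on the object is not a reliable fix.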
203 |
204 | function testViewport() {
205 | let note
206 | const width = Math.max(document.documentElement.clientWidth || 0, window.innerWidth || 0)
207 | const height = Math.max(document.documentElement.clientHeight || 0, window.innerHeight || 0)
208 | if (width === 800 && height === 600) {
209 | note = 'Viewport has default Puppeteer values. Use defaultViewport: null in options.'
210 | } else if (width === 1280 && height === 720) {
211 | note = 'Viewport has default Playwright values. Use viewport: null in options.'
212 | }
213 |
214 | const debug = {
215 | width,
216 | height,
217 | }
218 | if (note) {
219 | addDetection({
220 | type: 'viewport',
221 | rating: 1,
222 | debug,
223 | note,
224 | })
225 | } else {
226 | addDetection({
227 | type: 'viewport',
228 | rating: -1,
229 | debug,
230 | note: 'Viewport is different from default values used in automation libraries.',
231 | })
232 | }
233 | }
234 |
235 | async function testUseragent() {
236 | if (typeof navigator.userAgentData === 'undefined') {
237 | addDetection({
238 | type: 'useragent',
239 | rating: 0,
240 | note: 'Cannot detect Chrome version as navigator.userAgentData is undefined.',
241 | })
242 | return
243 | }
244 |
245 | let rating
246 | let note
247 | const debug = {}
248 | const useragentVersionItems = await navigator.userAgentData.getHighEntropyValues(['fullVersionList']).then(ua => ua.fullVersionList?.filter(item => ['Chromium', 'Google Chrome'].includes(item.brand)) || [])
249 | debug.useragentVersionItems = useragentVersionItems
250 | const useragentVersionItemsBrands = useragentVersionItems.map(item => item.brand)
251 |
252 | if (!useragentVersionItems.length) {
253 | note = 'Cannot detect Chrome version. These tests are designed for Chromium based browsers only.'
254 | rating = .5
255 | } else if (useragentVersionItemsBrands.includes('Chromium') && !useragentVersionItemsBrands.includes('Google Chrome')) {
256 | note = `
257 | navigator.userAgentData
. You might be using Google Chrome for Testing which is a red flag.executablePath
and use Google Chrome (stable channel).Page.setBypassCSP
(Puppeteer) or bypassCSP: true
(Playwright). It\'s invalid behavior for a normal browser.',
344 | })
345 | }
346 | document.head.appendChild(script)
347 | }
348 |
349 | function testMainWorldExecution() {
350 | addDetection({
351 | type: 'mainWorldExecution',
352 | rating: 0,
353 | note: 'Call document.getElementsByClassName(\'div\') to trigger this test. If you did and the test wasn\'t triggered, then you\'re running it in an isolated world, which is safe and not detectable.',
354 | })
355 |
356 | document.getElementsByClassName = (function (original) {
357 | return function () {
358 | addDetection({
359 | type: 'mainWorldExecution',
360 | rating: 1,
361 | note: 'You\'ve called document.getElementsByClassName() in the main world. Use rebrowser-patches to run your scripts in an isolated world.',
362 | debug: {
363 | args: Object.values(arguments),
364 | },
365 | })
366 | return original.apply(this, arguments)
367 | }
368 | }(document.getElementsByClassName))
369 | }
370 |
371 | function testSourceUrl() {
372 | function testSourceUrlError() {
373 | const error = new Error('Detection Error')
374 | let note
375 | const debug = error.stack.toString()
376 | if (error.stack.toString().includes('pptr:')) {
377 | note = 'Error stack contains "pptr:". You\'re using unpatched Puppeteer.'
378 | } else if (error.stack.toString().includes('UtilityScript.')) {
379 | note = 'Error stack contains "UtilityScript.". You\'re using unpatched Playwright.'
380 | }
381 |
382 | if (note) {
383 | addDetection({
384 | type: 'sourceUrlLeak',
385 | rating: 1,
386 | note: `${note} ${usePatchesTip}`,
387 | debug,
388 | })
389 | } else {
390 | addDetection({
391 | type: 'sourceUrlLeak',
392 | rating: -1,
393 | note: 'Error stack doesn\'t contain anything suspicious.',
394 | debug,
395 | })
396 | }
397 | }
398 |
399 | document.getElementById = (function (original) {
400 | return function () {
401 | testSourceUrlError()
402 | return original.apply(this, arguments)
403 | }
404 | })(document.getElementById)
405 |
406 | addDetection({
407 | type: 'sourceUrlLeak',
408 | rating: 0,
409 | note: 'Call document.getElementById(\'detections-json\') to test sourceUrl leak.',
410 | })
411 | }
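
The stack inspection used by this test can be sketched as a small pure function. `classifyStack` and the sample stack strings below are made up for illustration; only the `pptr:` and `UtilityScript.` markers come from the code above.

```javascript
// Hypothetical helper: looks for library-specific markers in a stack trace.
function classifyStack(stack) {
  const text = String(stack)
  if (text.includes('pptr:')) return 'puppeteer'
  if (text.includes('UtilityScript.')) return 'playwright'
  return null
}

// Made-up stack traces, shaped roughly like what a page might observe.
const samples = {
  puppeteer: 'Error: Detection Error\n    at evaluate (pptr:evaluate;file.js:1:1)',
  playwright: 'Error: Detection Error\n    at UtilityScript.evaluate (<anonymous>:1:1)',
  clean: 'Error: Detection Error\n    at <anonymous>:1:1',
}

const results = Object.fromEntries(
  Object.entries(samples).map(([name, stack]) => [name, classifyStack(stack)])
)
console.log(results)
```

In the page itself, the stack comes from an `Error` created inside the wrapped `document.getElementById`, so the trace includes the frame of whichever script made the call.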
412 |
413 | function addDetection(data) {
414 | if (data.rating === undefined) {
415 | data.rating = 1
416 | }
417 |
418 | const existingDetection = detections.find(d => d.type === data.type)
419 | if (existingDetection !== undefined) {
420 | if (data.once) {
421 | return
422 | }
423 |
424 | if (data.rating === existingDetection.rating && data.note === existingDetection.note) {
425 | // no changes, ignore
426 | return
427 | }
428 | }
429 |
430 | console.log('addDetection', data)
431 |
432 | data.msSinceLoad = parseFloat((window.performance.now() - window.startTime).toFixed(3))
433 | if (data.replace === false) {
434 | window.detections.push(data)
435 | } else {
436 | const existingIndex = window.detections.findIndex(d => d.type === data.type)
437 | if (existingIndex === -1) {
438 | window.detections.push(data)
439 | } else {
440 | window.detections[existingIndex] = data
441 | }
442 | }
443 |
444 | renderDetections()
445 | }
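
The replace-or-append behavior of `addDetection` can be isolated into a small sketch. `upsertByType` is a hypothetical name, and this version deliberately leaves out rendering, timing, and the `once` and `replace` flags.

```javascript
// Hypothetical helper: keeps at most one entry per `type`,
// replacing an existing entry in place or appending a new one.
function upsertByType(list, entry) {
  const index = list.findIndex(item => item.type === entry.type)
  if (index === -1) {
    list.push(entry)
  } else {
    list[index] = entry
  }
  return list
}

const detections = []
upsertByType(detections, { type: 'viewport', rating: 0 })
upsertByType(detections, { type: 'useragent', rating: -1 })
upsertByType(detections, { type: 'viewport', rating: 1 }) // replaces the first entry
console.log(detections)
```

Replacing in place keeps each test's row at a stable position in the table while its rating changes from "not triggered" to a final result.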
446 |
447 | function renderDetections() {
448 | const tbody = document.createElement('tbody')
449 |
450 | const addRow = (cols) => {
451 | const row = tbody.insertRow(-1)
452 | for (const colNum in cols) {
453 | const cell = row.insertCell(colNum)
454 | cell.innerHTML = cols[colNum]
455 | row.appendChild(cell)
456 | }
457 | }
458 |
459 | for (const detection of window.detections) {
460 | let ratingIcon = '🔴'
461 | if (detection.rating === 0.5) {
462 | ratingIcon = '🟡'
463 | } else if (detection.rating === 0) {
464 | ratingIcon = '⚪️'
465 | } else if (detection.rating < 0) {
466 | ratingIcon = '🟢'
467 | }
468 |
469 | addRow([
470 | `${ratingIcon} ${detection.type}`,
471 | `${detection.msSinceLoad} ms`,
472 | `
473 | ${detection.note}
474 | ${!detection.debug ? '' : `
475 | ${typeof detection.debug === 'string' ? detection.debug : JSON.stringify(detection.debug, null, 2)}
476 | `}
477 | `,
478 | ])
479 | }
480 |
481 | document.querySelector('#detections-table tbody').replaceWith(tbody)
482 | document.querySelector('#detections-json').value = JSON.stringify(detections, null, 2)
483 | }
484 |
485 | function initTests() {
486 | window.detections = []
487 | window.startTime = window.performance.now()
488 | dummyFnInit()
489 | testSourceUrl()
490 | testMainWorldExecution()
491 | runtimeEnableLeakInit()
492 | testExposeFunctionLeakInit()
493 | testNavigatorWebdriver()
494 | testCsp()
495 | testViewport()
496 | testUseragent()
497 | testPwInitScriptsInit()
498 | }
499 |
500 | function toggleHowTo() {
501 | document.querySelector('#how-to-run-test').classList.toggle('d-none')
502 | }
503 |
504 | async function main() {
505 | initTests()
506 | }
507 |
508 | window.onload = main
509 |
--------------------------------------------------------------------------------
/package.json:
--------------------------------------------------------------------------------
1 | {
2 | "name": "rebrowser-bot-detector",
3 | "version": "1.0.0",
4 | "description": "Modern tests to detect automated browser behavior. Cover most important leaks from Puppeteer and Playwright.",
5 | "keywords": [
6 | "automation",
7 | "bot",
8 | "bot-detection",
9 | "crawler",
10 | "crawling",
11 | "chromedriver",
12 | "webdriver",
13 | "headless",
14 | "headless-chrome",
15 | "stealth",
16 | "captcha",
17 | "scraping",
18 | "web-scraping",
19 | "cloudflare",
20 | "datadome",
21 | "puppeteer",
22 | "puppeteer-extra",
23 | "selenium",
24 | "rebrowser",
25 | "rebrowser-patches",
26 | "playwright"
27 | ],
28 | "author": {
29 | "name": "Rebrowser",
30 | "email": "info@rebrowser.net",
31 | "url": "https://rebrowser.net"
32 | },
33 | "contributors": [
34 | "Nick Webson