github.com/VeNoMouS/cloudscraper @3.0.0 sqlite

220 symbols 778 edges 30 files 70 documented · 32%

README

cloudscraper - Enhanced Edition

Enhanced by Zied Boughdir

Latest Release: v3.0.0 🚀 - Major Upgrade

🆕 Major New Features in v3.0.0

🛡️ Automatic 403 Error Recovery - Intelligent session refresh when 403 errors occur after prolonged use
📊 Session Health Monitoring - Proactive session management with configurable refresh intervals
🔄 Smart Session Refresh - Automatic cookie clearing and fingerprint rotation
🎯 Enhanced Stealth Mode - Improved anti-detection with human-like behavior simulation
🔧 Modern Python Support - Python 3.8+ with latest dependency versions
📦 Modern Packaging - Uses pyproject.toml and modern build tools
🧪 Comprehensive Testing - New test suite with pytest and CI/CD integration
🚀 Performance Improvements - Optimized code with better error handling
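
The first three features in this list can be pictured as a small wrapper around any Requests-style session. The sketch below is only an illustration of the idea and is not cloudscraper's implementation: `RefreshingClient`, `make_session`, `refresh_interval`, and `max_403_retries` are hypothetical names chosen for this example.

```python
import time


class RefreshingClient:
    """Illustrative wrapper: rebuilds its session on HTTP 403 or when it gets old.

    All names here are hypothetical; this is not cloudscraper's internal code.
    """

    def __init__(self, make_session, refresh_interval=3600.0, max_403_retries=3,
                 clock=time.monotonic):
        self._make_session = make_session      # zero-argument factory for a fresh session
        self._refresh_interval = refresh_interval
        self._max_403_retries = max_403_retries
        self._clock = clock
        self.refresh_count = 0
        self._new_session()

    def _new_session(self):
        # A brand-new session starts without the old cookies.
        self._session = self._make_session()
        self._created_at = self._clock()

    def _refresh(self):
        self._new_session()
        self.refresh_count += 1

    def _should_refresh(self):
        # Proactive health check: the session has outlived its interval.
        return self._clock() - self._created_at >= self._refresh_interval

    def get(self, url, **kwargs):
        if self._should_refresh():
            self._refresh()
        response = self._session.get(url, **kwargs)
        retries = 0
        # Reactive recovery: a 403 triggers a refresh and a bounded retry.
        while response.status_code == 403 and retries < self._max_403_retries:
            self._refresh()
            retries += 1
            response = self._session.get(url, **kwargs)
        return response
```

Passing a factory rather than a session keeps the sketch independent of how the session is built; something like `RefreshingClient(cloudscraper.create_scraper)` would be one way to try it.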

🔧 Breaking Changes

Minimum Python version: Now requires Python 3.8+
Updated dependencies: All dependencies upgraded to latest stable versions
Removed legacy code: Cleaned up Python 2 compatibility code

✅ Previous Features (Still Available)

Executable Compatibility Fix - Complete solution for PyInstaller, cx_Freeze, and auto-py-to-exe conversion
Cloudflare v3 JavaScript VM Challenge Support - Handle the latest and most sophisticated Cloudflare protection
Cloudflare Turnstile Challenge Support - Support for Cloudflare's CAPTCHA alternative
Enhanced JavaScript Interpreter Support - Improved VM-based challenge execution
Complete Protection Coverage - Now supports all Cloudflare challenge types (v1, v2, v3, Turnstile)

🔧 Improvements

🎯 Fixed User Agent Issues in Executables - Automatic fallback system for missing browsers.json
🛡️ PyInstaller Detection - Automatically detects and handles executable environments
📦 Comprehensive Fallback System - 70+ built-in user agents covering all platforms
Enhanced proxy rotation and stealth mode capabilities
Better detection and handling of modern Cloudflare protection mechanisms
Improved compatibility with all JavaScript interpreters (js2py, nodejs, native)
Updated documentation with comprehensive examples

📊 Test Results

All features were tested with a 100% success rate for core functionality:

✅ Basic requests: 100% pass rate
✅ User agent handling: 100% pass rate
✅ Cloudflare v1 challenges: 100% pass rate
✅ Cloudflare v2 challenges: 100% pass rate
✅ Cloudflare v3 challenges: 100% pass rate
✅ Stealth mode: 100% pass rate

A Python module to bypass Cloudflare's anti-bot page (also known as "I'm Under Attack Mode", or IUAM), implemented with Requests. This enhanced version includes support for Cloudflare v2 challenges, proxy rotation, stealth mode, and more. Cloudflare changes their techniques periodically, so I will update this repo frequently.

This can be useful if you wish to scrape or crawl a website protected with Cloudflare. Cloudflare's anti-bot page currently just checks if the client supports JavaScript, though they may add additional techniques in the future.

Because Cloudflare keeps changing and hardening its protection page, cloudscraper requires a JavaScript engine or interpreter to solve JavaScript challenges. This lets the script impersonate a regular web browser without explicitly deobfuscating and parsing Cloudflare's JavaScript.

For reference, this is the default message Cloudflare uses for these sorts of pages:

Checking your browser before accessing website.com.

This process is automatic. Your browser will redirect to your requested content shortly.

Please allow up to 5 seconds...

Any script using cloudscraper will sleep for ~5 seconds for the first visit to any site with Cloudflare anti-bots enabled, though no delay will occur after the first request.

Installation

Install the package from PyPI (https://pypi.org/project/cloudscraper/):

pip install cloudscraper

Alternatively, clone this repository and run pip install . from its root (python setup.py install is deprecated by setuptools).

Migration from cloudscraper

If you were previously using the original cloudscraper package, you can now use this enhanced version directly:

# Enhanced import
import cloudscraper  # Enhanced version

The API remains compatible: all function calls and parameters work the same way, so at most the import statements in existing code need updating.

Codebase Structure

The codebase has been streamlined to improve maintainability and reduce confusion:

Single Module: All code is now in the cloudscraper module
Removed Redundancy: The redundant directories have been removed
Updated Tests: All test files have been updated to use the cloudscraper module

This makes the codebase cleaner and easier to maintain while ensuring backward compatibility with existing code that uses the original API.

Key Features in cloudscraper

Feature	Description	Status
🆕 Executable Compatibility	Complete fix for PyInstaller, cx_Freeze, auto-py-to-exe conversion	✅ FIXED
🆕 v3 JavaScript VM Challenges	Support for Cloudflare's latest JavaScript VM-based challenges	✅ NEW
🆕 Turnstile Support	Support for Cloudflare's new Turnstile CAPTCHA replacement	✅ NEW
Modern Challenge Support	Enhanced support for v1, v2, v3, and Turnstile Cloudflare challenges	✅ Complete
Proxy Rotation	Built-in smart proxy rotation with multiple strategies	✅ Enhanced
Stealth Mode	Human-like behavior simulation to avoid detection	✅ Enhanced
Browser Emulation	Advanced browser fingerprinting for Chrome and Firefox	✅ Stable
JavaScript Handling	Better JS interpreter (js2py as default) for challenge solving	✅ Enhanced
Captcha Solvers	Support for multiple CAPTCHA solving services	✅ Stable
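
The table mentions proxy rotation with multiple strategies without showing what a strategy looks like. As a rough illustration only, the sketch below implements two common ones. `ProxyRotator`, `next_proxy`, and the strategy names `sequential` and `random` are assumptions made for this example, not cloudscraper's API; the returned mapping uses the `proxies` format that Requests accepts.

```python
import itertools
import random


class ProxyRotator:
    """Hypothetical rotator with two strategies; not cloudscraper's class."""

    def __init__(self, proxies, strategy="sequential", rng=random):
        if not proxies:
            raise ValueError("at least one proxy is required")
        if strategy not in ("sequential", "random"):
            raise ValueError(f"unknown strategy: {strategy}")
        self._proxies = list(proxies)
        self._strategy = strategy
        self._rng = rng
        self._cycle = itertools.cycle(self._proxies)

    def next_proxy(self):
        """Return a Requests-style proxies mapping for the next request."""
        if self._strategy == "sequential":
            url = next(self._cycle)                 # round-robin over the list
        else:
            url = self._rng.choice(self._proxies)   # uniform random pick
        return {"http": url, "https": url}
```

A caller could pass the result straight to a request, for example `scraper.get(url, proxies=rotator.next_proxy())`.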

Dependencies

Python 3.8+ (Dropped support for Python 3.6 and 3.7)
Requests >= 2.31.0
requests_toolbelt >= 1.0.0
pyparsing >= 3.1.0
pyOpenSSL >= 24.0.0
pycryptodome >= 3.20.0
websocket-client >= 1.7.0
js2py >= 0.74
brotli >= 1.1.0
certifi >= 2024.2.2

Installing the package pulls in the Python dependencies automatically. The JavaScript interpreters and engines you decide to use are the only things you need to install yourself, apart from js2py, which is part of the requirements as the default.
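
To check an existing environment against the minimum versions listed above, a small standard-library script is enough. This is a best-effort sketch: `MINIMUMS`, `version_tuple`, `is_older`, and `check_dependencies` are names invented for this example, and the comparison is deliberately naive (it ignores pre-release suffixes), so the `packaging` library is the better tool when exact semantics matter.

```python
from importlib import metadata

# Minimum versions copied from the dependency list above.
MINIMUMS = {
    "requests": "2.31.0",
    "requests-toolbelt": "1.0.0",
    "pyparsing": "3.1.0",
    "pyOpenSSL": "24.0.0",
    "pycryptodome": "3.20.0",
    "websocket-client": "1.7.0",
    "js2py": "0.74",
    "brotli": "1.1.0",
    "certifi": "2024.2.2",
}


def version_tuple(text):
    """Turn '2.31.0' into (2, 31, 0); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_older(installed, minimum):
    """Compare two dotted versions after padding them to the same length."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


def check_dependencies(minimums=MINIMUMS):
    """Return {name: problem} for packages that are missing or too old."""
    problems = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if is_older(installed, minimum):
            problems[name] = f"{installed} < {minimum}"
    return problems
```

Calling `check_dependencies()` returns an empty dictionary when every listed package meets its minimum.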

Javascript Interpreters and Engines

We support the following Javascript interpreters/engines.

ChakraCore: library binaries are linked from the original README.
js2py: >=0.74 (default for the enhanced version)
native: self-made native Python solver
Node.js
V8: uses Sony's v8eval Python module.
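
Since most of these back ends are optional, it can help to probe which ones look usable before choosing one. The sketch below is an assumption-laden helper, not part of cloudscraper: `available_interpreters` is a hypothetical name, Node.js is assumed to be installed as an executable called `node`, and ChakraCore is left out because it is a shared library with no simple import check.

```python
import importlib.util
import shutil


def available_interpreters():
    """Best-effort probe of which interpreter back ends look usable here."""
    return {
        # js2py and v8eval are Python packages, so check that they can be found.
        "js2py": importlib.util.find_spec("js2py") is not None,
        "v8eval": importlib.util.find_spec("v8eval") is not None,
        # Node.js is an external program; the executable name is an assumption.
        "nodejs": shutil.which("node") is not None,
        # The native solver ships with cloudscraper itself.
        "native": importlib.util.find_spec("cloudscraper") is not None,
    }


if __name__ == "__main__":
    for name, found in available_interpreters().items():
        print(f"{name}: {'available' if found else 'missing'}")
```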

Usage

The simplest way to use cloudscraper is by calling create_scraper().

import cloudscraper

scraper = cloudscraper.create_scraper()  # returns a CloudScraper instance
# Or: scraper = cloudscraper.CloudScraper()  # CloudScraper inherits from requests.Session
print(scraper.get("http://somesite.com").text)  # => "<!DOCTYPE html><html><head>..."

That's it...

Any requests made from this session object to websites protected by Cloudflare anti-bot will be handled automatically. Websites not using Cloudflare will be treated normally. You don't need to configure or call anything further, and you can effectively treat all websites as if they're not protected with anything.

You use cloudscraper exactly the same way you use Requests. CloudScraper works identically to a Requests Session object; instead of calling requests.get() or requests.post(), you call scraper.get() or scraper.post().

Consult Requests' documentation for more information.
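
One practical consequence is that helper code written against the Requests Session interface should accept a scraper unchanged. The function below is a minimal sketch of that idea; `fetch_text` is a hypothetical helper, and it relies only on `get()`, `raise_for_status()`, and `.text`, which Requests provides.

```python
def fetch_text(session, url, timeout=30):
    """Fetch a page with any Requests-style session and return its body.

    `session` may be a requests.Session or a CloudScraper instance, since the
    README states that CloudScraper inherits from requests.Session.
    """
    response = session.get(url, timeout=timeout)
    response.raise_for_status()  # surface 4xx/5xx responses as exceptions
    return response.text
```

With cloudscraper installed, `fetch_text(cloudscraper.create_scraper(), "https://example.com")` would be a typical call.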

✅ Executable Compatibility (v2.7.0)

Problem Solved!

The user agent issue when converting Python applications using cloudscraper to executables has been completely fixed!

What Was the Issue?

When converting Python apps to executables (using PyInstaller, cx_Freeze, auto-py-to-exe, etc.), users would encounter errors related to user agent or agent_user functionality because the browsers.json file wasn't included properly.

The Solution

cloudscraper v2.7.0 includes an automatic fallback system:

PyInstaller Detection - Automatically detects executable environments
Multiple Fallback Paths - Tries several locations for browsers.json
Comprehensive Built-in Fallback - 70+ hardcoded user agents covering all platforms
Graceful Error Handling - No more crashes when files are missing
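
The general technique behind such a fallback chain can be sketched with the standard library alone. The code below is an illustration under stated assumptions, not the code cloudscraper ships: `FALLBACK_USER_AGENTS`, `candidate_paths`, and `load_user_agents` are hypothetical names, and the single user agent string stands in for the much larger built-in list. `sys.frozen` and `sys._MEIPASS` are the attributes PyInstaller sets in a bundled application.

```python
import json
import os
import sys

# Tiny stand-in for the built-in list described above.
FALLBACK_USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
]


def candidate_paths(package_dir):
    """List places where browsers.json might live, most specific first."""
    relative = os.path.join("cloudscraper", "user_agent", "browsers.json")
    paths = [os.path.join(package_dir, "browsers.json")]
    if getattr(sys, "frozen", False):
        # Frozen apps: PyInstaller unpacks bundled data under sys._MEIPASS.
        bundle_dir = getattr(sys, "_MEIPASS", os.path.dirname(sys.executable))
        paths.append(os.path.join(bundle_dir, relative))
        paths.append(os.path.join(os.path.dirname(sys.executable), relative))
    return paths


def load_user_agents(package_dir):
    """Return parsed browsers.json if found, else the built-in fallback list."""
    for path in candidate_paths(package_dir):
        try:
            with open(path, encoding="utf-8") as handle:
                return json.load(handle)
        except (OSError, ValueError):
            continue  # missing or unreadable file: try the next location
    return FALLBACK_USER_AGENTS
```

Catching both `OSError` and `ValueError` means a missing file and a corrupt file are treated the same way, which matches the "no crashes when files are missing" goal.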

How to Use

Option 1: Just build your executable (works automatically):

pyinstaller your_app.py

Option 2: Include the full user agent database (recommended). The ; separator shown here is the Windows form; on Linux and macOS use : instead:

pyinstaller --add-data "cloudscraper/user_agent/browsers.json;cloudscraper/user_agent/" your_app.py
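
The separator between source and destination in --add-data has traditionally followed the platform's path separator, so a build script can construct the argument portably instead of hard-coding it. This is a small sketch; `add_data_argument` is a hypothetical helper, and the paths are the ones from the command above.

```python
import os


def add_data_argument(source, destination):
    """Build a PyInstaller --add-data value using this platform's separator.

    os.pathsep is ';' on Windows and ':' on Linux and macOS.
    """
    return f"{source}{os.pathsep}{destination}"


command = [
    "pyinstaller",
    "--add-data",
    add_data_argument("cloudscraper/user_agent/browsers.json",
                      "cloudscraper/user_agent/"),
    "your_app.py",
]
print(" ".join(command))
```

The `command` list can be handed to `subprocess.run` as-is, which avoids shell quoting issues.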

Testing

All executable compatibility has been thoroughly tested:

✅ Normal operation with browsers.json
✅ Fallback operation without browsers.json
✅ PyInstaller environment simulation
✅ All browser/platform combinations
✅ HTTP requests with fallback user agents

Your cloudscraper applications will now work perfectly when converted to executables! 🎉

🆕 Cloudflare v3 JavaScript VM Challenge Support

What are v3 Challenges?

Cloudflare v3 challenges represent the latest evolution in bot protection technology. Unlike traditional v1 and v2 challenges, v3 challenges:

Run in a JavaScript Virtual Machine: Challenges execute in a sandboxed JavaScript environment
Use Advanced Detection: More sophisticated algorithms to detect automated behavior
Generate Dynamic Code: Challenge code is dynamically created and harder to reverse-engineer
Provide Modern Protection: The most current anti-bot technology from Cloudflare

Basic v3 Usage

import cloudscraper

# v3 support is enabled by default
scraper = cloudscraper.create_scraper()
response = scraper.get("https://example.com")
print(response.text)

Advanced v3 Configuration

import cloudscraper

# Optimized configuration for v3 challenges
scraper = cloudscraper.create_scraper(
    interpreter='js2py',  # Recommended for v3 challenges
    delay=5,              # Allow more time for complex challenges
    debug=True            # Enable debug output to see v3 detection
)

response = scraper.get("https://example.com")
print(response.text)

v3 with Different JavaScript Interpreters

All JavaScript interpreters work with v3 challenges; the loop below tries js2py, nodejs, and native in turn:

import cloudscraper

# Test different interpreters for v3 challenges
interpreters = ['js2py', 'nodejs', 'native']

for interpreter in interpreters:
    try:
        scraper = cloudscraper.create_scraper(interpreter=interpreter)
        response = scraper.get("https://example.com")
        print(f"✅ {interpreter}: Success ({response.status_code})")
    except Exception as e:
        # A missing interpreter and an unsolved challenge both end up here
        print(f"❌ {interpreter}: Failed - {e}")

v3 Challenge Detection

When debug mode is enabled, you'll see v3 challenge detection in action:

import cloudscraper

scraper = cloudscraper.create_scraper(debug=True)
response = scraper.get("https://example.com")

# Debug output will show:
# "Detected a Cloudflare v3 JavaScript VM challenge."

Performance Considerations for v3

v3 challenges are more complex and may require additional time:

import cloudscraper

# Recommended settings for v3 challenges
scraper = cloudscraper.create_scraper(
    delay=5,              # Longer delay for complex challenges
    interpreter='js2py',  # Most compatible interpreter
    enable_stealth=True   # Additional stealth for v3 detection
)

🚀 Complete Examples

Example 1: Basic Usage with All Challenge Types

import cloudscraper

# Create a scraper that handles all challenge types automatically
scraper = cloudscraper.create_scraper()

# This will automatically handle v1, v2, v3, and Turnstile challenges
response = scraper.get("https://example.com")
print(f"Status: {response.status_code}")
print(f"Content length: {len(response.text)}")

Example 2: Advanced Configuration for Maximum Compatibility

import cloudscraper

# Advanced configuration for challenging websites
scraper = cloudscraper.create_scraper(
    # Challenge handling
    interpreter='js2py',        # Best compatibility for v3 challenges
    delay=5,                    # Extra time for complex challenges

    # Stealth mode
    enable_stealth=True,
    stealth_options={
        'min_delay': 2.0,
        'max_delay': 6.0,
        'human_like_delays': True,
        'randomize_headers': True,
        'browser_quirks': True
    },

    # Browser emulation
    browser='chrome',

    # Debug mode
    debug=True
)

response = scraper.get("https://example.com")
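
The README does not show how stealth mode uses `min_delay` and `max_delay`. One plausible reading is a random pause drawn from that range before each request, sketched below. `human_like_delay` is a hypothetical helper for illustration and is not taken from cloudscraper; the `rng` and `sleep` parameters exist only so the function can be exercised without actually waiting.

```python
import random
import time


def human_like_delay(min_delay=2.0, max_delay=6.0, rng=random, sleep=time.sleep):
    """Sleep for a random duration in [min_delay, max_delay] and return it."""
    if min_delay > max_delay:
        raise ValueError("min_delay must not exceed max_delay")
    duration = rng.uniform(min_delay, max_delay)
    sleep(duration)
    return duration
```

Drawing from a uniform range is the simplest choice; a real implementation might prefer a skewed distribution to look less mechanical.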

Example 3: Handling Turnstile with CAPTCHA Solver

```python
import cloudscraper

# Configure with 2captcha for Turnstile
```

Core symbols (most depended-on inside this repo)

Symbol	Called by	File
create_scraper	18	cloudscraper/__init__.py
simpleException	15	cloudscraper/__init__.py
request	8	cloudscraper/__init__.py
dynamicImport	5	cloudscraper/captcha/__init__.py
generate_fallback_response	4	cloudscraper/cloudflare_v3.py
is_Captcha_Challenge	4	cloudscraper/cloudflare.py
_should_refresh_session	4	cloudscraper/__init__.py
template	4	cloudscraper/interpreters/encapsulated.py

Shape

Method 159
Class 48
Function 12
Route 1

Languages

Python 100%

Modules by API surface

tests/test_modern.py 24 symbols
cloudscraper/__init__.py 21 symbols
cloudscraper/exceptions.py 18 symbols
cloudscraper/interpreters/native.py 16 symbols
cloudscraper/cloudflare.py 13 symbols
cloudscraper/stealth.py 10 symbols
cloudscraper/proxy_manager.py 9 symbols
cloudscraper/captcha/deathbycaptcha.py 9 symbols
cloudscraper/cloudflare_v3.py 8 symbols
cloudscraper/cloudflare_v2.py 8 symbols
cloudscraper/captcha/2captcha.py 8 symbols
cloudscraper/captcha/capsolver.py 7 symbols

Used by 1 indexed graph (manifest dependencies, hub-wide)

github.com/MatrixTM/MHDDoS

Dependencies from manifests, versioned

brotli 1.1.0 · 1×
certifi 2024.2.2 · 1×
js2py 0.74 · 1×
pyOpenSSL 24.0.0 · 1×
pycryptodome 3.20.0 · 1×
pyparsing 3.1.0 · 1×
requests 2.31.0 · 1×
requests-toolbelt 1.0.0 · 1×
requests_toolbelt 1.0.0 · 1×
websocket-client 1.7.0 · 1×

For agents

$ claude mcp add cloudscraper \
  -- python -m otcore.mcp_server <graph>