Effortless Web Scraping for the Modern Web
Scrapy-like API with concurrent crawling, pause/resume, and streaming mode. Handle full-scale crawls with ease.
Bypass Cloudflare Turnstile and other anti-bot systems out of the box with advanced stealth capabilities.
Smart element tracking that automatically relocates elements when websites change. No more broken scrapers!
Persistent sessions with cookie and state management across requests. Multiple session types in one spider.
Built-in MCP server for AI-assisted web scraping. Extract targeted content before passing it to the AI.
10x faster JSON serialization and optimized performance that outperforms most Python scraping libraries.
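To make "concurrent crawling" and "streaming mode" concrete, here is a minimal standard-library sketch of the general pattern: a bounded number of requests in flight, with each result yielded as soon as it is ready. It does not use Scrapling and is not its implementation. `fake_fetch`, `crawl`, and `max_concurrency` are hypothetical names, and the network call is simulated with a short sleep.

```python
import asyncio


async def fake_fetch(url: str) -> dict:
    """Stand-in for a network request; sleeps instead of hitting the network."""
    await asyncio.sleep(0.01)
    return {"url": url, "status": 200}


async def crawl(urls, max_concurrency: int = 3):
    """Yield each result as soon as it is ready, with a cap on requests in flight."""
    limit = asyncio.Semaphore(max_concurrency)

    async def bounded(url):
        # The semaphore blocks here once max_concurrency fetches are running.
        async with limit:
            return await fake_fetch(url)

    tasks = [asyncio.create_task(bounded(u)) for u in urls]
    # as_completed hands results back in completion order, not submission order.
    for finished in asyncio.as_completed(tasks):
        yield await finished


async def main():
    urls = [f"https://example.com/page/{n}" for n in range(1, 6)]
    return [item async for item in crawl(urls)]


results = asyncio.run(main())
```

Because `crawl` is an async generator, a caller can start processing the first items while later pages are still being fetched.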
# Stealthy fetch with adaptive parsing
from scrapling.fetchers import StealthyFetcher
p = StealthyFetcher.fetch('https://example.com', headless=True)
products = p.css('.product', adaptive=True) # Survives website changes!
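The idea behind adaptive selection can be illustrated without the library. The sketch below is a simplified stand-in, not Scrapling's actual algorithm: it stores a fingerprint of a previously matched element and, when the old selector no longer matches, picks the most similar element on the new page. `fingerprint`, `similarity`, `relocate`, and the `threshold` value are hypothetical, and scoring with `difflib.SequenceMatcher` is an assumption chosen for brevity.

```python
from difflib import SequenceMatcher
from typing import Optional


def fingerprint(element: dict) -> str:
    """Flatten tag, sorted attributes, and text into one comparable string."""
    attrs = " ".join(f"{k}={v}" for k, v in sorted(element["attrs"].items()))
    return f"{element['tag']} {attrs} {element['text']}"


def similarity(a: dict, b: dict) -> float:
    """Score two elements between 0.0 (unrelated) and 1.0 (identical)."""
    return SequenceMatcher(None, fingerprint(a), fingerprint(b)).ratio()


def relocate(saved: dict, candidates: list, threshold: float = 0.5) -> Optional[dict]:
    """Return the candidate most similar to the saved element, or None."""
    if not candidates:
        return None
    best = max(candidates, key=lambda c: similarity(saved, c))
    return best if similarity(saved, best) >= threshold else None


# Element as it looked when the scraper was first written.
saved = {"tag": "div", "attrs": {"class": "product"}, "text": "Blue Widget $9.99"}

# The site was redesigned: the class name changed, so '.product' matches nothing.
new_page = [
    {"tag": "nav", "attrs": {"class": "menu"}, "text": "Home About"},
    {"tag": "div", "attrs": {"class": "item-card"}, "text": "Blue Widget $9.99"},
    {"tag": "footer", "attrs": {}, "text": "Copyright"},
]

match = relocate(saved, new_page)
```

Here `match` is the renamed `item-card` element, because its tag and text still closely resemble the saved fingerprint.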
# Full crawler with pause/resume
from scrapling.spiders import Spider, Response

class MySpider(Spider):
    name = "demo"
    start_urls = ["https://example.com/"]

    async def parse(self, response: Response):
        for item in response.css('.product'):
            yield {"title": item.css('h2::text').get()}

MySpider().start()  # Press Ctrl+C to pause, restart to resume
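Pause/resume generally comes down to checkpointing the crawl frontier. The following standard-library sketch shows that technique in isolation. It is an assumption about the general approach, not a description of how Scrapling stores its state. `CrawlState` and the `crawl_state.json` file name are hypothetical.

```python
import json
import os
import tempfile
from collections import deque


class CrawlState:
    """Pending and already-seen URLs that can be written to disk and restored."""

    def __init__(self, start_urls=()):
        urls = list(start_urls)
        self.pending = deque(urls)
        self.seen = set(urls)

    def add(self, url):
        """Queue a URL unless it was already scheduled or crawled."""
        if url not in self.seen:
            self.seen.add(url)
            self.pending.append(url)

    def save(self, path):
        # Write to a temporary file and rename it, so an interrupt during
        # the write cannot leave a half-written checkpoint behind.
        tmp = path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as fh:
            json.dump({"pending": list(self.pending), "seen": sorted(self.seen)}, fh)
        os.replace(tmp, path)

    @classmethod
    def load(cls, path):
        with open(path, encoding="utf-8") as fh:
            data = json.load(fh)
        state = cls()
        state.pending = deque(data["pending"])
        state.seen = set(data["seen"])
        return state


checkpoint = os.path.join(tempfile.mkdtemp(), "crawl_state.json")

state = CrawlState(["https://example.com/"])
state.add("https://example.com/page/2")
state.pending.popleft()   # pretend the first URL has been crawled
state.save(checkpoint)    # the kind of call a Ctrl+C handler would make

resumed = CrawlState.load(checkpoint)  # a later run picks up from here
```

After reloading, only the uncrawled URL is left in `resumed.pending`, while `resumed.seen` still prevents the first page from being queued again.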
In the benchmark below, Scrapling is on par with Parsel/Scrapy and faster than the other Python libraries tested (lower times are better).
| Library | Time (ms) | vs Scrapling |
|---|---|---|
| Scrapling | 2.02 | 1.0x |
| Parsel/Scrapy | 2.04 | 1.01x |
| Raw Lxml | 2.54 | 1.257x |
| PyQuery | 24.17 | ~12x |
| Selectolax | 82.63 | ~41x |
| MechanicalSoup | 1,549.71 | ~767.1x |
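The "vs Scrapling" column is consistent with dividing each library's time by Scrapling's time. A short sketch that recomputes the ratios from the values in the table (the variable names are illustrative):

```python
# Times in milliseconds, copied from the table above.
times_ms = {
    "Scrapling": 2.02,
    "Parsel/Scrapy": 2.04,
    "Raw Lxml": 2.54,
    "PyQuery": 24.17,
    "Selectolax": 82.63,
    "MechanicalSoup": 1549.71,
}

baseline = times_ms["Scrapling"]

# Ratio > 1 means the library took that many times longer than Scrapling.
ratios = {name: t / baseline for name, t in times_ms.items()}

for name, ratio in ratios.items():
    print(f"{name:<15} {ratio:8.2f}x")
```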
Scrapling requires Python 3.10 or higher.
For full features including fetchers and browsers: