# Quick Start
## Install

```shell
uv add voidcrawl
```

Make sure Chrome or Chromium is installed on your system.
## Option 1: BrowserPool (recommended)

The pool pre-opens tabs and recycles them, giving near-instant page loads after the first warmup:
```python
import asyncio

from voidcrawl import BrowserPool, PoolConfig


async def main():
    async with BrowserPool(PoolConfig()) as pool:
        async with pool.acquire() as tab:
            await tab.goto("https://qscrape.dev")
            print(await tab.title())  # "qScrape"
            print(len(await tab.content()))


asyncio.run(main())
```

Key points:
- `PoolConfig()` uses sensible defaults (1 browser, 4 tabs). Configure via constructor args or env vars.
- `pool.acquire()` returns a `PooledTab` — use it like a `Page`. The context manager auto-releases it back to the pool.
- Tabs are recycled (navigated to `about:blank`) rather than closed, making subsequent acquires near-instant.
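That recycling flow can be modeled in plain asyncio. This is an illustrative toy, not VoidCrawl's implementation: `TinyTabPool` and its string "tabs" are stand-ins, but the shape (a queue of pre-created tabs that are acquired and then returned rather than closed) matches the behavior described above:

```python
import asyncio
from contextlib import asynccontextmanager


class TinyTabPool:
    """Toy model of tab recycling: tabs are created once, then reused."""

    def __init__(self, size=4):
        self._tabs = asyncio.Queue()
        for i in range(size):
            self._tabs.put_nowait(f"tab-{i}")  # stand-in for a real browser tab

    @asynccontextmanager
    async def acquire(self):
        tab = await self._tabs.get()  # wait for a free tab
        try:
            yield tab
        finally:
            # recycle: the real pool navigates to about:blank instead of closing
            self._tabs.put_nowait(tab)


async def main():
    pool = TinyTabPool(size=2)
    async with pool.acquire() as tab:
        print(tab)  # tab-0


asyncio.run(main())
```

Because nothing is closed on release, a second `acquire()` hands back the same warm tab immediately.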
## Option 2: BrowserSession (low-level)
For direct browser control without pooling:
```python
import asyncio

from voidcrawl import BrowserConfig, BrowserSession


async def main():
    async with BrowserSession(BrowserConfig()) as session:
        page = await session.new_page("https://qscrape.dev")
        print(await page.title())  # "qScrape"
        print(len(await page.content()))
        await page.close()


asyncio.run(main())
```

## Option 3: Docker
For production, Chrome runs as a persistent daemon in Docker with pre-warmed profiles:
```shell
cd docker
docker compose up -d
```

The pool connects to Chrome via `CHROME_WS_URLS` instead of launching it:

```shell
export CHROME_WS_URLS="http://localhost:9222,http://localhost:9223"
python your_script.py
```

See Docker & VNC for the full guide.
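How the pool consumes that variable internally isn't shown here, but splitting a comma-separated endpoint list is straightforward; a minimal sketch (the helper name `parse_ws_urls` is ours, not VoidCrawl's):

```python
import os


def parse_ws_urls(value: str) -> list[str]:
    """Split a comma-separated CHROME_WS_URLS value into endpoint URLs."""
    return [u.strip() for u in value.split(",") if u.strip()]


endpoints = parse_ws_urls(
    os.environ.get("CHROME_WS_URLS", "http://localhost:9222,http://localhost:9223")
)
print(endpoints)
```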
## Try it with QScrape
QScrape provides purpose-built fictional websites for testing scrapers. Try VoidCrawl against a QScrape target:
```python
import asyncio

from voidcrawl import BrowserPool, PoolConfig


async def main():
    async with BrowserPool(PoolConfig()) as pool:
        async with pool.acquire() as tab:
            await tab.goto(
                "https://qscrape.dev/l1/eshop/catalog/"
                "?cat=Forge%20%26%20Smithing"
            )

            title = await tab.title()
            print(f"Page: {title}")

            # Query product names from the DOM
            products = await tab.query_selector_all(".product-name")
            for p in products[:5]:
                print(f"  - {p}")


asyncio.run(main())
```

## What Just Happened?
1. VoidCrawl launched a headless Chrome instance (or connected to an existing one via Docker).
2. A tab was acquired from the pool and navigated to the target URL.
3. The page rendered with full JavaScript execution — VoidCrawl sees the live DOM, not raw HTML.
4. DOM queries extracted content using CSS selectors, just like `document.querySelectorAll` in a browser console.
5. The tab was released back to the pool for reuse (not closed).
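One detail worth noting: the catalog URL percent-encodes its query string, so the category "Forge & Smithing" becomes `Forge%20%26%20Smithing` and the literal `&` isn't mistaken for a parameter separator. The standard library can build this for you:

```python
from urllib.parse import quote, urlencode

base = "https://qscrape.dev/l1/eshop/catalog/"
# quote_via=quote encodes spaces as %20 (the default, quote_plus, would use +)
query = urlencode({"cat": "Forge & Smithing"}, quote_via=quote)
print(f"{base}?{query}")
# → https://qscrape.dev/l1/eshop/catalog/?cat=Forge%20%26%20Smithing
```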
## Important Concepts
- Every method on `Page`, `PooledTab`, and `BrowserSession` is async — always `await` them.
- Both `BrowserPool` and `BrowserSession` are async context managers that ensure clean shutdown.
- Stealth mode is on by default. Pass `stealth=False` to `BrowserConfig` to disable it.
- `goto(url)` and `navigate(url)` are both useful but not interchangeable:
  - `goto(url, timeout=30.0)` navigates and waits for network idle, returning a `PageResponse` (HTML, final URL, status, redirect flag). Use this when you want to read content immediately after — it's the right default for most scraping.
  - `navigate(url)` fires the navigation and returns immediately with `None`. Use it when you want to control the wait yourself (e.g. `await tab.wait_for_navigation()`, wait on a specific selector, or race multiple conditions). Reading `content()` right after `navigate()` without waiting may return an empty or partial page.
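The "race multiple conditions" pattern mentioned for `navigate()` is plain asyncio. The two coroutines below are simulated stand-ins for real waits such as `wait_for_navigation()` or a selector wait; whichever finishes first wins, and the loser is cancelled:

```python
import asyncio


async def selector_appears():  # stand-in for a selector wait
    await asyncio.sleep(0.01)
    return "selector"


async def navigation_settles():  # stand-in for wait_for_navigation()
    await asyncio.sleep(0.2)
    return "navigation"


async def main():
    done, pending = await asyncio.wait(
        {
            asyncio.create_task(selector_appears()),
            asyncio.create_task(navigation_settles()),
        },
        return_when=asyncio.FIRST_COMPLETED,
    )
    for task in pending:
        task.cancel()  # drop the slower condition
    print(done.pop().result())  # whichever condition won the race


asyncio.run(main())
```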
## Next Steps
- Browser Pool — understand cold start vs tab reuse
- Built-in Actions — click, type, scroll, and query the DOM
- Cookbook — cookies, network logging, per-tab headers, and response handling
- Stealth Mode — how anti-detection works
- Docker & VNC — run Chrome in Docker for production
- Examples — runnable scripts for common tasks
## References

- QScrape. Cascading Labs. Purpose-built fictional scraping targets for benchmarking and testing. https://qscrape.dev