
Quick Start

Install

uv add voidcrawl

Make sure Chrome or Chromium is installed on your system.
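
If you are unsure whether a suitable browser is on your PATH, you can check before running anything. A minimal sketch using only the standard library — the binary names listed are common defaults on Linux/macOS, not a list taken from VoidCrawl itself:

```python
import shutil

# Common Chrome/Chromium binary names (assumed list, not VoidCrawl's own).
CANDIDATES = [
    "google-chrome", "google-chrome-stable",
    "chromium", "chromium-browser", "chrome",
]

def find_browser(candidates=CANDIDATES):
    """Return the first Chrome-like binary found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None

if __name__ == "__main__":
    print(find_browser() or "No Chrome/Chromium binary found on PATH")
```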

Option 1: BrowserPool (recommended)

The pool pre-opens tabs and recycles them, giving near-instant page loads after the first warmup:

import asyncio

from voidcrawl import BrowserPool, PoolConfig


async def main():
    async with BrowserPool(PoolConfig()) as pool:
        async with pool.acquire() as tab:
            await tab.goto("https://qscrape.dev")
            print(await tab.title())  # "qScrape"
            print(len(await tab.content()))

asyncio.run(main())

Key points:

  • PoolConfig() uses sensible defaults (1 browser, 4 tabs). Configure via constructor args or env vars.
  • pool.acquire() returns a PooledTab — use it like a Page. The context manager auto-releases it back to the pool.
  • Tabs are recycled (navigated to about:blank) rather than closed, making subsequent acquires near-instant.
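
The acquire/release cycle above can be sketched with plain asyncio: a toy pool that hands out pre-created tabs from a queue and resets them on release instead of closing them. The Tab class here is a stand-in for illustration, not VoidCrawl's implementation:

```python
import asyncio

class Tab:
    """Stand-in for a browser tab; tracks its current URL."""
    def __init__(self, tab_id):
        self.tab_id = tab_id
        self.url = "about:blank"

class ToyPool:
    """Hands out pre-created tabs and recycles them instead of closing."""
    def __init__(self, size=4):
        self._queue = asyncio.Queue()
        for i in range(size):  # "warmup": tabs exist before the first acquire
            self._queue.put_nowait(Tab(i))

    async def acquire(self):
        return await self._queue.get()  # waits if all tabs are busy

    async def release(self, tab):
        tab.url = "about:blank"  # recycle: reset state, don't destroy
        await self._queue.put(tab)

async def demo():
    pool = ToyPool(size=1)
    tab = await pool.acquire()
    tab.url = "https://qscrape.dev"
    await pool.release(tab)
    reused = await pool.acquire()
    # The same object comes back, reset to about:blank
    print(reused is tab, reused.url)

asyncio.run(demo())
```

VoidCrawl's PooledTab context manager wraps the release step for you, which is why the quick-start example never calls release explicitly.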

Option 2: BrowserSession (low-level)

For direct browser control without pooling:

import asyncio

from voidcrawl import BrowserConfig, BrowserSession


async def main():
    async with BrowserSession(BrowserConfig()) as session:
        page = await session.new_page("https://qscrape.dev")
        print(await page.title())  # "qScrape"
        print(len(await page.content()))
        await page.close()

asyncio.run(main())

Option 3: Docker

For production, Chrome runs as a persistent daemon in Docker with pre-warmed profiles:

cd docker
docker compose up -d

The pool connects to Chrome via CHROME_WS_URLS instead of launching it:

export CHROME_WS_URLS="http://localhost:9222,http://localhost:9223"
python your_script.py
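
A comma-separated value like this splits cleanly into individual endpoints. A minimal sketch of how a client might read it — the parsing below is an illustration, not VoidCrawl's actual loader:

```python
import os

def parse_ws_urls(env_value=None):
    """Split a comma-separated CHROME_WS_URLS value into endpoint URLs."""
    if env_value is None:
        env_value = os.environ.get("CHROME_WS_URLS", "")
    return [u.strip() for u in env_value.split(",") if u.strip()]

urls = parse_ws_urls("http://localhost:9222,http://localhost:9223")
print(urls)  # ['http://localhost:9222', 'http://localhost:9223']
```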

See Docker & VNC for the full guide.

Try it with QScrape

QScrape provides purpose-built fictional websites for testing scrapers. Try VoidCrawl against a QScrape target:

import asyncio

from voidcrawl import BrowserPool, PoolConfig


async def main():
    async with BrowserPool(PoolConfig()) as pool:
        async with pool.acquire() as tab:
            await tab.goto(
                "https://qscrape.dev/l1/eshop/catalog/"
                "?cat=Forge%20%26%20Smithing"
            )
            title = await tab.title()
            print(f"Page: {title}")
            # Query product names from the DOM
            products = await tab.query_selector_all(".product-name")
            for p in products[:5]:
                print(f" - {p}")

asyncio.run(main())
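
The query string above is the URL-encoded form of the category "Forge & Smithing": %20 is a space and %26 an ampersand. The standard library produces the same encoding, which is handy when building catalog URLs from category names:

```python
from urllib.parse import quote, unquote

category = "Forge & Smithing"
encoded = quote(category)          # space -> %20, & -> %26
print(encoded)                     # Forge%20%26%20Smithing
print(unquote(encoded))            # Forge & Smithing

url = f"https://qscrape.dev/l1/eshop/catalog/?cat={encoded}"
print(url)
```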

What Just Happened?

  1. VoidCrawl launched a headless Chrome instance (or connected to an existing one via Docker)

  2. A tab was acquired from the pool and navigated to the target URL

  3. The page rendered with full JavaScript execution — VoidCrawl sees the live DOM, not raw HTML

  4. DOM queries extracted content using CSS selectors, just like document.querySelectorAll in a browser console

  5. The tab was released back to the pool for reuse (not closed)

Important Concepts

  • Every method on Page, PooledTab, and BrowserSession is async — always await them.
  • Both BrowserPool and BrowserSession are async context managers that ensure clean shutdown.
  • Stealth mode is on by default. Pass stealth=False to BrowserConfig to disable it.
  • goto(url) and navigate(url) are both useful but not interchangeable:
    • goto(url, timeout=30.0) navigates and waits for network idle, returning a PageResponse (HTML, final URL, status, redirect flag). Use this when you want to read content immediately after — it’s the right default for most scraping.
    • navigate(url) fires the navigation and returns immediately with None. Use it when you want to control the wait yourself (e.g. await tab.wait_for_navigation(), wait on a specific selector, or race multiple conditions). Reading content() right after navigate() without waiting may return an empty or partial page.
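
The "race multiple conditions" pattern mentioned above can be sketched with plain asyncio: start several waiters after navigate() and act on whichever finishes first. The waiters here are dummies standing in for real VoidCrawl waits such as wait_for_navigation() or a selector wait:

```python
import asyncio

async def wait_for(label, delay):
    # Stand-in for a real wait (navigation event, selector, timeout, ...)
    await asyncio.sleep(delay)
    return label

async def race():
    tasks = {
        asyncio.create_task(wait_for("navigation", 0.05)),
        asyncio.create_task(wait_for("selector", 0.01)),
    }
    done, pending = await asyncio.wait(
        tasks, return_when=asyncio.FIRST_COMPLETED
    )
    for t in pending:  # cancel the losers so no tasks leak
        t.cancel()
    return done.pop().result()

print(asyncio.run(race()))  # selector
```

Cancelling the pending tasks matters: without it, the losing waiters keep running in the background and asyncio warns about tasks that were never awaited.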

Next Steps

References

QScrape. Cascading Labs. Purpose-built fictional scraping targets for benchmarking and testing. https://qscrape.dev