Skip to content
Cascading Labs QScrape VoidCrawl Yosoi

Yosoi

Yosoi is an AI-powered CSS selector discovery tool for web scraping. It uses an LLM to discover the right selectors for any page, caches them, and lets you scrape at zero marginal cost from that point forward.

How VoidCrawl Powers Yosoi

Yosoi relies on VoidCrawl as its browser automation backend. When a target page requires JavaScript rendering, VoidCrawl launches a headless browser via the Chrome DevTools Protocol , renders the page, and returns the fully hydrated DOM, giving Yosoi the complete markup it needs for accurate selector discovery.

Key Features

  • AI-powered selector discovery: Point at a URL and describe what you want; Yosoi finds the CSS selectors.
  • Selector caching: Discover once, scrape forever without further LLM calls.
  • Provider-agnostic: Works with any LLM provider (OpenAI, Anthropic, local models).
  • Contract-based extraction: Define typed data contracts and let Yosoi fill them.
  • Validation built in: Verify extracted data matches expected types and shapes.

FAQs

How does Yosoi use VoidCrawl?

Yosoi uses VoidCrawl as its browser automation backend. When a target page requires JavaScript rendering, VoidCrawl launches a headless browser, renders the page, and returns the fully hydrated DOM for Yosoi to analyse.

What is AI-powered selector discovery?

Yosoi sends a page’s HTML to an LLM and asks it to find CSS selectors that match the data you described. Once discovered, selectors are cached so future scrapes cost nothing.

Does Yosoi work with any LLM provider?

Yes. Yosoi is provider-agnostic and works with OpenAI, Anthropic, local models, or any provider that exposes a compatible API.

Can I use Yosoi without an LLM after the initial discovery?

Yes. After the first run discovers and caches selectors, all subsequent scrapes use the cached selectors directly, with no LLM calls and no cost.

References

Large Language Model. A neural network trained on large text corpora, used here for CSS selector inference from page structure.

Chrome DevTools Protocol. Google. Protocol for instrumenting, inspecting, and debugging Chromium-based browsers. https://chromedevtools.github.io/devtools-protocol/