Selector Cache Types

Generated from yosoi v0.0.1a11. Only symbols in __all__ are listed.

CacheVerdict

Result of verifying a cached selector against the current page HTML.

On each scrape, Yosoi tests every cached selector against the live page before deciding whether to call the LLM. The resulting verdict determines what happens next for that field.
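The verify-before-LLM flow described above can be sketched roughly as follows. This is an illustration only: the `Verdict` enum, its two members, and the `verify_cached` and `fields_to_rediscover` helpers are hypothetical stand-ins, since the actual members of CacheVerdict are not listed on this page.

```python
from enum import Enum
from typing import Callable


class Verdict(Enum):
    """Hypothetical stand-in; not the real members of yosoi's CacheVerdict."""
    STILL_MATCHES = "still_matches"
    NO_MATCH = "no_match"


def verify_cached(
    cached: dict[str, str],
    count_matches: Callable[[str], int],
) -> dict[str, Verdict]:
    """Test each cached selector against the live page, one verdict per field.

    `count_matches` is an assumed callback that returns how many elements
    a selector matches in the current HTML.
    """
    verdicts = {}
    for field, selector in cached.items():
        matched = count_matches(selector) > 0
        verdicts[field] = Verdict.STILL_MATCHES if matched else Verdict.NO_MATCH
    return verdicts


def fields_to_rediscover(verdicts: dict[str, Verdict]) -> list[str]:
    """Fields whose cached selector no longer matches and may need the LLM."""
    return sorted(f for f, v in verdicts.items() if v is Verdict.NO_MATCH)
```

In this sketch, only the fields returned by `fields_to_rediscover` would trigger a new selector discovery; every other field reuses its cached selector.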

FieldSelectors

Container for primary, fallback, and tertiary selectors for a single contract field.

During extraction, Yosoi tries the primary selector first. If it returns no elements, the fallback is tried, then the tertiary. Bare strings passed to any slot are automatically coerced to SelectorEntry instances, and values duplicated across slots are removed.
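A minimal sketch of the coercion, deduplication, and fallback order described above. It does not use yosoi itself: `Entry`, `build_slots`, and `first_match` are hypothetical names, the `value` and `type` attributes are assumed, and emptying a duplicate slot (rather than shifting later slots up) is one plausible reading of the deduplication rule.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Union


@dataclass(frozen=True)
class Entry:
    """Hypothetical stand-in for SelectorEntry; attribute names are assumed."""
    value: str
    type: str = "css"  # bare strings default to CSS


Slot = tuple[str, Optional[Entry]]


def build_slots(
    primary: Union[str, Entry],
    fallback: Union[str, Entry, None] = None,
    tertiary: Union[str, Entry, None] = None,
) -> list[Slot]:
    """Coerce bare strings to entries and empty any slot repeating an earlier value."""
    seen: set[str] = set()
    slots: list[Slot] = []
    for name, raw in (("primary", primary), ("fallback", fallback), ("tertiary", tertiary)):
        entry = Entry(raw) if isinstance(raw, str) else raw
        if entry is not None:
            if entry.value in seen:
                entry = None  # assumption: a duplicate leaves its slot empty
            else:
                seen.add(entry.value)
        slots.append((name, entry))
    return slots


def first_match(slots: list[Slot], query: Callable[[Entry], list]) -> tuple[Optional[str], list]:
    """Try each slot in order and return the first non-empty result."""
    for name, entry in slots:
        if entry is None:
            continue
        found = query(entry)
        if found:
            return name, found
    return None, []
```

Here `query` stands for whatever evaluates a selector against the page; the loop stops at the first slot that yields at least one element.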

as_entries

as_entries() -> list[tuple[str, SelectorEntry | None]]

Return selectors as (level_name, SelectorEntry) tuples for level-aware dispatch.

as_tuples

as_tuples() -> list[tuple[str, str | None]]

Return selectors as (level_name, selector_value) tuples for backward compatibility.
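The two return shapes differ only in whether the strategy survives. The sketch below illustrates that relationship under stated assumptions: `Entry` is a hypothetical stand-in for SelectorEntry with assumed `value` and `type` attributes, and `to_tuples` and `evaluate` are illustrative helpers, not part of yosoi.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Entry:
    """Hypothetical stand-in for SelectorEntry; attribute names are assumed."""
    value: str
    type: str = "css"


LevelEntries = list[tuple[str, Optional[Entry]]]


def to_tuples(entries: LevelEntries) -> list[tuple[str, Optional[str]]]:
    """Flatten the as_entries() shape into the as_tuples() shape (strategy is lost)."""
    return [(name, e.value if e is not None else None) for name, e in entries]


def evaluate(entries: LevelEntries, handlers: dict[str, Callable[[str], object]]) -> list[tuple[str, object]]:
    """Level-aware dispatch: route each entry to the handler for its strategy."""
    results = []
    for name, entry in entries:
        if entry is None:
            continue  # empty slot
        handler = handlers.get(entry.type)
        if handler is not None:
            results.append((name, handler(entry.value)))
    return results
```

Code that only needs the selector strings can work with the flattened tuples, while code that must pick a CSS or XPath evaluator needs the entry form.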

SelectorEntry

A single selector with its strategy type and value.

Represents one concrete selector that Yosoi can evaluate against a page. Each entry carries the strategy (CSS, XPath, etc.) alongside the expression string. Bare strings default to CSS selectors.

SelectorLevel

Hierarchy of selector strategies from simplest to most complex.

Yosoi tries selectors in ascending level order. CSS is the default and covers most sites. Escalation to a higher level happens only when the lower-level strategies fail inline verification.
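Ascending-order escalation can be modelled with an ordered enum. The sketch below is an assumption-laden illustration: `Level`, its numeric values, and the `escalate` helper are hypothetical, and only CSS and XPath are included because they are the strategies named in this reference.

```python
from enum import IntEnum
from typing import Callable, Optional


class Level(IntEnum):
    """Hypothetical ordering; the real SelectorLevel members and values may differ."""
    CSS = 1
    XPATH = 2


def escalate(
    candidates: dict[Level, str],
    verify: Callable[[Level, str], bool],
) -> Optional[Level]:
    """Return the lowest level whose selector passes verification, if any."""
    for level in sorted(candidates):  # IntEnum members sort by numeric value
        if verify(level, candidates[level]):
            return level
    return None
```

Because the loop returns as soon as one level verifies, a higher level is evaluated only after every lower one has failed.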

SelectorSnapshot

Per-field selector data with audit metadata.

Each field in a cache file is stored as a SelectorSnapshot. It holds up to three selector candidates (primary, fallback, tertiary) plus timestamps that track when the selector was discovered, last verified, and last failed. The failure_count drives automatic staleness detection.
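One way failure-driven staleness could work is sketched below. Everything beyond the slot names and `failure_count` is assumed: the timestamp attribute names, the threshold of three failures, and resetting the counter on a successful verification are illustrative choices, not documented yosoi behaviour.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

STALE_AFTER_FAILURES = 3  # assumed threshold, for illustration only


@dataclass
class Snapshot:
    """Hypothetical stand-in for SelectorSnapshot; timestamp names are assumed."""
    primary: str
    fallback: Optional[str] = None
    tertiary: Optional[str] = None
    discovered_at: Optional[datetime] = None
    last_verified_at: Optional[datetime] = None
    last_failed_at: Optional[datetime] = None
    failure_count: int = 0

    def record(self, ok: bool, now: datetime) -> None:
        """Update the audit metadata after one verification attempt."""
        if ok:
            self.last_verified_at = now
            self.failure_count = 0  # assumption: a success clears past failures
        else:
            self.last_failed_at = now
            self.failure_count += 1

    @property
    def is_stale(self) -> bool:
        return self.failure_count >= STALE_AFTER_FAILURES


snap = Snapshot(primary="h1.title", discovered_at=datetime.now(timezone.utc))
```

A stale snapshot in this sketch would be a candidate for rediscovery rather than another verification attempt.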

SnapshotMap

Top-level model for a domain’s selector cache file.

Each domain scraped by Yosoi gets one JSON cache file (e.g. selectors_example_com.json). SnapshotMap is the Pydantic model that validates and serialises that file. It maps field names to their SelectorSnapshot entries.
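The cache file layout can be approximated with the standard library. This sketch does not reproduce the Pydantic model: `cache_filename`, `dump_map`, and `load_map` are hypothetical helpers, the filename rule is inferred from the single `selectors_example_com.json` example, and the per-field JSON shape (a dict with at least a `primary` key) is assumed.

```python
import json
import re


def cache_filename(domain: str) -> str:
    """Derive a cache filename; rule inferred from one example, so treat as a guess."""
    slug = re.sub(r"[^a-z0-9]+", "_", domain.lower()).strip("_")
    return f"selectors_{slug}.json"


def dump_map(snapshots: dict[str, dict]) -> str:
    """Serialise a field-name -> snapshot mapping to JSON text."""
    return json.dumps(snapshots, indent=2, sort_keys=True)


def load_map(text: str) -> dict[str, dict]:
    """Parse and minimally validate the mapping (a stand-in for model validation)."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("cache file must contain a JSON object")
    for field, snap in data.items():
        if not isinstance(snap, dict) or "primary" not in snap:
            raise ValueError(f"field {field!r} has no primary selector")
    return data
```

For example, `cache_filename("example.com")` yields `selectors_example_com.json`, and `load_map(dump_map(m))` returns a mapping equal to `m` for any valid `m`.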