List Fields

Declare a field as list[T] when a page contains several values for the same slot, such as tags, prices, categories, or authors. Yosoi handles two common DOM patterns automatically.

Pattern A: Separate Elements

One selector, multiple matching nodes. The extractor collects the text from each match.

<a class="tag">love</a>
<a class="tag">life</a>
<a class="tag">inspiration</a>

import yosoi as ys

class Quote(ys.Contract):
    root = ys.css('div.quote')
    text: str = ys.Field(description='The quote text')
    author: str = ys.Author()
    tags: list[str] = ys.Field(description='Topic tags for the quote')

The AI discovers a.tag and the extractor returns all matched texts as a list.
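To make the "one selector, many nodes" idea concrete, here is a minimal sketch that uses only the Python standard library. It is not Yosoi's extractor; the TagTextCollector class is a hypothetical helper that mimics what matching a.tag and collecting each node's text would produce.

```python
from html.parser import HTMLParser


class TagTextCollector(HTMLParser):
    """Hypothetical helper: gather the text of every <a class="tag"> node."""

    def __init__(self):
        super().__init__()
        self.texts = []
        self._inside = False

    def handle_starttag(self, tag, attrs):
        # Treat the class attribute as a space-separated list of class names.
        classes = (dict(attrs).get("class") or "").split()
        if tag == "a" and "tag" in classes:
            self._inside = True
            self.texts.append("")

    def handle_data(self, data):
        # Only text inside a matching element is collected.
        if self._inside:
            self.texts[-1] += data

    def handle_endtag(self, tag):
        if tag == "a":
            self._inside = False


html = """
<a class="tag">love</a>
<a class="tag">life</a>
<a class="tag">inspiration</a>
"""

collector = TagTextCollector()
collector.feed(html)
tags = [text.strip() for text in collector.texts]
print(tags)  # ['love', 'life', 'inspiration']
```

Each matching node contributes exactly one list element, which is the shape a list[str] field expects.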

Pattern B: Delimited String

One selector, one node with a comma/semicolon-separated value. Yosoi splits it automatically.

<span>love, life, and inspiration</span>

raw = {'text': '"Be yourself."', 'author': 'Oscar Wilde', 'tags': ['love, life, and inspiration']}
result = Quote.model_validate(raw)
# result.tags == ['love', 'life', 'inspiration']

The default delimiter splits on commas (,), semicolons (;), and the word "and".
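The splitting behavior can be approximated with a regular expression. The pattern below is an assumption chosen to reproduce the example above, not Yosoi's actual default; split_values and DEFAULT_DELIMITER are hypothetical names.

```python
import re

# Assumed pattern: a comma, a semicolon, or the standalone word "and",
# together with any surrounding whitespace.
DEFAULT_DELIMITER = re.compile(r"\s*(?:[,;]|\band\b)\s*")


def split_values(raw: str, pattern: re.Pattern = DEFAULT_DELIMITER) -> list[str]:
    """Split a delimited string and drop the empty pieces left between delimiters."""
    parts = pattern.split(raw)
    return [part.strip() for part in parts if part.strip()]


print(split_values("love, life, and inspiration"))
# ['love', 'life', 'inspiration']
```

A pattern like this also breaks a value such as "rock and roll" into two items, which is one reason to consider a narrower delimiter when values can legitimately contain "and".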

Custom Delimiter

Override the split pattern with delimiter:

class Article(ys.Contract):
    title: str = ys.Title()
    categories: list[str] = ys.Field(description='Categories', delimiter=r'\s*\|\s*')

raw = {'title': 'Some Article', 'categories': ['Tech | Science | AI']}
result = Article.model_validate(raw)
# result.categories == ['Tech', 'Science', 'AI']
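A delimiter pattern can be checked with Python's re.split before it goes into a contract. The sketch below assumes the delimiter is treated as an ordinary regular expression; it compares the pattern from the example with a bare pipe to show why the surrounding \s* matters.

```python
import re

raw = "Tech | Science | AI"

# The pattern from the example consumes the whitespace around each pipe.
trimmed = re.split(r"\s*\|\s*", raw)
print(trimmed)    # ['Tech', 'Science', 'AI']

# A bare pipe leaves the padding attached to each item.
untrimmed = re.split(r"\|", raw)
print(untrimmed)  # ['Tech ', ' Science ', ' AI']
```

This section does not say whether Yosoi strips whitespace from each item after splitting, so including \s* in the pattern is the conservative choice.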

Per-Element Coercion

Yosoi type coercions apply element-by-element when the field is a list. Mix list[float] with ys.Price() to strip currency symbols from each value:

class PriceComparison(ys.Contract):
    name: str = ys.Title()
    vendor_prices: list[float] = ys.Price(description='Prices from different vendors')

raw = {'name': 'Widget', 'vendor_prices': ['$12.99', '£9.50', '€11.00']}
result = PriceComparison.model_validate(raw)
# result.vendor_prices == [12.99, 9.50, 11.00]
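Per-element coercion amounts to applying the same conversion to every item in the list. The following is a rough stand-in for that idea, not the logic behind ys.Price(); to_float is a hypothetical helper.

```python
import re


def to_float(raw: str) -> float:
    """Hypothetical helper: strip everything except digits, dots, and minus signs."""
    cleaned = re.sub(r"[^\d.\-]", "", raw)
    return float(cleaned)


raw_prices = ['$12.99', '£9.50', '€11.00']

# The conversion runs once per element, so the result is a list of floats.
prices = [to_float(price) for price in raw_prices]
print(prices)  # [12.99, 9.5, 11.0]
```

This naive version assumes a dot as the decimal separator; a value written as "11,00" would need different handling.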

Pinning the Selector

If the AI discovers the wrapper element instead of the individual items, pin the selector explicitly:

tags: list[str] = ys.Field(description='Topic tags', selector='a.tag')

FAQs

Can I use list[int] or list[float]?

Yes. Type coercions apply per element, so list[float] with ys.Price() strips currency symbols from each value before converting to float.

What if the delimiter varies across items on the same page?

Use a regex delimiter that covers both cases. For example, delimiter=r'\s*[,;]\s*' handles both commas and semicolons. If the variation is too unpredictable, a custom Validators method gives you full control.

What if the AI discovers the wrong selector for a list field?

Pin it with selector='a.tag' on the field. This bypasses discovery for that field while letting the AI handle the rest.

What is returned if no elements match?

An empty list []. If you need to treat an empty list as a validation error, add a validator that checks the length.
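A length check of that kind can be as small as the function below. How it is attached to a contract is not covered in this section, so the sketch is plain Python and require_non_empty is a hypothetical name.

```python
def require_non_empty(values: list[str]) -> list[str]:
    """Hypothetical check: reject an empty list, pass anything else through."""
    if not values:
        raise ValueError("expected at least one item, got an empty list")
    return values


print(require_non_empty(['love', 'life']))  # ['love', 'life']

try:
    require_non_empty([])
except ValueError as exc:
    error_message = str(exc)
    print(error_message)  # expected at least one item, got an empty list
```

Raising ValueError is a common convention for validation failures in Python, though the exception type Yosoi expects from a validator is an assumption here.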