Turn any website into a JSON API — declaratively.
toapi lets you point at a web page, declare the fields you want with CSS
selectors, and get back a clean JSON API. No crawler to babysit, no database to
maintain — pages are fetched and parsed on demand, with built‑in caching.
pip install toapi
Requires Python 3.10+.
from htmlparsing import Attr, Text
from toapi import Api, Item

api = Api()


@api.site("https://news.ycombinator.com")
@api.list(".athing")
@api.route("/posts", "/news")
@api.route("/posts?page={page}", "/news?p={page}")
class Post(Item):
    title = Text(".titleline > a")
    url = Attr(".titleline > a", "href")


api.run(host="127.0.0.1", port=5000)
Run it:
python app.py
Then visit http://127.0.0.1:5000/posts and you get:
{
  "Post": [
    {"title": "Mathematicians Crack the Cursed Curve", "url": "https://www.quantamagazine.org/..."},
    {"title": "Stuffing a Tesla Drivetrain into a 1981 Honda Accord", "url": "https://jalopnik.com/..."}
  ]
}
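A client can read this response with nothing but the standard library. The sketch below assumes the server from the example is running on 127.0.0.1:5000 and that the response has the shape shown above. `fetch_posts` and `titles` are hypothetical helper names, not part of toapi.

```python
import json
from urllib.request import urlopen


def fetch_posts(base_url: str = "http://127.0.0.1:5000") -> dict:
    """Fetch /posts from a running toapi server and decode the JSON body."""
    with urlopen(f"{base_url}/posts") as response:
        return json.load(response)


def titles(payload: dict) -> list[str]:
    """Return the title of every entry under the "Post" key."""
    return [post["title"] for post in payload.get("Post", [])]


# Offline sample with the same shape as the response above,
# so the parsing step can be tried without a running server.
sample = json.loads(
    '{"Post": [{"title": "Mathematicians Crack the Cursed Curve",'
    ' "url": "https://www.quantamagazine.org/..."}]}'
)
print(titles(sample))
```

With the server running, `titles(fetch_posts())` would return the live titles instead of the sample.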
┌────────────┐ ┌────────────┐ ┌────────────┐
│ /posts │ ─▶ │ fetch │ ─▶ │ parse │ ─▶ JSON
│ (route) │ │ (cache) │ │ (Item) │
└────────────┘ └────────────┘ └────────────┘
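To make the route, fetch (cache), parse flow in the diagram concrete, here is a minimal standalone sketch of the first two stages. It is an illustration under stated assumptions, not toapi's actual implementation: `compile_route`, `resolve`, and `CachedFetcher` are hypothetical names, and the fetch function is a stub so the example runs without network access.

```python
import re
from typing import Callable, Optional
from urllib.parse import urljoin


def compile_route(alias: str) -> re.Pattern:
    """Turn "/posts?page={page}" into a regex with one named group per placeholder."""
    # re.split with a capturing group alternates: literal, name, literal, ...
    parts = re.split(r"\{(\w+)\}", alias)
    pattern = ""
    for i, part in enumerate(parts):
        pattern += f"(?P<{part}>[^/&?]+)" if i % 2 else re.escape(part)
    return re.compile(f"^{pattern}$")


def resolve(path: str, routes: list[tuple[str, str]], site: str) -> Optional[str]:
    """Map an incoming API path to the source URL it stands for."""
    for alias, source in routes:
        match = compile_route(alias).match(path)
        if match:
            return urljoin(site, source.format(**match.groupdict()))
    return None


class CachedFetcher:
    """Fetch each URL at most once and keep the body in an in-memory dict."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch
        self._cache: dict[str, str] = {}

    def get(self, url: str) -> str:
        if url not in self._cache:
            self._cache[url] = self._fetch(url)
        return self._cache[url]


calls: list[str] = []


def fake_fetch(url: str) -> str:
    """Stub standing in for a real HTTP request; records each call."""
    calls.append(url)
    return f"<html>{url}</html>"


routes = [("/posts", "/news"), ("/posts?page={page}", "/news?p={page}")]
site = "https://news.ycombinator.com"
fetcher = CachedFetcher(fake_fetch)

url = resolve("/posts?page=2", routes, site)
fetcher.get(url)
fetcher.get(url)  # served from the cache, fake_fetch is not called again
print(url, len(calls))
```

The cache here never expires entries, which keeps the sketch short; a real service would likely want a time-to-live or size limit.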
- @api.route("/posts", "/news") maps your API path to a source URL.
- The page is fetched with requests (or a headless browser if you pass browser=) and cached in memory.
- Item extracts fields with CSS selectors via htmlparsing.

- Routes accept {param} placeholders.
- Define clean_<field> methods to post-process values.
- Pass Api(browser="/path/to/geckodriver") for JS-heavy sites.

Add a clean_<fieldname> method on the Item to transform a value before it's returned:
@api.site("https://news.ycombinator.com")
@api.route("/posts", "/news")
class Page(Item):
    next_page = Attr(".morelink", "href")

    def clean_next_page(self, value):
        # Keep only the query string, e.g. "news?p=2" becomes "/posts?p=2".
        return f"/posts?{value.split('?', 1)[1]}"
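The clean_<fieldname> convention can be pictured as a lookup by method name. The sketch below is a standalone illustration of that pattern, not toapi's code: `MiniItem`, `DemoPage`, and `to_dict` are hypothetical names, and the raw values are passed in directly instead of being extracted from HTML.

```python
class MiniItem:
    """Stand-in for an Item: holds raw values and applies clean_<field> hooks."""

    fields: tuple[str, ...] = ()

    def __init__(self, raw: dict[str, str]) -> None:
        self.raw = raw

    def to_dict(self) -> dict[str, str]:
        result = {}
        for name in self.fields:
            value = self.raw[name]
            # Look up an optional clean_<field> method by name.
            cleaner = getattr(self, f"clean_{name}", None)
            result[name] = cleaner(value) if callable(cleaner) else value
        return result


class DemoPage(MiniItem):
    fields = ("next_page", "title")

    def clean_next_page(self, value: str) -> str:
        # Same transformation as the example above.
        return f"/posts?{value.split('?', 1)[1]}"


page = DemoPage({"next_page": "news?p=2", "title": "Hacker News"})
print(page.to_dict())
```

Fields without a matching hook, such as `title` here, pass through unchanged.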
git clone https://github.com/elliotgao2/toapi.git
cd toapi
uv sync # install deps into .venv
uv run pytest # run tests
uv run ruff check .
We use uv for packaging and ruff for lint + format. Pre-commit hooks keep both clean:
uv run pre-commit install
Pull requests are welcome. For non-trivial changes, please open an issue first
to discuss what you'd like to change. Make sure uv run pytest and
uv run ruff check . pass before submitting.
MIT © Elliot Gao