capability proof

What Haunt can extract.

Haunt is strongest when you send a public URL plus a plain-English prompt and want structured JSON back. Use this guide as a current capability map, not a promise that every target will work.
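
As a rough sketch of that URL-plus-prompt call, the snippet below builds (but does not send) a request to the `POST /v1/extract` route listed further down this page. The host `api.haunt.example`, the JSON field names `url` and `prompt`, and the bearer-token header are assumptions made for illustration only; check the API reference for the real values.

```python
import json
import urllib.request

# Assumption: placeholder host. The page does not state the API base URL.
BASE_URL = "https://api.haunt.example"


def build_extract_request(page_url, prompt, api_key):
    """Build a POST /v1/extract request without sending it.

    The body fields ("url", "prompt") and the Authorization scheme are
    hypothetical; only the route path comes from this page.
    """
    body = json.dumps({"url": page_url, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/v1/extract",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
    )


if __name__ == "__main__":
    req = build_extract_request(
        "https://example.com/pricing",
        "List every plan name and its monthly price.",
        "YOUR_API_KEY",
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the built request to `urllib.request.urlopen` would send it; that step is left out here because the real host and key are not known from this page.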


capability map

Green lanes: normal public pages, docs, APIs, product pages, metadata, GitHub repositories, and the public Reddit and Hacker News (HN) routes.

Yellow lanes: public social or JavaScript-heavy pages where visible content may be partial.

Red lanes: login walls, human verification, private data, and unsupported social profile scraping.

green lanes

Good fit.

Use Haunt where public or authorised content is visible enough to support evidence-backed extraction.

Company websites

Extract names, descriptions, positioning, pricing links, product text, metadata, and public business facts.

Docs and API pages

Turn public docs, API references, OpenAPI JSON, changelogs, and reference pages into structured JSON.

GitHub repositories

GitHub repositories and their public metadata are a strong fit, both through the dedicated API-backed route and through normal extraction.

Product and pricing pages

Extract public plan names, prices, product lists, descriptions, and table-shaped content when present in the page.

JSON, XML, OpenAPI

Structured formats such as JSON, XML, and OpenAPI are among Haunt's strongest source categories.

Reddit and HN public routes

Use dedicated routes where public data paths are stable instead of forcing a generic browser scrape.

yellow lanes

Sometimes useful, with boundaries.

Target	What works	Boundary
LinkedIn company pages	Public metadata, title, description, company shell, sometimes website and employee signals.	No people scraping, comments, members, logged-in views, or post history promises.
Instagram business profiles	OpenGraph/profile metadata when exposed publicly.	Metadata only. Not followers, comments, private media, or logged-in content.
Facebook business pages	Partial public page-card metadata when visible.	Not a reliable social feed scraper.
Heavy JavaScript apps	Sometimes works through bounded browser rendering.	Timeouts and thin shells are possible.

Red lanes.

X/Twitter: unsupported for generic extraction because of heavy anti-bot and login-wall behaviour.

CAPTCHA and human verification: clean failure. Haunt is CAPTCHA-aware and returns an explicit human-verification error instead of guessing.

Private/login-only pages: Pro/Scale authorised extraction only. Use sessions, cookies, or headers you are allowed to provide.

Personal profiles, comments, groups, followers: not a product promise.
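
Because human verification is reported as an explicit error rather than a guessed result, a caller can branch on it and stop retrying. The sketch below shows one way to do that. The response shape (`{"error": {"code": ...}}`) and the code string `human_verification_required` are hypothetical; this page only says the error is explicit, not what it looks like on the wire.

```python
# Hypothetical error code, used for illustration only.
HUMAN_VERIFICATION_CODE = "human_verification_required"


def classify_response(payload):
    """Sort a parsed JSON response into "ok", "blocked", or "failed".

    Assumes errors arrive under an "error" key holding a dict with a
    "code" field. That shape is an assumption, not a documented contract.
    """
    error = payload.get("error")
    if not error:
        return "ok"
    if error.get("code") == HUMAN_VERIFICATION_CODE:
        # Red lane: retrying will not help, so surface it to the caller.
        return "blocked"
    return "failed"


if __name__ == "__main__":
    print(classify_response({"data": {"title": "Pricing"}}))
    print(classify_response({"error": {"code": HUMAN_VERIFICATION_CODE}}))
    print(classify_response({"error": {"code": "timeout"}}))
```

Keeping "blocked" separate from "failed" lets a pipeline retry timeouts on yellow-lane pages while treating verification walls as final.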

use the right route

POST /v1/extract
GET /v1/github/repo?owner=browserbase&repo=stagehand
GET /v1/hackernews/item/40310896
POST /v1/company/enrich
POST /v1/reddit
POST /v1/reddit/comments
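
One way to apply the route list above is a small helper that inspects the target URL and returns the dedicated route when one exists, falling back to `POST /v1/extract` otherwise. The route paths are taken from the list; the URL-matching rules (which hosts and path shapes map to which route) are assumptions for illustration, and request bodies for the POST routes are left out because this page does not describe them.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def pick_route(page_url):
    """Return (method, path) for the route that best fits page_url.

    The matching rules are an illustrative guess, not Haunt's own logic.
    """
    parsed = urlparse(page_url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    parts = [p for p in parsed.path.split("/") if p]

    # github.com/<owner>/<repo> -> dedicated repository route
    if host == "github.com" and len(parts) >= 2:
        query = urlencode({"owner": parts[0], "repo": parts[1]})
        return ("GET", "/v1/github/repo?" + query)

    # news.ycombinator.com/item?id=<n> -> dedicated HN item route
    if host == "news.ycombinator.com" and parts == ["item"]:
        item_id = parse_qs(parsed.query).get("id", [""])[0]
        if item_id.isdigit():
            return ("GET", "/v1/hackernews/item/" + item_id)

    # Any reddit.com host -> dedicated Reddit route
    if host == "reddit.com" or host.endswith(".reddit.com"):
        return ("POST", "/v1/reddit")

    # Everything else goes through generic extraction
    return ("POST", "/v1/extract")


if __name__ == "__main__":
    for url in (
        "https://github.com/browserbase/stagehand",
        "https://news.ycombinator.com/item?id=40310896",
        "https://example.com/pricing",
    ):
        print(url, "->", pick_route(url))
```

Routing by host first keeps stable public data paths off the generic browser scrape, which matches the advice given for the Reddit and HN routes.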

next path

Pick the page that matches your job.

Each one takes you from a URL to a working call for that one job, nothing else.

AI agent web extraction

Demo, key, then first agent call.

Open AI agent path

Turn URL into JSON

URL plus prompt to structured JSON.

Open JSON API path

Extract pricing tables

Visible plan cards and pricing pages.

Open pricing-table path

MCP web scraping

Hosted and local MCP setup.

Open MCP path