Capability proof, not séance smoke

What Haunt can extract.

Haunt is strongest when you send a public URL plus a plain-English prompt and want structured JSON back. In the current 100-case production benchmark, 88 structured extraction cases passed after hardening.

That number is useful because it has edges. Haunt is good at normal public web pages, docs, APIs, product pages, metadata, GitHub repositories, and dedicated Reddit/HN-style public data paths. Haunt is not a magic machine for beating login walls or CAPTCHAs, or for scraping people from X/Twitter, Facebook, Instagram, or LinkedIn.

Benchmark
88/100

Structured extraction pass rate from the current 100-case production benchmark.

Latency
2.8s

Average latency in the benchmark. Heavy pages can still take longer.

P95
12.4s

Observed P95 latency. This is an API, not a fake instant oracle.

Green lanes

Good fit

Company websites

Extract names, descriptions, positioning, pricing links, product text, metadata, and public business facts.

Good fit

Docs and API pages

Turn public docs, API pages, OpenAPI JSON, changelogs, and reference pages into structured JSON.

Good fit

GitHub repositories

Repositories and their public metadata are a strong fit, through both dedicated API-backed routes and normal extraction.

Good fit

Product and pricing pages

Extract public plan names, prices, product lists, descriptions, and table-shaped content when present in the page.

Good fit

JSON, XML, OpenAPI

Structured source formats are among Haunt's strongest benchmark categories.

Good fit

Reddit and HN public routes

Use dedicated routes where public data paths are stable instead of forcing a generic browser scrape.

Yellow lanes

| Target | What works | Boundary |
| --- | --- | --- |
| LinkedIn company pages | Public metadata, title, description, company shell, sometimes website and employee signals. | No people scraping, comments, members, logged-in views, or post history promises. |
| Instagram business profiles | OpenGraph/profile metadata when exposed publicly. | Metadata only. Not followers, comments, private media, or logged-in content. |
| Facebook business pages | Partial public page-card metadata when visible. | Not a reliable social feed scraper. |
| Heavy JavaScript apps | Sometimes works through browser rendering. | Timeouts and thin shells are possible. |

Red lanes

| Target | Verdict | Reason |
| --- | --- | --- |
| X/Twitter | Unsupported for generic extraction | Heavy anti-bot/login-wall behavior. Competitors often special-case this with paid data APIs. |
| CAPTCHA and human verification | Clean failure | Haunt is CAPTCHA-aware and does not bypass human-verification challenges. |
| Private/login-only pages | Pro/Scale authorised extraction only | Only use sessions, cookies, or headers you are allowed to provide. |
| Personal profiles, comments, groups, followers | Not a product promise | Risky, unreliable, and not Haunt's wedge. |
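
If you want to apply the yellow and red lanes before spending a request, a client-side pre-flight check is one option. The sketch below is a hypothetical helper, not part of Haunt's API: the function name `lane_for`, the lane strings, and the host lists are assumptions transcribed from the tables above. It looks at the hostname only, so it cannot detect CAPTCHAs, login walls, or heavy JavaScript apps.

```python
from urllib.parse import urlparse

# Host lists transcribed from the yellow and red lanes above.
# Hypothetical client-side helper; Haunt itself does not expose this.
RED_HOSTS = ("x.com", "twitter.com")
YELLOW_HOSTS = ("linkedin.com", "instagram.com", "facebook.com")


def _matches(host, domains):
    """True if host is one of the domains or a subdomain of one."""
    return any(host == d or host.endswith("." + d) for d in domains)


def lane_for(url):
    """Return 'red', 'yellow', or 'unflagged' for a full URL (scheme included)."""
    host = (urlparse(url).hostname or "").lower()
    if _matches(host, RED_HOSTS):
        return "red"
    if _matches(host, YELLOW_HOSTS):
        return "yellow"
    # No host-level warning. The page can still fail on a CAPTCHA,
    # a login wall, or a thin JavaScript shell.
    return "unflagged"
```

An "unflagged" result only means the hostname is not in either list. It is not a promise that the extraction will succeed.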

Use the right route

For ordinary pages, use /v1/extract. For cheap public intelligence, prefer dedicated routes when available:

GET /v1/github/repo?owner=browserbase&repo=stagehand
GET /v1/hackernews/item/40310896
POST /v1/company/enrich
POST /v1/reddit
POST /v1/reddit/comments
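
The route choice above can be sketched as a small client-side function that maps a public URL to a method and path. This is a minimal sketch under stated assumptions, not Haunt's own client: `pick_route` is a hypothetical name, the rule that Reddit thread URLs go to the comments route is a guess, and the text names /v1/extract without stating its HTTP method, so POST is assumed. /v1/company/enrich is left out because the page does not say what input it expects, and request bodies for the POST routes are not shown for the same reason.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def pick_route(url: str) -> tuple[str, str]:
    """Return (HTTP method, path) for a public URL. Hypothetical helper."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower().removeprefix("www.")
    parts = [p for p in parsed.path.split("/") if p]

    # github.com/<owner>/<repo> -> dedicated repository route.
    if host == "github.com" and len(parts) >= 2:
        query = urlencode({"owner": parts[0], "repo": parts[1]})
        return "GET", f"/v1/github/repo?{query}"

    # news.ycombinator.com/item?id=<n> -> dedicated Hacker News item route.
    if host == "news.ycombinator.com" and parts == ["item"]:
        item_id = parse_qs(parsed.query).get("id", [""])[0]
        if item_id.isdigit():
            return "GET", f"/v1/hackernews/item/{item_id}"

    if host == "reddit.com" or host.endswith(".reddit.com"):
        # Assumption: thread URLs (".../comments/<id>/...") use the comments route.
        if "comments" in parts:
            return "POST", "/v1/reddit/comments"
        return "POST", "/v1/reddit"

    # Assumption: the generic route is a POST; the page gives only the path.
    return "POST", "/v1/extract"
```

Falling back to /v1/extract keeps ordinary pages on the generic path, while anything with a stable public data route avoids a browser scrape.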

Honest limit: Haunt should be sold as structured extraction from public and authorised pages, not broad social scraping. That honesty is a feature, not a confession. For agent builders, use the focused AI agent web extraction path as the shortest route to demo, signup, and first call.

Ready to try it?

Start with the fixed demo response, then get a free key and make one safe first extraction.