
Deep research APIs are a new category. OpenAI, Perplexity, Google, and startups like Parallel are shipping systems that can browse the web, synthesize sources, and return cited answers in a single API call. These tools are powerful. Comparing them is hard. The Deep Research API Index is an independent platform to evaluate, compare, and rank these APIs through community-driven blind battles and comprehensive metrics.
Run blind battles between providers. Two random models, the same prompt; you vote for the better response.
Community-driven rankings based on blind battle wins. See which providers actually perform best.
Side-by-side metrics: pricing, latency, context windows, benchmarks, structured output support.
Real outputs from deep research providers, preserved for comparison. Same prompt, different approaches.
One endpoint to query multiple providers. Fallback strategies, budget caps, normalized responses (a brief sketch of what a call could look like follows these feature notes).
Deep dives on verification debt, citation quality, provider updates, and what actually matters.
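To make "fallback strategies, budget caps, normalized responses" concrete, here is a minimal TypeScript sketch of what a call through such a gateway could look like. The endpoint URL, the request fields (providers, maxCostUsd), and the response shape are illustrative assumptions for this sketch, not the project's actual API or any provider's.

```ts
// Minimal sketch: one request declares an ordered provider list (fallback),
// a spend ceiling (budget cap), and gets back one normalized response.
// Endpoint, field names, and response shape are hypothetical.

type ResearchRequest = {
  prompt: string;
  providers: string[]; // tried in order until one succeeds
  maxCostUsd: number;  // hard budget cap for the whole request
};

type ResearchResponse = {
  provider: string; // which provider actually answered
  answer: string;
  citations: { url: string; title: string }[];
  costUsd: number;
};

async function runDeepResearch(req: ResearchRequest): Promise<ResearchResponse> {
  // Hypothetical gateway endpoint; replace with a real one.
  const res = await fetch("https://example.com/v1/research", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: "Bearer YOUR_API_KEY", // placeholder credential
    },
    body: JSON.stringify(req),
  });
  if (!res.ok) {
    throw new Error(`Gateway request failed: ${res.status} ${res.statusText}`);
  }
  return (await res.json()) as ResearchResponse;
}

// Example: prefer one provider, fall back to another, never spend more than $2.
runDeepResearch({
  prompt: "Compare the citation practices of major deep research APIs.",
  providers: ["openai-deep-research", "perplexity-sonar-deep-research"],
  maxCostUsd: 2.0,
}).then((r) => console.log(r.provider, r.answer.slice(0, 200)));
```

The point of this shape is that the caller states a provider preference order and a spend ceiling once, and receives a single normalized response regardless of which provider ended up answering.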
I'm Vani, a Math + Informatics student at UW, currently a TA for Data Structures & Algorithms (CSE 373), and the incoming instructor for the course in Summer 2026.
I built this because I kept running into the same problem: trying to pick the right deep research API and finding zero serious, neutral comparisons. So I made the resource I wished existed, and turned it into a community-driven arena.
Provider metrics come from official documentation, published benchmarks, and direct API testing. Leaderboard rankings are based entirely on community blind votes: no synthetic benchmarks, just real human preferences. If something is wrong, missing, or outdated, I want to know.
This is an independent project. I'm not affiliated with OpenAI, Perplexity, Google, Parallel, or any other provider listed here. I don't take money from providers.
Building with these tools? Noticed an error? Want to debate evaluation criteria?