Firecrawl Research Index - An index for agents pushing the frontier of AI/ML research
AI/ML research moves fast, and the work that matters is split between new papers and the code that implements them. Most search providers omit or misrank key papers, leaving you to review sources by hand without ever being sure you've caught everything.
So we built an index for it. Firecrawl's index includes all 3M+ arXiv papers, as well as GitHub artifacts from top research repos, refreshed daily so agents always stay current.


Replies
Firecrawl
Hey Product Hunt 👋 Eric, Caleb, and Nick from Firecrawl here. Today we're launching the Firecrawl Research Index, a specialized index for agents pushing the frontier of AI/ML research.
AI/ML research moves fast, and the work that matters is split between new papers and the code that implements them. Most search providers omit or misrank key papers, leaving you to review sources by hand without ever being sure you've caught everything.
So we built an index for it. Firecrawl's index includes all 3M+ arXiv papers, as well as GitHub artifacts from top research repos, refreshed daily so agents always stay current.
On arXivQA, the index has state-of-the-art recall, 18% above the next best provider at similar cost. It also scores 0.750 MRR, meaning the correct paper typically lands in the top two results. Your agent finds the right papers, right away.
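For readers less familiar with the metrics, here is a minimal sketch of how MRR and recall@k are commonly computed. The toy queries and paper IDs are made up for illustration, and the exact arXivQA scoring setup is not described in this post, so treat this as the textbook definition rather than the benchmark's own code.

```python
def mean_reciprocal_rank(ranked_results, relevant):
    """Average of 1/rank of the correct item per query (0 if it is missing)."""
    total = 0.0
    for results, target in zip(ranked_results, relevant):
        for position, paper_id in enumerate(results, start=1):
            if paper_id == target:
                total += 1.0 / position
                break
    return total / len(relevant)


def recall_at_k(ranked_results, relevant, k):
    """Fraction of queries whose correct item appears in the top k results."""
    hits = sum(
        1 for results, target in zip(ranked_results, relevant)
        if target in results[:k]
    )
    return hits / len(relevant)


# Hypothetical toy data: four queries, each with one correct paper "a".
runs = [
    ["a", "b", "c"],  # correct paper at rank 1 -> reciprocal rank 1.0
    ["x", "a", "b"],  # correct paper at rank 2 -> reciprocal rank 0.5
    ["a", "c", "d"],  # rank 1 -> 1.0
    ["q", "a", "z"],  # rank 2 -> 0.5
]
gold = ["a", "a", "a", "a"]

mrr = mean_reciprocal_rank(runs, gold)       # (1.0 + 0.5 + 1.0 + 0.5) / 4
recall_top1 = recall_at_k(runs, gold, k=1)
recall_top2 = recall_at_k(runs, gold, k=2)
print(mrr, recall_top1, recall_top2)
```

In this toy case an MRR of 0.75 comes from the correct paper alternating between rank 1 and rank 2. MRR is an average, so individual queries can still rank the correct paper lower.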
Plus, the index ships with a complete research toolset. Agents can retrieve papers, verify claims against the full text, and pull code for implementation - running the full research loop end-to-end. An agent training a model overnight could pull an optimizer from a recent paper and a stability fix from a related GitHub issue, then test both in its next run.
Firecrawl Research Index is available now in the API via /search/research, CLI, MCP, and SDKs, and plugs into any harness you already run (Codex, Claude Code, or Grok Build).
Try it here: https://docs.firecrawl.dev/features/research
We'd love to see what you build with it.
@ericciarla wow! @Firecrawl is going through the roof!
Firecrawl
@danielsinewe thanks daniel!
The indexing isn’t the hard part; the hard part is finding which papers make earlier papers obsolete.
Or rather, it’s also understanding that many papers (even recent ones) are irrelevant. You’ll get agents touting results, only to read the paper and see that the conclusions are based on GPT-4o, which for many purposes has more in common with the Apollo guidance module than current models trained with reinforcement learning.
Okay fine, that’s a bit too hyperbolic, but really: the changes in training paradigms just make a huge corpus of research not very applicable to current LLMs, and that’s a point that models skip right over.
Elentaria
@ericciarla congrats! Will definitely try it
Firecrawl
@khashayar_mansourizadeh1 thank you!!
Build Check
Hey Eric! It's truly impressive. 3M+ papers is more than enough to do research that actually matters. Love to see you helping on this and wish you all the best!
Firecrawl
@german_merlo1 Thank you!
This is cool. How do you decide which GitHub repos qualify as top research artifacts?
@dhiraj_patel5 Hey Dhiraj, Richard from Firecrawl here. It's driven by the papers, not by stars or popularity. We index the repos that actually show up in the research we've ingested. The more papers point to a repo, the more we index it.
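The paper-driven selection described above can be illustrated with a small sketch. This is not Firecrawl's pipeline: the function name, the input shape, and the repo and paper identifiers are all hypothetical, and the only idea carried over from the reply is that repos are prioritized by how many ingested papers point to them.

```python
from collections import Counter


def rank_repos_by_paper_references(paper_repo_links):
    """Rank repos by the number of distinct papers that reference them.

    paper_repo_links: dict mapping a paper ID to the list of repo names
    found in that paper (hypothetical input shape).
    """
    counts = Counter()
    for paper_id, repos in paper_repo_links.items():
        # A paper that links the same repo several times counts once.
        for repo in set(repos):
            counts[repo] += 1
    # most_common() sorts by count, highest first.
    return counts.most_common()


# Hypothetical example data.
links = {
    "paper-1": ["org/a", "org/b"],
    "paper-2": ["org/a"],
    "paper-3": ["org/a", "org/a", "org/c"],
}
ranked = rank_repos_by_paper_references(links)
print(ranked[0])  # the repo referenced by the most papers
```

Counting distinct papers rather than raw mentions keeps a single paper from inflating a repo's rank, which matches the stated goal of ignoring stars and popularity.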
Getting state-of-the-art recall on arXivQA is legitimately hard. The tricky bit isn't crawling the PDFs. It's parsing structured content from LaTeX source vs. the rendered PDF without losing math notation and figure references. We've spent time on similar extraction challenges when pulling structured data from dense technical documents. What does your indexing pipeline use for equation and table extraction from arXiv source tarballs?
3M+ arXiv papers plus GitHub artifacts all in one index refreshed daily is seriously impressive. The recall benchmark results are pretty convincing too. I'm curious — does the index also cover papers from conferences like NeurIPS or ICML, or is it purely arXiv-based right now?
This is useful. For research agents, the part I’d want surfaced is provenance per downstream step: which paper, which repo or issue, which claim, and what changed in the experiment because of it.
Does /search/research return enough citation structure for an agent to keep that trail, or is that on the harness?
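If the trail has to live in the harness, one possible shape for it is sketched below. Every name here (`ProvenanceRecord`, `ProvenanceTrail`, the field names, the placeholder values) is hypothetical and chosen to mirror the four items in the question; nothing in it reflects what `/search/research` actually returns.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ProvenanceRecord:
    paper_id: str           # which paper
    artifact: str           # which repo or issue
    claim: str              # which claim was relied on
    experiment_change: str  # what changed in the experiment because of it


@dataclass
class ProvenanceTrail:
    records: list = field(default_factory=list)

    def add(self, record: ProvenanceRecord) -> None:
        """Append one step to the trail."""
        self.records.append(record)

    def to_json(self) -> str:
        """Serialize the trail so it can be stored next to the run logs."""
        return json.dumps([asdict(r) for r in self.records], indent=2)


# Placeholder values for illustration only.
trail = ProvenanceTrail()
trail.add(ProvenanceRecord(
    paper_id="example-paper-id",
    artifact="example-org/example-repo#issue",
    claim="optimizer variant improves stability",
    experiment_change="swapped optimizer in next training run",
))
print(trail.to_json())
```

Keeping one record per downstream step means the agent can later answer "why did this run change?" without re-querying the index.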
I don't do ML research, I build production agents, but the core pain here is universal: agents quietly acting on stale or misranked sources and nobody noticing until it bites. Pairing each paper with the code that implements it, refreshed daily, is the clever bit. Curious whether you'll extend that same "source + the thing that implements it, kept current" idea beyond arXiv to general docs/APIs, because that's exactly where my agents drift. Nice launch.
Used Firecrawl in a couple of n8n workflows and it's been great — way more reliable than custom scrapers, and the clean markdown output just works. Congrats on the launch!