AI search, open sourced - Free datasets on how AI search picks and cites brands

Trakkr Data opens up the telemetry behind AI search as live, explorable datasets, free for anyone to dig into. The eight datasets cover the AI 500 (the brands AI recommends most), the sources AI cites, what AI crawlers actually fetch, how the 8 models compare, how queries get rewritten, and web adoption. Updated daily, with an API. We track ChatGPT, Gemini, Perplexity, Claude, Grok and more. The models agree on which brand to recommend only about 43% of the time.

Hey Product Hunt 👋

I run Trakkr, where we track how brands show up across ChatGPT, Gemini, Perplexity, Claude and the rest. Along the way we ended up sitting on data that's hard to find anywhere else: which brands AI recommends, which sources it cites, and what the crawlers actually do when they reach a site.

So we opened it up. Trakkr Data is all of that telemetry as free, explorable datasets. Updated daily, with an API.

A few things in there that surprised us:

- The 8 models agree on which brand to recommend only about 43% of the time. Fully unanimous answers are basically a rounding error at 4%.
- Citations are far more concentrated than normal search. A small set of domains does most of the heavy lifting.
- AI crawlers keep a daily rhythm. GPTBot peaks in the early afternoon and goes quiet around 2am UTC.

It started life as 'the AI 500', a ranking of the brands AI recommends most. It is now eight datasets covering rankings, citations, crawlers, models, queries and web adoption.

It's free, and it stays free. Use it for research, for posts, for whatever you like. If you find something interesting in there, I'd love to see it.

Happy to answer anything about the data or how we collect it.
