Leading quant VC: Turning art into science through AI
This is the 3rd launch from Vela Partners.

PHBench
Launching today
PHBench: the first public benchmark predicting Series A funding from Product Hunt launch signals.
We analyzed 67,292 featured launches over 7 years, linked to 528 verified Series A rounds via Crunchbase. Champion model: 4.7x lift over random. Team size × community engagement is the strongest signal; B2B (API, Payments, Fintech) converts at 3x baseline; Rank #1 raises at 2.2x unranked.
Dataset, code, and baselines open. Submit at phbench.com and subscribe for weekly high-probability launches.

Vela Partners
@rajiv_ayyangar, thank you so much for hunting us!
Hey PH Community 👋
We're Yagiz, a Senior Technical Product Manager at Amazon and independent researcher, and Yigit, co-founder and GP of Vela Partners. Today, we're launching PHBench in collaboration with the University of Oxford (Ben Griffin and Rick Chen) and Vela Partners, the leading quant VC.
And yes, the irony of launching a Product Hunt benchmark on Product Hunt is completely intentional 🙂
Here's the origin story. We kept asking a question nobody had answered: Can you predict which Product Hunt launches will raise Series A funding, based solely on what you see on launch day (votes, rank, team size, category, timing)?
So we built PHBench. We collected 67,292 featured PH launches going back to 2019, matched them to Crunchbase funding records, and identified 528 verified Series A raises within 18 months. Seven years of data. Every featured launch.
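(For anyone who wants to reproduce the labels: conceptually it's a windowed join between launches and funding rounds. A minimal sketch is below; the column names are illustrative rather than the actual PHBench schema, and the real pipeline also handles the PH-to-Crunchbase entity matching that this glosses over.)
```python
import pandas as pd

# Toy rows purely for illustration; column names are NOT the actual PHBench schema,
# and the real pipeline also handles PH <-> Crunchbase entity matching.
launches = pd.DataFrame({
    "company_id": ["a", "b", "c"],
    "launch_date": pd.to_datetime(["2021-03-02", "2022-06-14", "2023-01-10"]),
})
rounds = pd.DataFrame({
    "company_id": ["a", "c"],
    "series_a_date": pd.to_datetime(["2022-01-15", "2025-04-01"]),
})

# Positive label = a verified Series A that closed within 18 months of launch day.
merged = launches.merge(rounds, on="company_id", how="left")
within_window = (
    merged["series_a_date"].notna()
    & (merged["series_a_date"] > merged["launch_date"])
    & (merged["series_a_date"] <= merged["launch_date"] + pd.DateOffset(months=18))
)
merged["raised_series_a"] = within_window.astype(int)
print(merged[["company_id", "raised_series_a"]])
```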
Three findings I think this community will find interesting:
→ The signals work. Our model is 4.7x better than random. Statistically significant.
→ The strongest predictor isn't votes alone. It's team size × community engagement together. A large coordinated team achieving high traction is more predictive than either signal alone.
→ B2B categories convert at 3x the baseline rate. API, Payments, Fintech. If you launch a developer tool on a Tuesday with a big team and high engagement, that's a strong signal.
We also tested three frontier Gemini models on the same task. The most capable model performed the worst. Better reasoning doesn't help with pure numbers.
The dataset is available on HuggingFace. The leaderboard is live. The code is public. Can you beat our baseline?
The paper is on arXiv and has been submitted to the NeurIPS 2026 Evaluations & Datasets Track.
Would love your feedback — especially from anyone who's launched on PH and gone on to raise Series A. You're in our dataset :)
Vela Partners
@curiouskitty On operationalizing the score: We'd recommend treating it as a relative ranking signal rather than a calibrated probability. The model ranks well (13x lift over random in the top 50), but the absolute probabilities shift across market regimes.
For anyone deploying this in practice, we'd suggest re-ranking the current cohort weekly rather than relying on absolute thresholds. Periodic retraining (quarterly, as new Crunchbase labels resolve) would help, and calibrating by sector makes sense given that Fintech/API categories convert at 3x the baseline while consumer categories are well below.
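To make that concrete, the weekly step is just a relative sort within the current cohort, capped by analyst capacity, with no absolute probability cutoff. A minimal sketch with made-up scores:
```python
import pandas as pd

# Toy cohort: every featured launch from the current week, scored by whatever
# model you deploy. The scores here are made up; the point is the relative
# ranking, not an absolute threshold.
cohort = pd.DataFrame({
    "product": ["tool_a", "tool_b", "tool_c", "tool_d", "tool_e"],
    "score":   [0.31, 0.07, 0.55, 0.22, 0.48],
})

top_k = 3  # analyst capacity for the week
shortlist = cohort.sort_values("score", ascending=False).head(top_k)
print(shortlist)
```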
On F₀.₅ as primary metric: In VC deal-flow screening, false positives are more expensive than false negatives. A false positive means an analyst spends time on a company that won't raise (scarce capacity wasted). A false negative means missing a deal, but that's recoverable through other sourcing channels. F₀.₅ weights precision twice as heavily as recall, which matches that asymmetry. AP is reported as a threshold-free complement, but F₀.₅ at an optimized threshold is what we'd actually use in a weekly screening workflow.
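For anyone reproducing the evaluation, both metrics are standard scikit-learn calls; the arrays below are toy values, not benchmark results.
```python
from sklearn.metrics import fbeta_score, average_precision_score

# Toy labels and predictions, just to show the metric calls.
y_true  = [0, 0, 1, 0, 1, 0, 0, 1]
y_pred  = [0, 0, 1, 0, 0, 0, 1, 1]                    # hard predictions at a chosen threshold
y_score = [0.1, 0.2, 0.9, 0.3, 0.4, 0.1, 0.7, 0.8]    # ranking scores from the model

# beta < 1 emphasizes precision over recall; beta = 0.5 matches the asymmetry above.
print("F0.5:", fbeta_score(y_true, y_pred, beta=0.5))
# Threshold-free complement.
print("AP:  ", average_precision_score(y_true, y_score))
```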
Interesting. Most people assume raw upvotes are the proxy for quality, so the finding that team size × community engagement is a stronger signal than votes alone is genuinely counterintuitive. Have you looked at whether solo founders who hit high engagement are penalized by this model? Do they show up as a distinct cluster? Would love to see how the signal degrades for truly first-time founders vs. repeat ones. Incredible dataset, congrats on getting years of data cleaned!
Vela Partners
@artstavenka1 Thanks! And yes, the votes finding surprises everyone! Raw upvote count is actually one of our four "noise" features: it has high model importance but near-zero conditional lift. The reason: viral launches with 500+ votes are often consumer products riding a wave of hype that doesn't translate to institutional funding. The strongest signal is votes combined with daily rank. A #1 launch with high engagement raises Series A at 3.5x the baseline, but votes without a strong rank are noise. Maker team size is #2 in importance, and maker follower count is #6 but carries a higher lift (2.4x vs. 1.2x for team size alone), suggesting that who's on the team matters more than how many people are on it.
On solo founders: we haven't done the cluster analysis you're describing, but the data is suggestive. Solo founders (maker_count = 1) underperform teams of 2-3, with a modest 1.2x lift for teams vs. baseline. But the bigger signal is follower count: a solo founder with a large following performs fine; a solo founder with no following is where the model gets skeptical.
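(For clarity, "lift" in these replies is just the Series A rate within a slice divided by the overall base rate. A minimal sketch, with illustrative column names that may not match the released schema:)
```python
import pandas as pd

def conditional_lift(df: pd.DataFrame, mask: pd.Series, label: str = "raised_series_a") -> float:
    """Series A rate inside the slice defined by `mask`, divided by the base rate."""
    base_rate = df[label].mean()
    slice_rate = df.loc[mask, label].mean()
    return slice_rate / base_rate

# Example slice (column names are illustrative, not the actual PHBench schema):
# solo founders with a large following.
# lift = conditional_lift(df, (df["maker_count"] == 1) & (df["maker_followers"] > 1000))
```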
We don't currently distinguish first-time vs repeat founders. That's a great feature idea, but maker IDs are redacted in the PH API for privacy, so it's not something participants can compute today. It would require a partnership with Product Hunt to access that signal. @rajiv_ayyangar what do you think :)?
If you're curious about digging in, we'd love to see you submit a model :) You can get the full dataset here: https://huggingface.co/datasets/ihlamury/phbench
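If you want a quick start, the dataset should load with the standard `datasets` library; check the dataset card for the exact configs and split names.
```python
from datasets import load_dataset

# Pulls PHBench from the Hugging Face Hub; configs and split names may differ,
# so check the dataset card (a "train" split is assumed below).
ds = load_dataset("ihlamury/phbench")
print(ds)              # available splits and features
print(ds["train"][0])  # first example, assuming a "train" split exists
```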
Are you using only launch-day signals, or do you include post-launch traction like follows and comments over the first week?
Vela Partners
@karimbenkeroum The core signals are captured on launch day (votes, comments, daily rank, maker profiles, topic tags).
One caveat: maker follower counts were scraped in 2026, not at launch time, so for older launches, they reflect post-funding growth. It's a limitation we document in the paper.
Adding richer post-launch features like 7-day comment growth or follow-on engagement would be a great extension. We think there's a lot of untapped signal there.
Full details on the feature set are in Section 5 of the paper: arxiv.org/abs/2605.02974
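To make the launch-day boundary concrete, a single example looks roughly like the dict below; the field names are placeholders (the real schema is in Section 5 of the paper), and the follower count carries the 2026 scrape-time caveat above.
```python
# Illustrative launch-day feature row; field names and the full feature list
# are in Section 5 of the paper, so treat these as placeholders.
example_launch = {
    "votes": 412,               # upvotes on launch day
    "comments": 57,             # comment count on launch day
    "daily_rank": 1,            # rank among that day's featured launches
    "maker_count": 4,           # size of the maker team
    "maker_followers": 1850,    # scraped in 2026, not at launch time (known caveat)
    "topics": ["API", "Developer Tools"],
}
print(example_launch)
```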
Vela Partners
Been quietly working on this with Yagiz, Yigit and Rick for a while.
While I mostly focus on using founder profiles to predict raises, PHBench tackles the same question from the product side.
Have a go at the leaderboard if you fancy; the data's on HuggingFace.
Vela Partners
@bengriffin3 thank you for your valuable contributions. Excited to incorporate Product Hunt signals into the founder prediction pipeline.
Vela Partners
Really excited to bring PHBench to you all! By extending short-term launch signals on Product Hunt to predict long-term funding outcomes, we help identify outlier products that are truly valuable in the VC environment. We think it will be greatly beneficial to the Product Hunt community.
Come beat our baseline and get to the top of the leaderboard!
Vela Partners
@rick_chen5 excited to see more predictors from the Product Hunt community join us! :)
Vela Partners
So excited to see this live! This has been a labor of love: collecting data, running 100+ experiments, and testing LLMs against good old gradient boosting.
The leaderboard is open. If you can beat us, you're the new champion. Who's in?
Vela Partners
@ihlamury looking forward to seeing some competition soon!!!