Most teams have no reliable way to measure AI drift. variA/Bly helps you evaluate and A/B/n test prompts scientifically, so you catch regressions before users complain.
Differentiator:
→ 41-dimensional evaluation - quality scored across multiple dimensions
→ Statistical A/B testing - confidence intervals, not gut feeling
→ AI-powered optimization - generates better prompts from data
→ Prompt Registry - version control and deployment
Other tools wait for user complaints. variA/Bly measures continuously.
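To make "confidence intervals, not gut feeling" concrete, here is a minimal, illustrative sketch of the underlying statistics: a normal-approximation 95% confidence interval for the difference in pass rates between two prompt variants. This is a generic textbook method, not variA/Bly's actual implementation; the function name and sample numbers are made up for the example.

```python
import math

def ab_confidence_interval(success_a, n_a, success_b, n_b, z=1.96):
    """Approximate 95% CI for the difference in success rates
    (variant B minus variant A) using the normal approximation.
    Illustrative sketch only -- not variA/Bly's internal method."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    diff = p_b - p_a
    return diff - z * se, diff + z * se

# Hypothetical eval run: variant A passes 62/100 cases, variant B 78/100.
lo, hi = ab_confidence_interval(62, 100, 78, 100)
# If the whole interval sits above 0, B's improvement is statistically
# significant at roughly the 95% level; if it straddles 0, you can't
# distinguish the variants from noise yet.
print(round(lo, 3), round(hi, 3))
```

The point of the interval is exactly the "not gut feeling" part: a +16% observed lift only counts as real once the lower bound clears zero.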

variA/Bly: Delivering production-grade prompt performance for AI Teams
Amit Kumar started a discussion
How are you measuring your AI drift?
AI systems don't break overnight; they decay. They fade, shift, and degrade quietly. Stanford researchers found GPT-4's accuracy on a basic reasoning task dropped from 97.6% to 2.4% between March and June: https://arxiv.org/abs/2307.09009 variA/Bly has evaluated 10+ workflows, and the same pattern appears: accuracy drifts (often 15–40%), prompts regress, RAG relevance drops,...
Amit Kumar left a comment
Hey PH! I built variA/Bly because I was tired of shipping prompts on gut feeling and hoping they worked. Most teams find out their AI is broken from angry users. We wanted a way to know *before* that happens. variA/Bly gives you: → 41-dimensional scientific evaluation. → Statistical A/B testing. → AI drift measurement. → AI-powered prompt optimization. → Version control and...

