Amit Kumar

I'm the Tech Builder at variA/Bly.
There is no way you can measure your AI drift. variA/Bly helps you evaluate and A/B/n test prompts scientifically, so you catch issues before users complain.

Differentiators:
→ 41-dimensional evaluation - quality scored across multiple dimensions
→ Statistical A/B testing - confidence intervals, not gut feeling
→ AI-powered optimization - generates better prompts from data
→ Prompt Registry - version control and deployment

Other tools wait for user complaints. variA/Bly measures continuously.
variA/Bly: Delivering production-grade prompt performance for AI Teams
Amit Kumar started a discussion

How are you measuring your AI drift?

AI systems don't break overnight; they decay. They fade, shift, and degrade quietly. A Stanford and UC Berkeley study found that GPT-4's accuracy on a prime-number identification task dropped from 97.6% to 2.4% between March and June 2023: https://arxiv.org/abs/2307.09009 variA/Bly has evaluated 10+ workflows, and the same pattern appears: accuracy drifts (by roughly 15–40%), prompts regress, RAG relevance drops,...
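
For anyone who wants a starting point, here is a minimal sketch of one way to put a number on accuracy drift: score the same fixed evaluation set against a baseline snapshot and the current system, then put a confidence interval on the difference. This is not variA/Bly's method. The function names (`accuracy_drift`, `has_regressed`) and the counts are made up for illustration, and the interval is a plain normal-approximation (Wald) interval for the difference of two proportions, which is only reasonable for fairly large evaluation sets.

```python
import math


def accuracy_drift(baseline_correct, baseline_total,
                   current_correct, current_total, z=1.96):
    """Return (drift, lower, upper) for current accuracy minus baseline accuracy.

    The bounds are a normal-approximation interval; z=1.96 gives roughly
    95% coverage. All names here are hypothetical, not a library API.
    """
    p0 = baseline_correct / baseline_total
    p1 = current_correct / current_total
    drift = p1 - p0
    # Standard error of the difference between two independent proportions.
    se = math.sqrt(p0 * (1 - p0) / baseline_total
                   + p1 * (1 - p1) / current_total)
    return drift, drift - z * se, drift + z * se


def has_regressed(baseline_correct, baseline_total,
                  current_correct, current_total, z=1.96):
    """Flag a regression only when the whole interval sits below zero."""
    _, _, upper = accuracy_drift(baseline_correct, baseline_total,
                                 current_correct, current_total, z)
    return upper < 0


if __name__ == "__main__":
    # Made-up counts: 90/100 correct at baseline, 70/100 correct now.
    drift, lower, upper = accuracy_drift(90, 100, 70, 100)
    print(f"drift={drift:+.3f}, interval=({lower:+.3f}, {upper:+.3f})")
    print("regressed:", has_regressed(90, 100, 70, 100))
```

Requiring the whole interval to sit below zero is a deliberate choice: a small dip on a small evaluation set will not trigger an alert, which is the "confidence intervals, not gut feeling" idea in practice.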