AdaptGauge detects when adding few-shot examples degrades LLM performance instead of improving it.
Testing 8 models across 4 tasks revealed three failure patterns:
• Peak regression: 64% at 4-shot, crashed to 33% at 8-shot
• Ranking reversal: best zero-shot model dropped to third with examples
• Selection collapse: TF-IDF examples broke a model from 50%+ to 35%
Tracks learning curves, auto-detects collapse, classifies patterns, and compares example selection methods.
Demo results included.
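
To make the "peak regression" pattern concrete, here is a minimal sketch of how such a check could work on a learning curve. This is not AdaptGauge's actual code or API: the function name `detect_peak_regression`, the `min_drop` threshold, and the dict-based curve format are all assumptions made for illustration. The only data points used are the two quoted above (64% at 4-shot, 33% at 8-shot).

```python
from typing import Dict, Optional


def detect_peak_regression(
    curve: Dict[int, float], min_drop: float = 0.10
) -> Optional[dict]:
    """Flag a curve whose best accuracy occurs before the largest shot count.

    `curve` maps shot count -> accuracy in [0, 1].
    `min_drop` is a hypothetical threshold: the minimum fall from the peak
    to the final point that counts as a regression.
    """
    shots = sorted(curve)
    if len(shots) < 2:
        return None  # a single point cannot show a regression

    # Shot count with the highest accuracy (earliest one on ties).
    peak_shot = max(shots, key=lambda k: curve[k])
    final_shot = shots[-1]
    drop = curve[peak_shot] - curve[final_shot]

    if peak_shot != final_shot and drop >= min_drop:
        return {"peak_shot": peak_shot, "final_shot": final_shot, "drop": drop}
    return None


# The peak-regression figures quoted above: 64% at 4-shot, 33% at 8-shot.
report = detect_peak_regression({4: 0.64, 8: 0.33})
print(report)
```

A curve that keeps improving as shots are added returns `None`, so the same function can be run over every model and task pair and only the regressing ones get reported.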

AdaptGauge: Detect when few-shot examples make your LLM worse
Shuntaro Okuma left a comment
Hi Product Hunt! I'm Shuntaro, and I built AdaptGauge after discovering something counterintuitive: giving LLMs more few-shot examples can make them worse. I call this "few-shot collapse", and it's backed by multiple independent research papers from 2025. But until now, there was no tool to detect it automatically before it hits production. AdaptGauge is open source (MIT) and includes...

