Launching today
AdaptGauge

Detect when few-shot examples make your LLM worse

AdaptGauge detects when adding few-shot examples degrades LLM performance instead of improving it. Testing 8 models across 4 tasks revealed three failure patterns:

• Peak regression: 64% at 4-shot, then a crash to 33% at 8-shot
• Ranking reversal: the best zero-shot model dropped to third place once examples were added
• Selection collapse: TF-IDF-selected examples took a model from 50%+ down to 35%

AdaptGauge tracks learning curves, auto-detects collapse, classifies failure patterns, and compares example selection methods. Demo results are included.
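To make the peak regression pattern concrete, here is a minimal sketch of how such a check could work on a learning curve. It is not AdaptGauge's actual code or API: the function name `detect_peak_regression`, the `min_drop` threshold, and the dictionary layout are all assumptions made for illustration. The only numbers taken from the results above are the 4-shot and 8-shot scores.

```python
from typing import Dict, Optional


def detect_peak_regression(
    curve: Dict[int, float], min_drop: float = 0.10
) -> Optional[dict]:
    """Flag a learning curve whose score peaks early and then falls.

    `curve` maps shot count -> score in [0, 1]. The metric itself is
    whatever the task uses. `min_drop` is a hypothetical threshold,
    not a value taken from AdaptGauge.
    """
    shots = sorted(curve)
    # Shot count with the highest score (earliest one wins on ties).
    peak_shot = max(shots, key=lambda k: curve[k])
    last_shot = shots[-1]
    drop = curve[peak_shot] - curve[last_shot]

    # A regression means the peak came before the largest shot count
    # and the score fell by at least `min_drop` afterwards.
    if peak_shot != last_shot and drop >= min_drop:
        return {
            "peak_shot": peak_shot,
            "peak_score": curve[peak_shot],
            "final_shot": last_shot,
            "final_score": curve[last_shot],
            "drop": drop,
        }
    return None


# The two points reported above: 64% at 4-shot, 33% at 8-shot.
report = detect_peak_regression({4: 0.64, 8: 0.33})
if report is not None:
    print(
        f"Peak regression: {report['peak_score']:.0%} at "
        f"{report['peak_shot']}-shot, {report['final_score']:.0%} at "
        f"{report['final_shot']}-shot"
    )
```

A curve that keeps improving as shots are added returns `None`, so the same function can be run over every model and task pair to list only the regressions.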
Free