Wade Allen

GitHub - Route every prompt to the cheapest model that can handle it

by
Stop overpaying for AI APIs. This pydantic-ai + FastAPI template classifies each prompt by complexity and routes it to the cheapest model that can handle it — Claude for hard tasks, Groq for simple ones. Real-time cost tracking included. Works with Claude, GPT-4o, GPT-4o-mini, and Groq out of the box. Drop into any existing project in under 10 minutes. Full source, test suite, and setup walkthrough included.

Add a comment

Replies

Best
Wade Allen
Maker
📌
hey built this after watching our API costs climb month over month. the core insight is that complexity classification with structured pydantic outputs is the missing piece most routers skip. once you have a reliable category the routing is trivial. happy to answer questions about the implementation or help anyone integrate it into their stack.