What's the biggest API cost difference you've found between LLMs?
Hey guys,
I built Test AI Models because I kept seeing wild cost variations for similar-quality outputs across ChatGPT, Grok, DeepSeek and others.
The biggest gap I've found so far is for a customer support use case (images below):
ChatGPT: $6,654 for 1 million queries
Claude: $4,947 for 1 million queries
DeepSeek: $79 for 1 million queries
Grok: $245 for 1 million queries
To me the output quality is similar, yet ChatGPT is 84x more expensive than DeepSeek!
At 1 million queries you can save $6,575 just by choosing the right model.
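For anyone who wants to check the math on their own numbers, here is a rough sketch of how the ratio and savings work out. It assumes the ChatGPT and Claude figures are in thousands of dollars (6,654 and 4,947), which is the reading the 84x ratio implies. The `costs` dict and the `compare` helper are just names made up for this example.

```python
# Cost in USD per 1 million queries, taken from the figures in the post.
# Assumption: the ChatGPT and Claude values are thousands of dollars.
costs = {
    "ChatGPT": 6654.0,
    "Claude": 4947.0,
    "DeepSeek": 79.0,
    "Grok": 245.0,
}


def compare(costs):
    """Return the cheapest and priciest models, their cost ratio, and the savings."""
    cheapest = min(costs, key=costs.get)
    priciest = max(costs, key=costs.get)
    ratio = costs[priciest] / costs[cheapest]
    savings = costs[priciest] - costs[cheapest]
    return cheapest, priciest, ratio, savings


cheapest, priciest, ratio, savings = compare(costs)
print(
    f"{priciest} is about {ratio:.0f}x the cost of {cheapest}; "
    f"switching saves ${savings:,.0f} per 1M queries"
)
```

Swap in your own per-million costs to see the gap for your use case. This only compares price, so it says nothing about whether the outputs are actually equivalent in quality.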
What's the biggest savings you've discovered?
Whether you tested on Test AI Models or elsewhere, I'm curious what cost gaps people are finding in the wild.
Bonus: Did you actually switch models after discovering the difference, or stick with your original choice for other reasons?
You can test your case here: www.testaimodels.com



