How are you handling LLM API costs in production? Billing alerts? Hard limits? Nothing?

AgencyHandy

3mo ago

Running agents in production is getting expensive fast — especially when something loops, retries, or a user abuses the system. Curious what others are actually doing:

Relying on provider-side billing alerts?
Hard limits set on the OpenAI/Anthropic dashboard?
Custom solution you built yourself?
Nothing yet and just hoping for the best?

I've been deep in this problem lately — actually built something around it (launching tomorrow on PH). Would love to hear real approaches first though, especially from anyone running multi-tenant SaaS where you need per-user cost control.

Replies

honestly... once you start running multi-tenant ai products, provider-side billing alerts stop being enough pretty quickly 😅

we’ve seen cases where retries/background tasks quietly push usage way higher than expected... especially when multiple models and async workflows are involved

per-user tracking and internal limits start becoming really important at that point, otherwise it’s very hard to understand where costs are actually coming from in production
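
A minimal sketch of what per-user tracking with an internal limit could look like, in Python. Everything here is an assumption for illustration: the `CostTracker` class, the `BudgetExceeded` exception, the model names, and the prices are hypothetical, and the in-memory dict would need to be a shared store (database, Redis, etc.) in a real multi-tenant setup. Token counts are assumed to come from the usage data your provider returns with each response.

```python
from collections import defaultdict

# Hypothetical prices in USD per 1M tokens; substitute your provider's real rates.
PRICES_PER_MTOK = {
    "small-model": {"input": 0.50, "output": 1.50},
    "large-model": {"input": 5.00, "output": 15.00},
}


class BudgetExceeded(Exception):
    """Raised when a user has already spent their internal limit."""


class CostTracker:
    """In-memory per-user spend tracker (illustrative, not production-ready)."""

    def __init__(self, limit_usd):
        self.limit_usd = limit_usd
        self.spent_usd = defaultdict(float)  # user_id -> total spend so far

    def check(self, user_id):
        # Call this BEFORE every LLM request, including retries and
        # background tasks, so loops cannot spend past the limit unnoticed.
        if self.spent_usd[user_id] >= self.limit_usd:
            raise BudgetExceeded(f"user {user_id} reached ${self.limit_usd:.2f}")

    def record(self, user_id, model, input_tokens, output_tokens):
        # Call this AFTER every response, using the token counts the provider
        # reports. An unknown model name raises KeyError on purpose.
        price = PRICES_PER_MTOK[model]
        cost = (input_tokens * price["input"]
                + output_tokens * price["output"]) / 1_000_000
        self.spent_usd[user_id] += cost
        return cost
```

Usage would be roughly: `tracker.check(user_id)`, make the API call, then `tracker.record(user_id, model, input_tokens, output_tokens)`. Because the check runs before each call, a single large response can still overshoot the limit slightly; a stricter version would reserve an estimated cost up front.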

curious to see what you’re launching tomorrow 👀

3mo ago

AgencyHandy

@nidaezahraaa I've already launched the library. Hopefully I'll launch the product dashboard within the next week. You're going to love it. Can you email me or contact me so I can reach out to you later about testing the product?
https://www.producthunt.com/products/baar-core

3mo ago

Built an in-app admin AI monitor dashboard to see how different models were being used and by whom, along with the resulting costs and margin.
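
For anyone wondering what the data behind a dashboard like that might look like, here is a rough Python sketch of the aggregation step. It is not the poster's implementation: the `summarize` function, the row fields (`user`, `model`, `cost_usd`), and the revenue mapping are all hypothetical, and it assumes you already log one row per LLM call with its computed cost.

```python
from collections import defaultdict


def summarize(usage_rows, revenue_usd_by_user):
    """Roll up logged LLM calls into cost per (user, model) and margin per user.

    usage_rows: iterable of dicts with hypothetical keys
        "user", "model", "cost_usd" (one row per LLM call).
    revenue_usd_by_user: dict mapping user -> revenue for the same period.
    """
    cost_by_user_model = defaultdict(float)
    cost_by_user = defaultdict(float)
    for row in usage_rows:
        cost_by_user_model[(row["user"], row["model"])] += row["cost_usd"]
        cost_by_user[row["user"]] += row["cost_usd"]

    # Margin = what the user paid minus what their LLM usage cost.
    # Users with no recorded revenue are treated as paying 0.
    margin_by_user = {
        user: revenue_usd_by_user.get(user, 0.0) - cost
        for user, cost in cost_by_user.items()
    }
    return dict(cost_by_user_model), margin_by_user
```

A negative value in `margin_by_user` flags an account whose usage costs more than it pays, which is the kind of signal a per-user view is meant to surface.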

3mo ago

AgencyHandy

Wow, can I see it?

3mo ago