Launching today

Edgee
The AI Gateway that TL;DR tokens
97 followers
Edgee compresses prompts before they reach LLM providers and reduces token costs by up to 50%. Same code, fewer tokens, lower bills.
Typeform
As an indie hacker, I'm always afraid of receiving an expensive bill because my AI feature suddenly saw a lot of usage. Anything that helps reduce costs and gives me insight into what's going on is welcome.
It's a no-brainer to use it from day 1 and see value right away.
Congrats @sachamorard team for building this💪
Thanks a lot @picsoung for the support 🙌
And totally agree! That "unexpected AI bill" fear is real, especially for indie hackers and small teams where one spike can ruin the month 😅
That's exactly why we built Edgee: so you can get cost visibility + optimizations (like token compression) from day one, before things get out of control.
Really appreciate you hunting and sharing this. Excited to hear what you build with it! 🚀
@sachamorard @picsoung We've heard this from pretty much every CTO and CEO we've talked to in Europe and the US. The end-of-month bill can be a real shock! 💸
Plezi
Congrats on the launch!
We're stuck on how to attribute LLM costs back to specific features. Does Edgee tag requests so we can track cost per feature?
Hello @benoit_collet, thanks for the interest!
Good question. It's a pain we've experienced ourselves: cost was only analyzable per API key, which gets painful fast, since you don't want to maintain 50 different keys just for cost categorization.
We built the "tags" feature, which lets you (via API headers or via our SDKs) define categories automatically. Tags show up in your analytics dashboard so you can see exactly where you're spending the most!
You can learn more in our documentation: https://www.edgee.ai/docs/integrations/langchain#tags
The page I've linked is part of our LangChain SDK; it goes into more depth on what tags really are.
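To make the idea concrete, here's a minimal sketch of what tagging a request for per-feature cost attribution could look like. The header name `x-edgee-tags` and the endpoint URL are my assumptions, not Edgee's documented API; check the docs link above for the real names.

```python
# Hypothetical sketch: attach cost-attribution tags to a chat request via a
# header. Header name and URL are assumptions; see the Edgee docs for the
# actual integration details.
import json

def build_tagged_request(prompt: str, feature: str, team: str) -> dict:
    """Build an OpenAI-style chat request carrying cost-attribution tags."""
    return {
        "url": "https://api.edgee.ai/v1/chat/completions",  # assumed endpoint
        "headers": {
            "Authorization": "Bearer $EDGEE_API_KEY",
            # Tags end up in the analytics dashboard, grouped per feature/team.
            "x-edgee-tags": json.dumps({"feature": feature, "team": team}),
        },
        "body": {
            "model": "gpt-4o-mini",
            "messages": [{"role": "user", "content": prompt}],
        },
    }

req = build_tagged_request("Summarize this ticket", feature="support-bot", team="cx")
print(req["headers"]["x-edgee-tags"])
```

With something like this, a dashboard can break spend down by `feature` instead of by API key.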
Inyo
As a product guy in the agentic platform space, I’m definitely going to keep a close eye on this one. Good luck with the launch!
@yannick_mthy The agentic space is exactly where we’re seeing things get interesting (and complex) fast, especially with growing context sizes, tool calls, and multi-model orchestration.
Would love to hear how you're currently handling cost + routing on the agent side. Always keen to learn from teams building in this space. Thx
PhotoRoom
Congrats on the launch! Will follow closely, as the topic is complex and moves fast!
@olivier_lemarie1 Thank you! Indeed, a very exciting and challenging topic, with so many things to explore and improve :D We'll soon publish a series of blog posts going through all the details and research around compression, so stay tuned!
Hey Product Hunt 👋
I’m Sacha, co-founder of Edgee. Thanks for checking us out!
We built Edgee because we kept seeing the same thing everywhere:
AI costs are going crazy!
LLMs are easy to try, but once you ship them in production, costs explode and reliability becomes a mess.
Most teams start with direct calls to OpenAI or Anthropic… or simply using a coding assistant... then quickly end up dealing with:
Unpredictable token spend
Multiple provider APIs
Outages / rate limits
Security & privacy constraints
And no real observability across teams
Edgee is an AI Gateway built to reduce LLM costs and simplify production inference.
It gives you a single OpenAI-compatible API across providers, plus a layer of intelligence around inference:
✅ Token compression to remove redundant tokens and cut costs, with no semantic loss
✅ Routing & fallbacks across providers
✅ Observability + cost tracking you can trust
✅ Privacy & security controls (ZDR, BYOK...)
✅ Support for public + private models
✅ Edge Tools 🚀
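Since the gateway is OpenAI-compatible, pointing existing code at it should just be a base-URL change. A minimal sketch using only the standard library (the gateway URL is my assumption, not a documented endpoint):

```python
# Sketch under assumptions: "https://api.edgee.ai/v1" is illustrative.
# The point is that an OpenAI-compatible gateway needs no code changes
# beyond swapping the base URL and API key.
import json
import urllib.request

def chat_request(base_url: str, api_key: str, model: str, prompt: str):
    """Build an OpenAI-style chat completion request aimed at the gateway."""
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",  # Edgee key, not a provider key
            "Content-Type": "application/json",
        },
    )

req = chat_request("https://api.edgee.ai/v1", "sk-...", "gpt-4o-mini", "Hi")
print(req.full_url)  # https://api.edgee.ai/v1/chat/completions
```

Routing, fallbacks, and compression then happen behind that single endpoint instead of in your application code.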
We're launching early and working closely with a small group of design partners, so feedback (even brutal feedback 😅) would mean a lot.
Happy to answer any questions here, and I’d love to hear how you’re handling LLM infra in production today!
Sacha
Batch
@sachamorard Token costs are definitely becoming a real problem once prompts get large (RAG, tools, agents…).
Curious how you handle compression without breaking output quality, especially for structured outputs?
@sachamorard @virtualgoodz Yeah, alignment is a big issue with any prompt transformation!
In general, tracking performance across a mix of semantic-preservation metrics (BERTScore, cosine similarity, ROUGE) and making sure they don't degrade below a certain threshold is a good proxy.
For structured output, things are trickier: the compression shouldn't be "generative", in the sense of re-expressing content with other tokens. Instead it's deterministic, a more compact re-encoding of the structure: crushing whitespace, factorizing repetitions, and so on.
Glad to discuss this further if need be :D
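To illustrate the "deterministic, non-generative" idea (this is a toy example, not Edgee's actual algorithm): compact JSON re-encoding plus factoring repeated keys into a header row removes tokens while staying exactly reversible, so nothing is paraphrased.

```python
# Toy illustration of non-generative structural compression: re-encode a list
# of records compactly, factoring the repeated keys out into one header row.
# The transform is deterministic and losslessly reversible.
import json

def compress_records(records: list[dict]) -> str:
    """[{k: v, ...}, ...] with shared keys -> '[keys, row, row, ...]'."""
    keys = sorted(records[0])
    rows = [[r[k] for k in keys] for r in records]
    # separators=(",", ":") also crushes the whitespace between tokens.
    return json.dumps([keys] + rows, separators=(",", ":"))

def decompress_records(blob: str) -> list[dict]:
    keys, *rows = json.loads(blob)
    return [dict(zip(keys, row)) for row in rows]

data = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
packed = compress_records(data)
assert decompress_records(packed) == data  # exact round trip, no semantic loss
print(len(json.dumps(data)), "->", len(packed))  # fewer characters, same info
```

Because the mapping is a pure re-encoding, quality metrics aren't even needed here; they matter for the generative side of compression, where free-form prose is rewritten.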
We're experimenting with cheaper models to control costs, but quality suffers.
Can Edgee help us stay on expensive models but reduce token usage instead?