
Edgee
The AI Gateway that TL;DR tokens
221 followers
Edgee compresses prompts before they reach LLM providers and reduces token costs by up to 50%. Same code, fewer tokens, lower bills.
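"Same code" refers to the usual drop-in gateway pattern: you point your existing client at the gateway's base URL and change nothing else. A minimal sketch of that pattern, assuming an OpenAI-compatible endpoint (the URL, key, and model below are illustrative placeholders, not Edgee's documented API):

from openai import OpenAI

# Hypothetical drop-in usage: swap the provider's base URL for the
# gateway's and keep the rest of the calling code unchanged.
client = OpenAI(
    base_url="https://gateway.example.com/v1",  # placeholder gateway endpoint
    api_key="YOUR_GATEWAY_KEY",                 # placeholder credential
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # requests pass through to the underlying provider
    messages=[{"role": "user", "content": "Summarize this long document ..."}],
)
print(response.choices[0].message.content)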

Token compression at the gateway level is a smart approach. I've been watching my AI API costs climb across multiple projects, and this is exactly the kind of infra that makes shipping AI features viable without stressing about the bill.
This would be game-changing for our margins. Does the compression work for both prompts and completions?
@hajar_lamjadab2 Yes, it does! And it's even more efficient as the context window grows larger.
Cloudthread
Cool idea! Do you get transparency into how the prompt was trimmed/manipulated, so you can ensure nothing was missed?
@daniele_packard We have information that allows us to understand how our model performs, yes. However, we do not keep the original prompt, for obvious privacy reasons. To validate the compressed prompt, we run a similarity analysis across several metrics (ROUGE, BERTScore, cosine similarity, ...), and we let users define a threshold that guarantees semantic similarity.
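For illustration, a minimal sketch of what such a semantic-similarity gate can look like, using sentence-transformers embeddings and cosine similarity; the model choice and threshold value are assumptions for the example, not Edgee's actual implementation:

from sentence_transformers import SentenceTransformer, util

# Illustrative only: accept a compressed prompt only if its embedding
# stays close enough to the original's.
model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model choice

def passes_threshold(original: str, compressed: str, threshold: float = 0.90) -> bool:
    # Encode both texts and compare with cosine similarity; `threshold`
    # plays the role of the user-defined semantic-similarity floor.
    embeddings = model.encode([original, compressed], convert_to_tensor=True)
    score = util.cos_sim(embeddings[0], embeddings[1]).item()
    return score >= threshold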
Connectiviteam
Congrats!
@stanmassueras an honour to have your support. At @Edgee, we loooove @ElevenLabs 💪
@stanmassueras Thank you! We really appreciate the support 🙏
If you end up giving Edgee a try, we’d love to hear your feedback.
Congrats on the launch!
LLM costs are going crazy here, I'll definitely give it a try.
You'll be welcome, @angezanetti. We decided to build Edgee after talking with 50+ CTOs who were starting to struggle with token costs. Really exciting challenge, the team is sooo excited!
nao
Hey, this is interesting! I was wondering if the prompt optimisations you're doing are deterministic. The first layer of cost improvement is caching: in a long conversation with an LLM you need to cache, so the prompt compaction needs to be deterministic and stable no matter what happens.
Second point: how do you handle the different model providers' API interfaces? Do you support SSE? Did you reimplement your own layer between the Edgee SDK and the LLM providers? There are so many edge cases with each provider when it comes to streaming + tools + reasoning tokens, etc.
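To make the determinism concern above concrete: a small sketch of why caching depends on it, assuming a hypothetical compress() stand-in (not Edgee's API). If compress() maps the same input to different outputs across calls, the derived cache key changes and provider-side prompt caching never hits:

import hashlib

def compress(prompt: str) -> str:
    # Hypothetical stand-in for the gateway's compression step; here it
    # just normalizes whitespace, which is trivially deterministic.
    return " ".join(prompt.split())

def cache_key(prompt: str, model: str) -> str:
    # Cache lookups key on the compressed prompt, so any non-determinism
    # in compress() silently defeats the cache.
    return hashlib.sha256(f"{model}:{compress(prompt)}".encode()).hexdigest()

# Identical (or equivalently compressed) requests must produce identical keys:
assert cache_key("Hello   world", "gpt-4o") == cache_key("Hello world", "gpt-4o")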
Love this! Congrats @sachamorard - great onboarding experience, managed to get going in under 5 minutes ❤️. Curious whether and how we can control the compression level and adjust it per endpoint or use case, as I imagine there's a quality trade-off?
@sachamorard Super clear. Thanks!