
Edgee
The AI Gateway that TL;DR tokens
221 followers
Edgee compresses prompts before they reach LLM providers and reduces token costs by up to 50%. Same code, fewer tokens, lower bills.
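For developers, the drop-in pattern implied by "same code" is the standard gateway setup: keep your existing provider SDK and point its base URL at the gateway, which compresses the prompt before forwarding it upstream. The sketch below assumes Edgee exposes an OpenAI-compatible endpoint; the URL is a hypothetical placeholder, so check Edgee's docs for the actual integration.

# Hypothetical sketch: routing an existing OpenAI client through a
# compressing gateway. Only the base_url changes; application code stays
# the same. "gateway.edgee.example" is a placeholder, not a real endpoint.
from openai import OpenAI

client = OpenAI(
    base_url="https://gateway.edgee.example/v1",  # placeholder gateway URL
    api_key="YOUR_PROVIDER_KEY",
)

# Unchanged application code: the gateway compresses the prompt in transit.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Plan a 10-day trip to Japan..."}],
)
print(response.choices[0].message.content)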





Congrats on the launch! Will definitely be following this project closely. I've always thought there should be a way to provide prompts to LLMs more efficiently, especially when the latest models consume a lot of them for complex work. Hopefully this will eventually result in lower usage rates and higher limits.
Would like to see benchmarks across different model providers and prompt types. If the compression holds under real production loads, this could become default infra in most LLM stacks.
Impressed by the edge-native architecture with 100+ PoPs and the token compression approach.
I noticed Edgee is built with Claude Code. For developers using AI coding agents (Claude Code, Cursor, etc.) that make heavy API calls during development, does Edgee support integration at the agent workflow level? Specifically, can we route AI agent requests through Edgee to compress tool call contexts and reduce token consumption during iterative coding sessions?
Thanks for sharing! Exciting to hear about the Claude Code-specific token compressor. Looking forward to seeing the gains in iterative coding sessions.
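(For anyone with the same question: assuming Edgee exposes an Anthropic-compatible endpoint, the usual way to route an AI coding agent through a gateway is to override the SDK's base URL, or the ANTHROPIC_BASE_URL environment variable that agent CLIs such as Claude Code honor. The endpoint below is a hypothetical placeholder, not a confirmed Edgee API.)

# Hypothetical sketch: pointing the Anthropic SDK at a compressing gateway.
# Agent tools that respect ANTHROPIC_BASE_URL can be redirected the same way.
import os
from anthropic import Anthropic

os.environ["ANTHROPIC_BASE_URL"] = "https://gateway.edgee.example"  # placeholder

# The SDK reads ANTHROPIC_API_KEY from the environment if not passed here.
client = Anthropic(base_url=os.environ["ANTHROPIC_BASE_URL"])
response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Refactor this function..."}],
)
print(response.content[0].text)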
The idea is very interesting. But how does it work?
For example, I have a travel AI — essentially a wrapper around ChatGPT and Gemini. Some of the prompts are huge. How would you reduce the number of tokens? Would you compress my prompts? But that could affect quality.
Could you suggest where something could be replaced with free or cheaper tools? But then you would need to know our product as well as we do… How do you do that?
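(Edgee has not published its compression method, so as a rough illustration only: a gateway of this kind sits between your app and the provider, rewrites the prompt into fewer tokens, and forwards it. The compress_prompt function below is a hypothetical stand-in; real compressors use far more aggressive, model-aware techniques while trying to preserve meaning, which is exactly where the quality concern above comes in.)

# Conceptual sketch of a compressing proxy. NOT Edgee's actual algorithm;
# compress_prompt is a trivial placeholder for illustration.
import re

def compress_prompt(prompt: str) -> str:
    # Toy example: collapse repeated whitespace. Production compressors
    # also drop boilerplate and low-information tokens.
    return re.sub(r"\s+", " ", prompt).strip()

def forward_to_provider(compressed: str) -> str:
    # Placeholder for the real upstream LLM call.
    return f"(provider response to a {len(compressed)}-char prompt)"

huge_prompt = """
    You are a travel assistant.     Plan a 10-day itinerary...
    (imagine thousands of tokens of instructions and context here)
"""
print(forward_to_provider(compress_prompt(huge_prompt)))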
TimeZoneNinja
This looks amazing, @gilles_raymond! Reducing token costs by 50% is a game changer for anyone building agents for a big audience 🤯 Question: how does the compression impact latency for real-time applications? Congrats on the launch!
@sgiraudie Since our architecture runs at the edge, there is no noticeable effect on latency.