
Edgee
The AI Gateway that TL;DR tokens
221 followers
Edgee compresses prompts before they reach LLM providers and reduces token costs by up to 50%. Same code, fewer tokens, lower bills.
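For developers, the drop-in pattern implied by "same code" is the standard gateway setup: keep your existing provider SDK and point its base URL at the gateway, which compresses the prompt before forwarding it upstream. The sketch below assumes Edgee exposes an OpenAI-compatible endpoint; the URL is a hypothetical placeholder, so check Edgee's docs for the actual integration.

# Hypothetical sketch: routing an existing OpenAI client through a
# compressing gateway. Only the base_url changes; application code stays
# the same. "gateway.edgee.example" is a placeholder, not a real endpoint.
from openai import OpenAI

client = OpenAI(
    base_url="https://gateway.edgee.example/v1",  # placeholder gateway URL
    api_key="YOUR_PROVIDER_KEY",
)

# Unchanged application code: the gateway compresses the prompt in transit.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Plan a 10-day trip to Japan..."}],
)
print(response.choices[0].message.content)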





Congrats on the launch! Will definitely be following this project closely. I've always thought there should be a way to provide prompts to LLMs more efficiently, especially when the latest models consume a lot of them for complex work. Hopefully this will eventually result in lower usage rates and higher limits.
Would like to see benchmarks across different model providers and prompt types. If the compression holds under real production loads, this could become default infra in most LLM stacks.
Impressed by the edge-native architecture with 100+ PoPs and the token compression approach.
I noticed Edgee is built with Claude Code. For developers using AI coding agents (Claude Code, Cursor, etc.) that make heavy API calls during development, does Edgee support integration at the agent workflow level? Specifically, can we route AI agent requests through Edgee to compress tool call contexts and reduce token consumption during iterative coding sessions?
Thanks for sharing! Exciting to hear about the Claude Code-specific token compressor. Looking forward to seeing the gains in iterative coding sessions.
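(For anyone with the same question: assuming Edgee exposes an Anthropic-compatible endpoint, the usual way to route an AI coding agent through a gateway is to override the SDK's base URL, or the ANTHROPIC_BASE_URL environment variable that agent CLIs such as Claude Code honor. The endpoint below is a hypothetical placeholder, not a confirmed Edgee API.)

# Hypothetical sketch: pointing the Anthropic SDK at a compressing gateway.
# Agent tools that respect ANTHROPIC_BASE_URL can be redirected the same way.
import os
from anthropic import Anthropic

os.environ["ANTHROPIC_BASE_URL"] = "https://gateway.edgee.example"  # placeholder

# The SDK reads ANTHROPIC_API_KEY from the environment if not passed here.
client = Anthropic(base_url=os.environ["ANTHROPIC_BASE_URL"])
response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Refactor this function..."}],
)
print(response.content[0].text)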
The idea is very interesting. But how does it work?
For example, I have a travel AI — essentially a wrapper around ChatGPT and Gemini. Some of the prompts are huge. How would you reduce the number of tokens? Would you compress my prompts? But that could affect quality.
Could you suggest where something could be replaced with free or cheaper tools? But then you would need to know our product as well as we do… How do you do that?
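(Edgee has not published its compression method, so as a rough illustration only: a gateway of this kind sits between your app and the provider, rewrites the prompt into fewer tokens, and forwards it. The compress_prompt function below is a hypothetical stand-in; real compressors use far more aggressive, model-aware techniques while trying to preserve meaning, which is exactly where the quality concern above comes in.)

# Conceptual sketch of a compressing proxy. NOT Edgee's actual algorithm;
# compress_prompt is a trivial placeholder for illustration.
import re

def compress_prompt(prompt: str) -> str:
    # Toy example: collapse repeated whitespace. Production compressors
    # also drop boilerplate and low-information tokens.
    return re.sub(r"\s+", " ", prompt).strip()

def forward_to_provider(compressed: str) -> str:
    # Placeholder for the real upstream LLM call.
    return f"(provider response to a {len(compressed)}-char prompt)"

huge_prompt = """
    You are a travel assistant.     Plan a 10-day itinerary...
    (imagine thousands of tokens of instructions and context here)
"""
print(forward_to_provider(compress_prompt(huge_prompt)))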
TimeZoneNinja
This looks amazing, @gilles_raymond! Reducing token costs by 50% is a game changer for anyone building agents for a big audience 🤯 Question: how does the compression impact latency for real-time applications? Congrats on the launch!
@sgiraudie Since our architecture runs at the edge, there is no noticeable effect on latency.