Launched this week

Context Gateway
Make Claude Code faster and cheaper without losing context
248 followers
Context Gateway cuts latency and token spend for Claude Code / Codex / OpenClaw by compressing tool output while preserving important context. Setup takes less than a minute. Quality-of-life features include instant context compaction and a configurable spend limit for Claude Code.

Congrats on the launch! Curious how the compression handles tool outputs that contain mixed content, structured data alongside verbose logs, for example. Does it preserve the structured parts reliably while trimming the noise, or is it more of a blunt summarization?
Context Gateway
@joao_seabra Thanks for the question!
Right now we don’t explicitly differentiate between structured and unstructured data and the compression runs across the tool outputs as they are. Even with that simple approach we’re seeing pretty significant gains in accuracy and reduction of cost and latency.
That being said, you’re touching on something we’re actively working on. Our next major update will start treating structured and unstructured parts differently, handling things like JSON/schema fields atomically while being more aggressive with verbose logs.
Expect improvements here soon.
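For readers curious what that structure-aware approach could look like, here is a minimal hypothetical sketch (not Context Gateway's actual pipeline): JSON spans found in tool output are kept verbatim when they parse, while surrounding prose/log lines are trimmed. The `keep_ratio` default of 0.5 mirrors the fixed compression ratio the team mentions elsewhere in the thread; the line-truncation "compression" is a naive stand-in for a real summarizer.

```python
import json
import re


def _trim(prose: str, keep_ratio: float) -> str:
    """Naive stand-in for real compression: keep the first N% of lines."""
    lines = prose.splitlines(keepends=True)
    keep = max(1, int(len(lines) * keep_ratio)) if lines else 0
    return "".join(lines[:keep])


def compress_tool_output(text: str, keep_ratio: float = 0.5) -> str:
    """Keep JSON blocks intact; trim the unstructured text around them.

    Hypothetical sketch. The non-greedy regex is naive and will mis-split
    nested JSON; a production version would need a real brace-matching scanner.
    """
    parts = []
    pos = 0
    for match in re.finditer(r"\{.*?\}", text, flags=re.DOTALL):
        parts.append(_trim(text[pos:match.start()], keep_ratio))
        candidate = match.group(0)
        try:
            json.loads(candidate)
            parts.append(candidate)  # structured: preserved atomically
        except ValueError:
            parts.append(_trim(candidate, keep_ratio))  # not valid JSON: trim
        pos = match.end()
    parts.append(_trim(text[pos:], keep_ratio))
    return "".join(parts)
```

On a noisy tool result like four log lines followed by `{"a": 1}`, this keeps the JSON untouched and drops the back half of the log, which is the shape of behavior the question above is asking about.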
Really smart approach to a problem I hit constantly - agent tool calls returning massive outputs that bloat context and burn tokens. The instant compaction feature is clutch too; waiting 3 min for /compact in Claude Code always kills my flow. Curious how the compression models handle code-heavy outputs vs prose - do you see different compression ratios?
Context Gateway
Hey @emad_ibrahim , thank you! The compression ratio is currently fixed at 0.5 - we'll make it auto-tunable in the future to account for the varying "density" of different inputs, but empirically the fixed ratio already works well!
The spend cap and Slack notifications are almost more valuable than the compression itself. Running Claude Code on a large codebase without any spending guardrails is genuinely stressful. You check back after 20 minutes and it's burned through $40 on a rabbit hole.
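The spend-cap idea above is easy to picture as a small accumulator around each model call. A hypothetical sketch (not Context Gateway's implementation; the per-token prices are placeholder assumptions, and the Slack hook is reduced to an alert list):

```python
from dataclasses import dataclass, field


@dataclass
class SpendGuard:
    """Hypothetical per-session spend cap; prices are placeholder assumptions."""
    limit_usd: float
    spent_usd: float = 0.0
    alerts: list = field(default_factory=list)

    def record(self, prompt_tokens: int, completion_tokens: int,
               in_price: float = 3e-6, out_price: float = 15e-6) -> bool:
        """Add one call's cost; return False once the cap is reached."""
        self.spent_usd += prompt_tokens * in_price + completion_tokens * out_price
        if self.spent_usd >= self.limit_usd:
            # In a real gateway this would stop the session and ping Slack.
            self.alerts.append(f"Spend cap ${self.limit_usd:.2f} reached")
            return False
        return True
```

The caller checks the return value after every tool/model call, which is what turns the "check back after 20 minutes and it's burned $40" scenario into a hard stop plus a notification.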
Is the compression lossy in practice? I've seen context window summaries drop important details (like specific variable names or error messages) that then cause the agent to hallucinate fixes. How do you handle preserving the details that actually matter vs. trimming the boilerplate?
Told
The token compression angle is the right problem to attack — once devs start hitting context limits mid-session, the cognitive cost of managing that manually kills flow. Curious how the compression handles cases where the 'noise' in tool output turns out to be context a later step actually needed — that edge case is where these systems tend to break trust with developers. The Claude Code integration is smart timing given how fast that tool's adoption is moving right now. Would be interested to see what the latency reduction looks like in practice on a typical 30-minute coding session.
BlocPad - Project & Team Workspace
Oh man the instant compaction alone is worth it. I've been hitting /compact in Claude Code and just staring at the screen for like 3 minutes every time my context gets bloated. The spend cap + Slack notifications combo is also super practical, I've definitely had sessions where I looked away for a bit and came back to a surprisingly large bill lol
Context Gateway
Hey @mihir_kanzariya , completely agree! As a matter of fact, we just built what we wanted to use ourselves :))