Hyperion

LLM API gateway, microsecond latency, 250× faster vs LiteLLM

1 follower

LLM API gateway, microsecond latency, 250× faster vs LiteLLM

1 follower

Hyperion is a high-performance LLM gateway with ~5µs overhead (~250× lower than LiteLLM). It handles smart routing, semantic + exact-match caching, rate limiting, budget control, PII redaction, and failover through a single OpenAI-compatible API. Built in Go with Redis, Qdrant, and ClickHouse.

Free

Launch tags:Software Engineering•Developer Tools•Artificial Intelligence

Launch Team / Built With

Viktor.comAn AI coworker that actually does the work

Promoted

Maker

📌

Hey everyone, Hyperion team here.

We built Hyperion after repeatedly running into the same problem: routing, caching, retries, and cost control all living inside application code.

Hyperion pulls that into a single gateway with ~5µs overhead (~250× lower than LiteLLM), so it can sit in the request path without slowing things down.

It includes smart routing, semantic + exact-match caching, rate limiting, budget enforcement, PII redaction, failover, and real-time analytics.

Would love feedback from anyone working on similar LLM infra

Report

4mo ago

Reviews