LLMProxy
Start new thread
ztao

LLMProxy - LLMProxy

by
Seamlessly route requests to your LLM backends—whether you're using stream=false for standard JSON responses or stream=true for real-time token streaming via Server-Sent Events (SSE). LLMProxy handles both modes out of the box, with zero buffering on streams, intelligent load balancing, and OpenAI-compatible API routing. - aiyuekuang/LLMProxy

Add a comment

Replies

Best
ztao
Maker
📌
High-performance reverse proxy for LLM inference services — Like nginx for web servers, LLMProxy for LLM inference engines.