LLMProxy - LLMProxy

by•5mo ago

Seamlessly route requests to your LLM backends—whether you're using stream=false for standard JSON responses or stream=true for real-time token streaming via Server-Sent Events (SSE). LLMProxy handles both modes out of the box, with zero buffering on streams, intelligent load balancing, and OpenAI-compatible API routing. - aiyuekuang/LLMProxy

Replies

Best

Maker

📌

High-performance reverse proxy for LLM inference services — Like nginx for web servers, LLMProxy for LLM inference engines.

Report

5mo ago