Feature Update: Toggling to Self Hosted Inference for your existing agentic harnesses/copilots

Feature Update: You can now route your coding copilot or any agentic harness' traffic to self hosted models either in local or cloud in oneinfer-edge (fully open-sourced)

Currently we have given support to OpenCode, KiloCode, OpenClaw, and Codex, most of you already have used at some point.

No plugin. No config file. No IDE restart. You click ONEINFER, a local proxy intercepts your copilot's requests, translates the format, routes to your self hosted model, and returns the response.

Your IDE doesn't know anything changed.

The proxy handles the ugly parts, model name rewriting, response format translation, streaming, so you don't have to spend an afternoon debugging why Codex expects an OpenAI messages format and your local model returns something else.

Switch back to original models in one click, mid-session, no restart. For when you actually need it.

This is just the start. Support for more agentic harnesses and copilots is already in the works, we're expanding the list based on what the community actually uses. So please voice out what you need in the github issues.

oneinfer-edge is the proxy, the hardware compatibility scanner, the inference routing, it's all in the repo. We'd rather you read the code than take our word for it.

GitHub repo - Link in the comments. PS: dont want this to get suppressed 😄

Give it a try, if it saves your time and add value in anyway, please do consider giving us a star.

Which copilot should we add next? Please feel free to comment and let us know your preferences.

26 views

Feature Update: Toggling to Self Hosted Inference for your existing agentic harnesses/copilots

Replies