About

I’m Pavan Kumar Rapata, Lead GTM at oneinfer.ai. I help AI teams adopt unified GPU inference without managing infrastructure. Coming from an Oracle DBA background, I understand real production challenges, reliability, performance, and cost. My focus is helping teams scale models to production in a simple, predictable way.

Badges

Tastemaker
Tastemaker
Tastemaker 10
Tastemaker 10
Tastemaker 5
Tastemaker 5
Gone streaking 10
Gone streaking 10
View all badges

Maker History

  • oneinfer.ai
    oneinfer.aiUnified Inference Stack with multi cloud GPU orchestration
    Dec 2025
  • 🎉
    Joined Product HuntNovember 26th, 2025

Forums

Feature Update: Toggling to Self Hosted Inference for your existing agentic harnesses/copilots

Feature Update: You can now route your coding copilot or any agentic harness' traffic to self hosted models either in local or cloud in oneinfer-edge (fully open-sourced)
Currently we have given support to OpenCode, KiloCode, OpenClaw, and Codex, most of you already have used at some point.
No plugin. No config file. No IDE restart. You click ONEINFER, a local proxy intercepts your copilot's requests, translates the format, routes to your self hosted model, and returns the response.
Your IDE doesn't know anything changed.
The proxy handles the ugly parts, model name rewriting, response format translation, streaming, so you don't have to spend an afternoon debugging why Codex expects an OpenAI messages format and your local model returns something else.
Switch back to original models in one click, mid-session, no restart. For when you actually need it.
This is just the start. Support for more agentic harnesses and copilots is already in the works, we're expanding the list based on what the community actually uses. So please voice out what you need in the github issues.
oneinfer-edge is the proxy, the hardware compatibility scanner, the inference routing, it's all in the repo. We'd rather you read the code than take our word for it.
GitHub repo - Link in the comments. PS: dont want this to get suppressed
Give it a try, if it saves your time and add value in anyway, please do consider giving us a star.
Which copilot should we add next? Please feel free to comment and let us know your preferences.

Today, we just shipped AI inference routing in oneinfer-edge.

Not a load balancer. Not a failover script. A routing layer that understands the difference between your local machine, your cloud GPU, and your third party API and moves each request to the right one automatically.

Three directions. One routing layer.

Cloud hosting is now live in OneInfer Edge.

Same app you use to run models locally. You pick a model from the catalog, select a GPU, and deploy. Edge checks if the model fits the hardware before anything spins up, same way it does for local. No failed deployments, no wasted GPU hours.

Your cloud deployments sit next to your local ones in the same interface. One place to track costs, monitor usage, and swap GPUs.

View more