About

I'm a AI performance enthusiast, I cofounded oneinfer.ai. where we are striving for an optimized infra layer for AI workloads. We are building openbandwidth for leveraging instant optimized infra, my goal is accelerate the AI adoption in a massive scale. I feel, that cost and accuracy plays a major role in adoption, I have started with optimizing cost :)

Badges

Tastemaker
Tastemaker
Tastemaker 5
Tastemaker 5
Gone streaking 10
Gone streaking 10
Gone streaking
Gone streaking
View all badges

Maker History

  • oneinfer.ai
    oneinfer.aiUnified Inference Stack with multi cloud GPU orchestration
    Dec 2025
  • 🎉
    Joined Product HuntOctober 25th, 2025

Forums

New Feature: Connect your locally hosted AI models to coding copilots with a click in oneinfer-edge

We shipped a new feature in oneinfer-edge (fully open source) to connect your locally deployed model to coding copilots like codex, OpenClaw, OpenCode and kilo code etc....

No plugin. No config file. No IDE restart. You click ONEINFER, a local proxy intercepts your copilot's requests, translates the format, routes to your self hosted model, and returns the response.
Your IDE doesn't know anything changed.
The proxy handles the ugly parts, model name rewriting, response format translation, streaming, so you don't have to spend an afternoon debugging why Codex expects an OpenAI messages format and your local model returns something else.
Switch back to original models in one click, mid-session, no restart. For when you actually need it.
This is just the start. Support for more agentic harnesses and copilots is already in the works, we're expanding the list based on what the community actually uses. So please voice out what you need in the github issues.
oneinfer-edge is the proxy, the hardware compatibility scanner, the inference routing, it's all in the repo. We'd rather you read the code than take our word for it.

An intro about oneinfer-edge

We just shipped the first feature for oneinfer-edge and it's open source.
Ever copy a Hugging Face model ID, spend 2 hours setting things up, and then watch it fail because your VRAM was off by a few GB? Yeah. We've all been there.
oneinfer-edge now tells you if your machine can run any Hugging Face model before you deploy.
Paste a model ID, it scans your GPU, VRAM, OS, and serving libraries, gives you a Hardware Ready verdict and full memory breakdown (weights + KV cache + serving overhead).
No surprises at runtime.
Supports Apple Silicon (M1 to M5), NVIDIA (CUDA), AMD (ROCm), and serving libraries including Ollama, llama.cpp, SGLang, TensorRT-LLM, PyTorch and many more coming.
It tells you why something won't work, not just that it won't.
CPU support is something we're actively working through and feedback and contributions on that front are very welcome.
oneinfer-edge is part of the broader oneinfer.ai inference control plane, a platform built for teams shipping multimodal AI products at scale.
oneinfer-edge brings that same infrastructure intelligence to your local machine so self-hosting is a genuine alternative to managed cloud inference, not a debugging exercise.
We built this in the open because self-hosted AI infrastructure should belong to the community that runs it.
Star the repo: https://github.com/oneinfer/onei...
Report issues or request features: https://github.com/oneinfer/onei...
Learn more: https://oneinfer.ai/platform/one...
Drop us a star if this looks useful and PRs are wide open. We're just getting started.

Building Reserved AI Bandwidth - openbandwidth

We are excited to share that we are building Reserved AI Bandwidth - openbandwidth

This effort came from a deep frustration that AI pricing model is broken for heavy usage. For building something meaningful and useful, I was paying 200$ per month for closed source research labs. Also the position no.1 is being competed by various companies, migrating was very difficult.

View more