Launched this week

RunInfra
Describe the AI model you need and get an optimized AI
178 followers
Tell RunInfra what you need and it builds the production API. No dashboards. No config. Describe any open-source model or full app in plain language. We optimize it for real: we benchmark GPUs, quantize the model, and generate custom CUDA kernels with our Forge agent. It runs faster and cheaper than standard hosting. Build voice (speech → AI → speech), doc search, vision, or model routing, all in one chat. Pay per million tokens. Scale to zero. Run managed or on your own GPUs.

Tried it with a small vision model this morning and the speed jump over my usual setup was noticeable right away, plus the per-token pricing is way easier to stomach than the GPU bills I was getting before.
Spent a weekend chatting with RunInfra to spin up a voice-to-text pipeline and the custom CUDA kernel step actually beat the latency I was getting on my old setup. Pricing by the million tokens with scale to zero is a nice fit for the random bursts of traffic I get from indie clients.
Vivaldi
Tried it and hit a few errors during planning. In the end it deployed something, but it hung on a simple "wazzup" prompt with no recovery. Nice UI, though.
Oh, and there is no account removal action available.
RightNow AI
I think the natural language approach makes this platform stand out. I have always preferred explaining what I want instead of navigating multiple dashboards and configuration screens.
RightNow AI
This looks cool! I have started playing around with inference optimisations! I'm wondering whether there is a way to learn and contribute at the same time.
Nicee