Launched this week

General Compute

Launched this week

AI models that run on an inference cloud optimized for speed

529 followers

AI models that run on an inference cloud optimized for speed

529 followers

Visit website

AI Infrastructure Tools

GPUs are built for training, not inference. General Compute is an inference cloud running on ASICs — purpose-built alternatives to Nvidia silicon designed specifically for inference. We deliver 5x faster responses and higher per-user throughput for latency-sensitive workloads like coding and voice agents. Our OpenAI-compatible API means you swap your base URL, keep your existing workflows, and run real-time AI on infrastructure built for the job.

Free Options

Launch tags:API•Software Engineering•Alpha

Launch Team

Anam — The face for your agent. Interactive avatars via API.

The face for your agent. Interactive avatars via API.

Promoted

Kipps AI

Big congrats on the launch!

How about the time to set up? Is it able to run on CPU?

Report

1d ago

Neutron

Maker

@nishit_chittora It's just an API! So you can run an HTTP request from a CPU :)

Report

1d ago

Congrats on the launch. Your onboarding workflow is great. I missed clear models and pricing information upfront, and when I got onboarded I saw that you offer three models at somewhat premium pricing.

This leads me to my question: what is your value prop beyond latency? Because if you're competing on price, OpenRouter is still going to get you.

Model

Context

Input / 1M

Output / 1M

DeepSeek V3.2

Reasoning

deepseek-v3.2

32k

$3.00

$4.50

DeepSeek V3.1

Reasoning

deepseek-v3.1

128k

$3.00

$4.50

MiniMax M2.7

minimax-m2.7

160k

$0.40

$2.34

Report

1d ago

*latency and speed are apparently the value props?

Report

1d ago

Neutron

Maker

@robert_douglass These are what they are calling "premium tokens" - faster and more expensive. You don't need to compete on price if you can be much faster (proven by Cerebras). But now that you mention it we are lowering prices to be in line with open router. Should be updated later today!

Same price just 5x faster!

Report

1d ago

Congrats on the launch! Efficiently stashing heterogenous ASICs behind a homogeneous API is a challenging and exciting endeavor :) Especially curious about the technologies powering elastic scaling with request volume and bursts. Would love to see a characterization of that in maybe a future blog post as it would certainly be useful to many service designers!

Report

1d ago

Neutron

Maker

@tejas_harith We'll do a blog post about it at some point, and we're always hiring smart people that find these kinds of endeavors interesting!

Report

1d ago

The sequential call problem for agents is real. Latency adds up quickly when you chain together 50 or more LLM calls. I'm curious how the ASIC stack deals with variable prompt lengths, as that's often where GPU inference becomes unpredictable as well.

Report

19h ago

OpenClaw can sign itself up? That's wild. Finally someone building for a world where agents run themselves. 👏

Report

1d ago

Neutron

Maker

@timothy_oluwatipin2 Thank you!

Report

1d ago

PS Agent sign up is brilliant! I'm going to study that.

Report

1d ago

Neutron

Maker

@robert_douglass Appreciate it :)

Report

1d ago

ZeroTrusted.ai

Studied full stack development but never really got deep into the infrastructure side of things. Always assumed GPUs handled everything AI related. The idea that inference needs its own optimised hardware makes sense when you think about it. Congratulations on the launch.

Report

1d ago

Neutron

Maker

@sidraarifali Thank you!

Report

1d ago

1 2 3