I created Canvas Cloud AI in 2025 to facilitate hands-on learning in the cloud. Although I did not launch it on Product Hunt, thousands of users have utilized it to generate cloud architectures for learning. Over the past year, while speaking with developers, engineers, and architects about Canvas Cloud AI, I consistently heard two main concerns: one, that cloud services can be expensive, and two, that connecting to cloud providers can create bottlenecks for hands-on learning.
After hearing this feedback repeatedly, I began to explore the possibility of simulating the cloud without relying on connections to cloud providers. This idea eventually led to the creation of Cloud World Model. I have now decided to launch Cloud World Model on Product Hunt, something I did not do with our earlier product, Canvas Cloud AI.
Cloud World Model
Hi everyone! I'm Kevin Brown, one of the makers of Cloud World Model.
Cloud World Model lets you model AWS, GCP, Azure, OCI, and DigitalOcean architectures and instantly see how they behave (CPU, error rates, throughput, autoscaling, failure recovery, and cost) without provisioning a single real resource.
A few things we're proud of:
A capacity-aware engine that models real per-provider performance profiles
Chaos engineering: inject zone outages, DB crashes, and network partitions, then get a resilience score
A multi-cloud explorer that compares provider combos on cost, latency, and vendor lock-in
A full RL training API so AI agents can learn cloud optimization in a safe, cost-free environment
Beginner mode with plain-English AI explanations and an interactive tutorial
Whether you're learning cloud skills or training agents to optimize infrastructure, I'd like to hear about any of the following in the comments:
How do you typically test cloud architecture changes before putting them in production or any environment?
Do you think a mechanism to simulate a cloud architecture change would be useful?
Any experiences with cloud cost comparisons?
Cloud World Model
BTW, you can try us headless using any of the popular tools like @Claude Code, @Codex, or @Grok Build. The API is fully documented.
Here is even a starter prompt you can try:
@mathsociety Once you've simulated an architecture and you're happy with it, can you export that to Terraform or Pulumi? Right now it sounds like the sim and the actual deploy are two separate worlds. That's where I'd get stuck.
Cloud World Model
@whetlan Thank you. You are the second person to ask about Terraform/Pulumi export. The way I initially viewed it, the simulator is just a pure simulator. The agent knows more about the customer's actual environment than we do. The agent gives us the architecture, we simulate it, and we give the agent a response; the agent can keep simulating with us until it thinks it has the right answer. The agent would then create the Terraform/Pulumi code for the architecture. However, if enough people think we should provide that feature, then it's something worth looking into.
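The simulate-until-satisfied loop described above can be sketched roughly as follows. This is only an illustration: `toy_simulate` is a made-up stand-in with invented latency and cost formulas, not the Cloud World Model API, and a real agent would search over far more than a replica count.

```python
def toy_simulate(replicas):
    """Hypothetical stand-in for a simulator call; the formulas are invented."""
    return {
        "replicas": replicas,
        "p95_latency_ms": 400 / replicas,
        "cost_per_hour_usd": 0.10 * replicas,
    }


def agent_search(target_latency_ms, max_replicas=20):
    """Propose architectures one by one until the simulated result meets the target."""
    for replicas in range(1, max_replicas + 1):
        result = toy_simulate(replicas)
        if result["p95_latency_ms"] <= target_latency_ms:
            return result  # the agent would generate IaC from this configuration
    return None  # no candidate met the target within the search budget


chosen = agent_search(target_latency_ms=100)
print(chosen)
```

The point of the sketch is the division of labor: the simulator only answers "what happens with this architecture", and the agent decides when to stop and what to write as Terraform or Pulumi.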
@mathsociety Answering Q1+Q2 from the small-team side: I deploy bots to Fly and honestly my "test" is deploy-and-pray, no real staging. It bit me last week, a .env got baked into the Docker image and silently overrode my prod secrets, no error thrown, just wrong behavior. A simulator that flagged "this config will shadow your prod env" before deploy would've saved me hours. And the crash-injection prompt is the right instinct, the failures that hurt are the silent ones, not the loud ones.
Cloud World Model
@david_marko Ouch. Yes, it happens. At present, we simulate pure architecture without the application code. But your instinct about silent failures is spot on and maps directly to what we do: once your service is running, Cloud World Model lets you inject those quiet failure modes. Would love to know if that's useful for your Fly.io setup. We've tried to support many cloud providers, not just the big ones.
@mathsociety Great job! Chaos engineering with resilience scoring before touching real infra is a great fit for teams that want to test failure modes without blowing a staging budget. How closely is the per-provider performance model calibrated against real-world AWS/GCP latency and throughput under load — actual benchmark data, or more of a directional approximation aimed at learning?
Strong launch. The RL training API is the interesting edge. If an agent learns an infra optimization in simulation, I’d want the handoff receipt before deploy: resources changed, env/secret assumptions, failure case tested, and rollback path.
Do you expect agents to export a plan into Terraform/Pulumi, or stay inside the simulator?
Cloud World Model
@blah_mad My hope is actually that more people use the API than the UI. Today, the RL agent stays inside the simulator. It learns optimization policies (scaling thresholds, resource sizing, failure response), and you get the episode history, reward trajectory, and the final recommended configuration as structured output. What you don't get yet is an auto-generated Terraform/Pulumi diff you can apply directly. At present, the agent is responsible for taking what it's learned from our simulation engine and creating the IaC code. Generating that ourselves is also possibly a natural next step for us. Thanks for the comment!
That makes sense. The episode history + final config is probably the receipt I’d start with. If the agent writes IaC today, I’d keep the review around the diff: source sim run, changed resources, blast radius, rollback.
Do you plan to make that structured output stable enough for other tools to consume?
Cloud World Model
@blah_mad Yes, the API is OpenAPI-specified with a generated TypeScript SDK, so the output is a documented contract, not ad-hoc JSON. Episode history and final config are stable today. Your diff shape (source run, changed resources, blast radius, rollback) maps well onto what's already there. Blast radius isn't a named field yet but the failure propagation data exists. Happy to share the spec if you want to build on top of it.
Yes, worth sharing. The part I’d look for is the run object other tools can trust: input architecture, sim id, failure data, recommendation, and what changed since the last run.
Is that exposed as one resource today, or stitched from a few endpoints?
The RL training API is the part that grabs me - an agent is only as good as the sim it learns in. The capacity-aware engine modeling "real per-provider performance profiles" is where that lives or dies: are those profiles grounded in published benchmarks and vendor specs, or in measured telemetry, and how often do you refresh them? If the sim cost/latency drifts from the actual providers, an agent will happily optimize for the model instead of the cloud, so how do you validate fidelity against a real deployment?
Cloud World Model
@hi_i_am_mimo Performance profiles are grounded in published vendor specs and documented benchmarks, not measured telemetry. Fidelity is validated via an accuracy benchmark that runs against all five providers (AWS ~97%, GCP ~98%, Azure ~98%, OCI ~96%, DigitalOcean ~98%) with a hard CI gate. If we had more resources, we could also add to the data by running our own ML tests and get the accuracy even higher than what we are reporting. We also check pricing weekly.
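For readers wondering what an accuracy benchmark with a hard gate could look like, here is a minimal sketch. It assumes accuracy is defined as 1 minus the mean absolute percentage error against reference values; that definition, the 0.95 threshold, the provider names, and all sample numbers are illustrative assumptions, not the project's actual benchmark.

```python
def accuracy(simulated, reference):
    """Accuracy as 1 minus mean absolute percentage error (an assumed definition)."""
    errors = [abs(s - r) / abs(r) for s, r in zip(simulated, reference)]
    return 1 - sum(errors) / len(errors)


def gate(scores, threshold):
    """Return the providers whose accuracy falls below the threshold."""
    return sorted(p for p, score in scores.items() if score < threshold)


# Made-up sample values for illustration only.
reference = {"provider-a": [100.0, 200.0, 50.0], "provider-b": [80.0, 120.0, 40.0]}
simulated = {"provider-a": [98.0, 204.0, 50.0], "provider-b": [72.0, 132.0, 40.0]}

scores = {p: accuracy(simulated[p], reference[p]) for p in reference}
failing = gate(scores, threshold=0.95)
print(scores)
print("failing providers:", failing)
```

In a CI pipeline, a non-empty `failing` list would typically be turned into a non-zero exit code so the build stops.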
The chaos engineering part caught my eye, injecting a DB crash and getting a resilience score back seems really useful for catching weak spots before prod. Curious how close the cost estimates land to a real AWS bill in practice. Congrats on shipping!
Cloud World Model
@i_sanjay_gautam Thank you, Sanjay. Yes, chaos engineering goes a long way towards determining what happens when something crashes. We believe our cost estimates are 95 to 98 percent accurate relative to a real AWS bill. Our accuracy benchmark is described here: https://www.cloudworldmodel.ai/accuracy
the cost simulation is the part i need most. i blew $400 on an RDS instance i spun up for "testing" and forgot about for 11 days. nobody warned me.
how granular does the cost projection go? if i model a 3-tier app does it tell me i'm about to pay for an over-provisioned NAT gateway, or just give me a total bill estimate?
the value of cost tools breaks for me at the line item level. that's where i actually make decisions.
Cloud World Model
@thenameisarian Hi, this is the scenario we are built for, so the $400 RDS instance story hits home. The engine prices every resource in your architecture independently, so a 3-tier app isn't one blended number; it's the sum of its parts, and you can see which part is bleeding money. We are also a simulation engine, intended to be run, most of the time, before you even deploy the architecture. If you change the architecture, simulate again. Thanks for the question.
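To make the line-item idea concrete, here is a minimal sketch of per-resource pricing for a 3-tier app. The resource names and hourly prices are made-up placeholders, not real provider pricing and not Cloud World Model's data model.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # common approximation for monthly estimates


@dataclass
class Resource:
    name: str
    hourly_usd: float  # hypothetical on-demand price, not real provider pricing
    count: int = 1

    def monthly_usd(self) -> float:
        return self.hourly_usd * self.count * HOURS_PER_MONTH


def line_items(resources):
    """Return (name, monthly cost) pairs sorted from most to least expensive."""
    items = [(r.name, round(r.monthly_usd(), 2)) for r in resources]
    return sorted(items, key=lambda item: item[1], reverse=True)


three_tier = [
    Resource("web-servers", hourly_usd=0.05, count=2),
    Resource("app-servers", hourly_usd=0.10, count=2),
    Resource("database", hourly_usd=0.20),
    Resource("nat-gateway", hourly_usd=0.045),
]

items = line_items(three_tier)
total = round(sum(cost for _, cost in items), 2)
for name, cost in items:
    print(f"{name:12s} ${cost:8.2f}/month")
print(f"{'total':12s} ${total:8.2f}/month")
```

Sorting by cost puts the most expensive line item first, which is where a decision about over-provisioning would usually start.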
@mathsociety this is the answer. the "simulate again after every architecture change" workflow is what i don't have anywhere else. terraform plan tells me what's changing, not what it'll cost.
added to my list to try this week. one more quick one: does the simulation cover spot instance pricing or just on-demand? the cost-prediction wins i actually need are usually in the gap between "what i provisioned" and "what i'm paying for at 2am during a scale event."
cheers for the thoughtful answer.
Cloud World Model
@thenameisarian Pricing is on-demand today; spot/preemptible isn't modeled yet, so I won't pretend it is.
But the second half of what you said is the part we actually nail: the "what am I paying at 2am during a scale event" gap. Cost isn't a static number off your provisioned config; it's recomputed every simulation step as autoscaling adds and removes instances. So when traffic spikes and the fleet scales from 2 to 9 nodes, you watch the cost/hour climb in real time, and Aurora Serverless v2 cost tracks live ACU rather than a flat rate. That's exactly the gap between "what I provisioned" and "what the scale event actually costs me", modeled step by step, just at on-demand rates.
Spot pricing is a fair ask though, and it's the natural next layer. Comments like these are the reason we launched on Product Hunt: to get a sense of what customers want. We'll add it to the roadmap.
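The per-step cost recomputation described above can be illustrated with a small sketch. The node trace, the $0.10 per node-hour price, and the 60-second step are hypothetical values chosen for the example; the actual engine's step size and pricing inputs are not stated in this thread.

```python
def simulate_cost(node_counts, hourly_usd_per_node, step_seconds=60):
    """Recompute cost at every step from the current fleet size.

    node_counts is the fleet size at each simulation step (a hypothetical trace).
    Returns (cost_per_hour_by_step, total_cost_usd).
    """
    cost_per_hour = [n * hourly_usd_per_node for n in node_counts]
    step_hours = step_seconds / 3600
    total = sum(rate * step_hours for rate in cost_per_hour)
    return cost_per_hour, total


# Hypothetical scale event: the fleet grows from 2 to 9 nodes, then shrinks.
trace = [2, 2, 4, 7, 9, 9, 6, 3, 2]
rates, total = simulate_cost(trace, hourly_usd_per_node=0.10)

# Cost if the fleet had stayed at its provisioned size for the same period.
static_total = trace[0] * 0.10 * len(trace) * (60 / 3600)

print("peak cost/hour:", round(max(rates), 2))
print("cost with autoscaling:", round(total, 4))
print("cost at provisioned size:", round(static_total, 4))
```

The difference between the last two numbers is the "what I provisioned" versus "what the scale event costs" gap, here at a single flat on-demand rate.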
Cloud World Model
@daniel_adsuar_prieto Thanks Daniel, great question. I appreciate the kind words regarding the launch.
For multi-cloud networking constraints (latency between regions, cross-cloud egress costs, routing behavior), the simulation is quite accurate: we model provider-specific network topologies, zone-aware placement, and inter-cloud traffic costs. That's core to what we do.
IAM policies are a different story. We don't simulate them; we focus on the core infrastructure. Testing least-privilege architectures before deploying is a gap worth thinking about, and something worth considering if there is demand for it. Thanks!
Congrats on the launch! 🚀
Simulating cloud architecture before provisioning real resources is a very useful idea, especially for cost-heavy experiments and failure testing.
I'm curious: how close are the cost and performance predictions to real-world cloud bills after deployment? Do you provide any confidence score or comparison against actual usage data over time?
Cloud World Model
@prashant_patil14 Thanks Prashant, appreciate the kind words. Cost predictions are grounded in published provider pricing (updated when drift is detected against live pricing pages) and validated against benchmarks. Our accuracy scores sit between 96% and 98% across AWS, GCP, Azure, OCI, and DigitalOcean.
We don't currently ingest actual usage data post-deployment for comparison, so there's no feedback loop that tightens predictions over time from real bills. If we had more resources, we could also run our own ML cloud tests and add to our own data to get the accuracy scores even higher. We don't do that today.
Here are our current accuracy numbers. I think they would need to be externally disproven. https://cloudworldmodel.ai/accuracy