Launched this week

Cloud World Model

Simulate AWS, GCP & DigitalOcean without paying the bill

288 followers

Simulate AWS, GCP, Azure, OCI & DigitalOcean architectures to predict cost, performance, and resilience without provisioning real resources or paying a cloud bill. Built for learners practicing cloud skills and AI agents training on cloud optimization.

Free

Launch tags: Software Engineering • Developer Tools • Development

The per-step warning will help a human reading the run, but the agent only ever optimizes the scalar reward, so a warning buried in a report won't bend the policy. What worked for us was keeping a separate eval env with the 'free' dimensions like egress and cross-AZ switched back on, and scoring the trained policy only there. If its reward collapses on that env, you've caught a policy that overfit to the sim's blind spot before it ever ships. Same idea as a train/test split, just applied to the reward.
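A minimal sketch of that reward split, assuming a toy cost model. The `CostModel`, `Action`, and `reward_gap` names and every price below are hypothetical illustrations, not Cloud World Model's API or real provider rates. The training env charges nothing for egress and cross-AZ traffic; the eval env switches those prices back on, and the same actions are scored in both.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostModel:
    """Per-unit prices. A price of 0.0 means the env treats that dimension as free."""
    compute_per_hour: float
    egress_per_gb: float
    cross_az_per_gb: float


@dataclass(frozen=True)
class Action:
    """One architecture choice made by the policy (illustrative fields only)."""
    compute_hours: float
    egress_gb: float
    cross_az_gb: float
    served_requests: float


def reward(action: Action, costs: CostModel, value_per_request: float = 0.001) -> float:
    """Scalar reward: value of served traffic minus whatever the env charges."""
    cost = (
        action.compute_hours * costs.compute_per_hour
        + action.egress_gb * costs.egress_per_gb
        + action.cross_az_gb * costs.cross_az_per_gb
    )
    return action.served_requests * value_per_request - cost


# Made-up prices: training ignores egress and cross-AZ, eval charges for both.
TRAIN_COSTS = CostModel(compute_per_hour=0.10, egress_per_gb=0.0, cross_az_per_gb=0.0)
EVAL_COSTS = CostModel(compute_per_hour=0.10, egress_per_gb=0.09, cross_az_per_gb=0.01)


def reward_gap(actions, train=TRAIN_COSTS, evaluation=EVAL_COSTS):
    """Score the same actions under both cost models and return (train, eval) totals."""
    train_total = sum(reward(a, train) for a in actions)
    eval_total = sum(reward(a, evaluation) for a in actions)
    return train_total, eval_total


# A policy that leans heavily on the "free" dimensions looks great in training only.
leaky_policy = [Action(compute_hours=1.0, egress_gb=100.0, cross_az_gb=50.0,
                       served_requests=10_000.0)]
train_total, eval_total = reward_gap(leaky_policy)
print(f"train reward {train_total:.2f}, eval reward {eval_total:.2f}")
```

A large gap between the two totals is the signal described above: the policy's reward depends on dimensions the training sim does not charge for.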

5d ago

Cloud World Model

Maker

@dipankar_sarkar Really appreciate this. We plan to apply a soft penalty to the scalar reward (not just a surface warning), so the agent does feel the cost of leaning on blind spots. But you're right that a fixed penalty can be absorbed if the gain is large enough. The train/test reward split is the proper solution.

Would love to get your input on what you'd want to configure in that eval env: which dimensions to expose, whether you'd want the costs to match real provider rates or be parameterizable, and whether pass/fail should be a hard threshold or a relative drop from training reward.
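The two pass/fail options in that question can be sketched as follows. This is an illustration only: the function names, the 20% default, and the handling of non-positive training rewards are assumptions, not anything the product has committed to.

```python
def passes_hard_threshold(eval_reward: float, minimum: float) -> bool:
    """Pass if the eval-env reward clears a fixed, absolute bar."""
    return eval_reward >= minimum


def passes_relative_drop(train_reward: float, eval_reward: float,
                         max_drop: float = 0.2) -> bool:
    """Pass if the eval-env reward lost at most `max_drop` (a fraction) versus training."""
    if train_reward <= 0:
        # A relative drop is ill-defined for a non-positive baseline,
        # so fall back to requiring that eval is no worse than training.
        return eval_reward >= train_reward
    drop = (train_reward - eval_reward) / train_reward
    return drop <= max_drop
```

A hard threshold is easier to reason about when the reward scale is stable across scenarios; a relative drop travels better across scenarios with different reward scales, at the price of needing a special case when the training reward is zero or negative.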

5d ago

This hits a real pain point—our staging AWS bill quietly hit $400/month last quarter because someone left a NAT Gateway running. Which AWS services are fully simulated vs mocked? Specifically curious about Lambda cold starts, DynamoDB Streams, and S3 event notifications. If those three work accurately, this becomes a no-brainer for our CI pipeline.

5d ago

Cloud World Model

Maker

@jimmy_benhsu Great question, and that NAT Gateway story is exactly why we built this.

Here's the honest breakdown for your three:

Lambda cold starts - fully simulated. The engine injects cold-start latency on ~15% of requests per step, adds it directly to your P50/P95/P99 metrics, and can cascade into connection pool pressure if your downstream can't absorb the spikes. You can set coldStartLatency per resource to match what you observe in production.
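To make the effect on percentiles concrete, here is a rough sketch, not the engine's actual code. It assumes a constant base latency and a fixed extra delay per cold start; `simulate_step_latencies`, `percentile`, and the millisecond values are hypothetical, and `cold_start_ms` only stands in for the `coldStartLatency` setting mentioned above.

```python
import math
import random


def simulate_step_latencies(n_requests, base_ms, cold_start_ms,
                            cold_start_rate=0.15, seed=0):
    """Return per-request latencies for one step, adding a cold-start delay
    to roughly `cold_start_rate` of the requests."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    latencies = []
    for _ in range(n_requests):
        latency = base_ms
        if rng.random() < cold_start_rate:
            latency += cold_start_ms
        latencies.append(latency)
    return latencies


def percentile(values, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


latencies = simulate_step_latencies(10_000, base_ms=20.0, cold_start_ms=800.0)
for pct in (50, 95, 99):
    print(f"P{pct}: {percentile(latencies, pct):.0f} ms")
```

With about 15% of requests cold, the median stays at the base latency while P95 and P99 both land on the cold-start path, which is why tail metrics move long before the median does.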

DynamoDB Streams and S3 event notifications - modeled, not deeply simulated. DynamoDB itself is simulated (request-based billing, base latency, saturation behavior), but Streams-specific things like shard throughput, iterator age lag, or event delivery delays aren't broken out as independent simulation axes yet. Same for S3 notifications - S3 contributes storage cost and latency to the request path, but we don't currently model notification fan-out failure or delivery timing.

For your CI use case: where we're most useful today is catching resource saturation + cost surprises (the NAT Gateway scenario, overprovisioned RDS, Lambda fleet cold-start cascades) before they hit staging. The event-driven plumbing between services (Streams → Lambda triggers, S3 → SQS → worker) is on the roadmap; happy to share more detail on where that lands if it's a blocker for you.

I also plan to create a "what's modeled so far" page, with a mechanism to keep it up to date.

5d ago

Cloud World Model

Maker

@jimmy_benhsu FYI: the "what we model so far" page is up. Let me know if you think it needs more detail. Thanks!! https://cloudworldmodel.ai/provider-coverage

5d ago

@mathsociety Thanks for the honest breakdown — that distinction between "fully simulated" and "modeled" is really useful for planning.

For our CI pipeline, the sweet spot for Cloud World Model would probably be resource saturation + cost surprise detection (exactly what you highlighted), while we keep a lightweight real-AWS smoke test path for anything that depends on event delivery timing or shard behavior.

Quick question on the hybrid boundary: in your experience, do teams typically run CWM for the bulk of integration tests and then gate merges with a small real-AWS canary — or do they run both in parallel and diff the results? Curious if you've seen a pattern that minimizes the "false confidence from simulation" risk.

Also checked the provider coverage page — clean reference. Would love to see an "accuracy matrix" column there showing which metrics are measured vs extrapolated, so teams can self-select where to trust simulation vs where to fall back to real infra.

4d ago

Cloud World Model

Maker

@jimmy_benhsu On the hybrid boundary: the more common pattern is CWM for bulk + real canary at the gate, not parallel diffing. The diff approach sounds appealing but adds a lot of noise. Simulation and real infra rarely produce byte-identical metrics even when both are "correct," so you end up chasing variance instead of actual regressions. The canary works better as a smoke fence: "does the thing actually boot and serve traffic" rather than "do the numbers match."

The false-confidence risk is real, and the mitigation we've seen work is being explicit about what CWM covers vs. what it doesn’t. Your accuracy matrix idea makes sense. The provider coverage page today shows what's simulated, but not how confidently. An "accuracy type" column, measured vs. extrapolated, would let teams self-select where to trust CWM and where to keep real infra in the loop. I’ll work towards getting it added.

Thanks!!

4d ago

The chaos engineering part caught my eye, injecting a DB crash and getting a resilience score back seems really useful for catching weak spots before prod. Curious how close the cost estimates land to a real AWS bill in practice. Congrats on shipping!

6d ago

Cloud World Model

Maker

@i_sanjay_gautam Thank you, Sanjay. Yes, chaos engineering goes a long way towards determining what happens when something crashes. We believe our cost estimates are 95 to 98 percent accurate against a real AWS bill. Our accuracy benchmark describes it here: https://www.cloudworldmodel.ai/accuracy

6d ago

the cost simulation is the part i need most. i blew $400 on an RDS instance i spun up for "testing" and forgot about for 11 days. nobody warned me.

how granular does the cost projection go? if i model a 3-tier app does it tell me i'm about to pay for an over-provisioned NAT gateway, or just give me a total bill estimate?

the value of cost tools breaks for me at the line item level. that's where i actually make decisions.

6d ago

Cloud World Model

Maker

@thenameisarian Hi, this is exactly the scenario we're built for, so the $400 RDS instance story hits home. The engine prices every resource in your architecture independently, so a 3-tier app isn't one blended number: it's the sum of its parts, and you can see which part is bleeding money. We're also a simulation engine, meant to be run before you deploy the architecture in most cases. If you change the architecture, simulate again. Thanks for the question.
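As a sketch of what "the sum of its parts" means at the line-item level: the `Resource` type, the resource names, and the hourly rates below are made up for illustration and are not real provider prices or the engine's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    name: str
    kind: str
    hourly_usd: float  # illustrative rate, not a real price


def line_items(resources, hours=730.0):
    """Price each resource independently and sort the most expensive first."""
    items = [(r.name, r.kind, r.hourly_usd * hours) for r in resources]
    return sorted(items, key=lambda item: item[2], reverse=True)


def total(items):
    """The bill is just the sum of the independent line items."""
    return sum(cost for _, _, cost in items)


three_tier = [
    Resource("web", "compute", 0.05),
    Resource("app-db", "database", 0.20),
    Resource("nat", "nat_gateway", 0.08),
]

items = line_items(three_tier)
for name, kind, cost in items:
    print(f"{name:8s} {kind:12s} ${cost:8.2f}/month")
print(f"{'total':21s} ${total(items):8.2f}/month")
```

Sorting by cost puts the line item that "bleeds money" at the top, which is the view the question asks for rather than a single blended estimate.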

6d ago

@mathsociety this is the answer. the "simulate again after every architecture change" workflow is what i don't have anywhere else. terraform plan tells me what's changing, not what it'll cost.

added to my list to try this week. one more quick one: does the simulation cover spot instance pricing or just on-demand? the cost-prediction wins i actually need are usually in the gap between "what i provisioned" and "what i'm paying for at 2am during a scale event."

cheers for the thoughtful answer.

6d ago

Cloud World Model

Maker

@thenameisarian Pricing is on-demand today; spot/preemptible isn't modeled yet, so I won't pretend it is.

But the second half of what you said is the part we actually nail: the "what am I paying at 2am during a scale event" gap. Cost isn't a static number off your provisioned config, it's recomputed every simulation step as autoscaling adds and removes instances. So when traffic spikes and the fleet scales from 2 to 9 nodes, you watch the cost/hour climb in real time, and Aurora Serverless v2 cost tracks live ACU rather than a flat rate. That's exactly the gap between "what I provisioned" and "what the scale event actually costs me" modeled step by step, just at on-demand rates.
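A toy version of that step-by-step recomputation, under stated assumptions: the scaling rule (nodes needed = ceil(load / capacity), clamped between 2 and 9), the traffic series, and the $0.10 per node-hour rate are all invented for the sketch and do not describe the engine's real autoscaling model.

```python
import math


def autoscale(requests_per_step, capacity_per_node, min_nodes=2, max_nodes=9):
    """Node count per step: enough nodes for the load, clamped to the fleet limits."""
    nodes = []
    for load in requests_per_step:
        needed = math.ceil(load / capacity_per_node)
        nodes.append(max(min_nodes, min(max_nodes, needed)))
    return nodes


def cost_per_step(node_counts, hourly_rate_per_node):
    """Cost/hour recomputed at every step from the live node count."""
    return [n * hourly_rate_per_node for n in node_counts]


traffic = [100, 150, 400, 900, 850, 300, 120]   # requests/s per step (made up)
nodes = autoscale(traffic, capacity_per_node=100)
costs = cost_per_step(nodes, hourly_rate_per_node=0.10)

for step, (n, c) in enumerate(zip(nodes, costs)):
    print(f"step {step}: {n} nodes, ${c:.2f}/hour")
```

An estimate taken from the provisioned config alone would stay at the 2-node figure for the whole run; the per-step series is what exposes the cost of the scale event itself.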

Spot pricing is a fair ask though, and it's the natural next layer. Comments like these are the reason we launched on Product Hunt: to get a sense of what customers want. We'll add it to the roadmap.

6d ago

This is highly relevant for developers trying to architect and test multi-cloud environments without burning budget early on. How accurate is the simulation when replicating complex networking constraints or IAM policies between AWS and GCP? Great launch!

6d ago

Cloud World Model

Maker

@daniel_adsuar_prieto Thanks, Daniel, great question. I appreciate the kind words about the launch.

For multi-cloud networking constraints (latency between regions, cross-cloud egress costs, routing behavior), the simulation is quite accurate: we model provider-specific network topologies, zone-aware placement, and inter-cloud traffic costs. That's core to what we do.

IAM policies are a different story. We don't simulate them; we focus on the core infrastructure. IAM is a gap worth thinking about for testing least-privilege architectures before deploying, and something we'd consider if there is demand for it. Thanks!!

6d ago

@mathsociety thanks for the clarification, Kevin. That makes sense regarding the scope of the simulation. Focusing on the infrastructure core is a solid approach, and the IAM gap is definitely a valid point for future iterations. Keep up the great work!

6d ago

Congrats on the launch! 🚀

Simulating cloud architecture before provisioning real resources is a very useful idea, especially for cost-heavy experiments and failure testing.

I'm curious: how close are the cost and performance predictions to real-world cloud bills after deployment? Do you provide any confidence score or comparison against actual usage data over time?

6d ago

Cloud World Model

Maker

@prashant_patil14 Thanks, Prashant, appreciate the kind words. Cost predictions are grounded in published provider pricing (updated when drift is detected against live pricing pages) and validated against benchmarks. Our accuracy scores sit between 96–98% across AWS, GCP, Azure, OCI, and DigitalOcean.

We don't currently ingest actual usage data post-deployment for comparison, so there's no feedback loop that tightens predictions over time from real bills. If we had more resources, we could also run our own ML cloud tests and add to our own data to get the accuracy scores even higher. We don't do that today.

Here are our current accuracy numbers. I think they would need to be externally disproven. https://cloudworldmodel.ai/accuracy

6d ago

Useful angle for teams that want to teach cloud tradeoffs without handing out real cloud accounts. I’d be interested in how close the cost/perf model stays to provider changes over time, since drift is usually where these simulators get hard to trust.

5d ago

Cloud World Model

Maker

@jimmy_lee12 Drift is an important concern. We use a few different CI checks.

Continuous validation: Every code change runs a pricing check in CI that fails the build if our cost constants drift from the reference rates we've sourced from each provider's pricing pages (AWS, GCP, Azure, OCI, DigitalOcean).

Weekly accuracy floor: A scheduled job benchmarks all five providers against real-world reference data every week and fires an alert (via PostHog - Shoutout to @PostHog ) if any provider's overall simulation accuracy drops below 95%. All five are currently well above that - AWS ~97%, GCP ~98%, Azure ~98%, OCI ~97%, DigitalOcean ~98%.

It's a combination of CI tricks and checks that keep things accurate.
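The two checks described above could look roughly like this. It is a sketch under assumptions: the function names and the pricing keys are hypothetical, the rates are invented, and the alerting call is left out entirely because the thread does not describe how it is wired up. Only the 95% floor and the per-provider accuracy figures come from the comment.

```python
def pricing_drift(constants, reference, tolerance=0.0):
    """Return {key: (ours, reference)} for every rate that is missing or has
    drifted from the reference by more than `tolerance` (a relative fraction)."""
    drifted = {}
    for key, ref in reference.items():
        ours = constants.get(key)
        if ours is None or abs(ours - ref) > tolerance * abs(ref):
            drifted[key] = (ours, ref)
    return drifted


def providers_below_floor(accuracy, floor=0.95):
    """Providers whose overall simulation accuracy fell below the floor."""
    return sorted(name for name, score in accuracy.items() if score < floor)


# Hypothetical keys and made-up rates, for illustration only.
constants = {"aws.nat_gateway.hourly": 0.05, "aws.rds.small.hourly": 0.20}
reference = {"aws.nat_gateway.hourly": 0.05, "aws.rds.small.hourly": 0.25}

drifted = pricing_drift(constants, reference)
if drifted:
    # In CI this is where the build would fail, e.g. raise SystemExit(1).
    print("pricing drift:", drifted)

# Approximate accuracy figures quoted in the comment above.
weekly = {"AWS": 0.97, "GCP": 0.98, "Azure": 0.98, "OCI": 0.97, "DigitalOcean": 0.98}
print("below floor:", providers_below_floor(weekly))
```

The per-commit check guards the inputs (price constants), while the weekly job guards the output (end-to-end accuracy), so a provider-side change that slips past one can still be caught by the other.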

5d ago

Forum Threads

p/cloud-world-model • 3d ago

What We Learned from Launching Cloud World Model on Product Hunt

We finished #5 Product of the Day. Here's what three days of 60+ comments actually taught us.

We launched Cloud World Model on Product Hunt last week. The pitch: simulate AWS, GCP, Azure, OCI, and DigitalOcean infrastructure without provisioning real resources. You describe an architecture (compute, databases, load balancers, serverless functions), and the simulator models latency curves, CPU saturation, autoscaling behavior, failure propagation, and cost. No cloud bill.

We expected interest from learners. The comments told a different story.

p/cloud-world-model • 6d ago

Why Did We Include DigitalOcean in Cloud World Model?

Well, it's a pretty simple, benign reason.

I'm ex-OCI and my former boss is now in charge of DigitalOcean.

p/cloud-world-model • 6d ago

You Can Even Create Apps That Use Cloud World Model APIs

Here's a silly app that I created using Vercel called Disaster Day. I planned to submit it to a hackathon, but I'm not sure yet.

Here is the basic premise of the game.

Design an AWS architecture on a strict hourly budget, then defend it against a randomized barrage of escalating disasters. The Cloud World Model simulates every outcome. How long can your stack stay up?
