The 500MB Club Challenge - 2 CPUs. 500MB RAM. One real Pi. Prove your language.

Most backend benchmarks run on generous cloud boxes nobody deploys to. This one doesn't: a write-heavy telemetry service (GPS/battery/accel data) has to run — LB, 3 API replicas, storage — inside 2 CPUs and 500MB RAM total, on a real Raspberry Pi. Open, language-agnostic: fork it, implement the API, ship a Docker image, submit a PR. Scored on efficiency, capacity, tail latency, resilience, stability. Sponsored by Ardan Labs, JetBrains, GopherCon Latam. Deadline: July 26.

Replies

Hey! I built this because every backend benchmark I could find ran on hardware nobody actually ships services to: 4, 8, 16 vCPUs, gigabytes of RAM. That's not what edge or cost-constrained deployments look like, and it hides exactly the stuff I care about as a backend engineer: how a runtime behaves when it doesn't have room to hide its mistakes.

So I put the whole stack (load balancer, 3 API replicas, storage) on a single Raspberry Pi with a hard 2 CPU / 500MB ceiling, and picked a domain that's realistically annoying: ingesting GPS/battery/accelerometer data from thousands of couriers in real time, the kind of write-heavy, tail-latency-sensitive workload that sits behind any delivery or mobility app's live map.

Anyone can submit in any language: Go, Rust, Zig, Node, Python, Java, whatever you want to prove a point with. The repo has the OpenAPI contract, the load scripts, and the full scoring breakdown.

Huge thanks to Ardan Labs, JetBrains, and GopherCon Latam for sponsoring prizes. Deadline is July 26. If you don't have time to build something, an upvote or a share to someone who'd enjoy this goes a long way. Happy to answer anything about the scoring, the hardware setup, or why I picked this domain.

love how it forces you to actually respect the constraints of a pi instead of pretending your laptop is prod. the scoring on tail latency and stability is a nice touch too.

running the full stack on a real pi is the kind of constraint that actually exposes the sloppy stuff in your code. love that it's open and language agnostic, makes it feel like a real community challenge rather than a marketing stunt.

finally a benchmark that actually reflects what production looks like on tiny boxes. the rpi constraint makes it way more honest than synthetic cloud tests, and i love that the scoring includes resilience not just raw speed.

Genuinely curious how scoring works when language runtimes vary so wildly. Like would a Rust implementation get penalized because it leaves headroom that a Go or Node setup just can't, or is the scoring pure black-box against the four metrics?

Rust should score higher than Go on efficiency and tail latency, because a Rust implementation genuinely can do better there on this workload. Nothing in the scoring penalizes Rust for leaving headroom. The curves saturate at 1.0: you can't earn more than 1.0 in a metric, even by being 10x better than the best point. That saturation is deliberate. Without it, a submission that's absurdly fast on one trivial metric would dominate the weighted average and mask being mediocre elsewhere. Winning requires being good across multiple dimensions.
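To make the saturation point concrete, here is a minimal sketch in Python. It is not the challenge's actual scoring code: the linear curve, the `worst`/`best` thresholds, the equal weights, and the function names are all assumptions made up for illustration. Only the two ideas from the reply above are real: each metric is capped at 1.0, and the final score is a weighted average.

```python
def saturating_score(value, worst, best):
    """Map a raw measurement onto [0.0, 1.0] with a linear curve (assumed shape).

    Works for both higher-is-better (best > worst) and lower-is-better
    (best < worst) metrics. Anything beyond `best` is clamped to 1.0.
    """
    if best == worst:
        raise ValueError("best and worst must differ")
    fraction = (value - worst) / (best - worst)
    return max(0.0, min(1.0, fraction))


def weighted_total(scores, weights):
    """Weighted average of per-metric scores, each already in [0.0, 1.0]."""
    total_weight = sum(weights.values())
    return sum(scores[name] * weights[name] for name in weights) / total_weight


# Hypothetical equal weights over the five scored areas named in the post.
WEIGHTS = {
    "efficiency": 1.0,
    "capacity": 1.0,
    "tail_latency": 1.0,
    "resilience": 1.0,
    "stability": 1.0,
}

# Hypothetical latency thresholds in ms: 200 scores 0.0, 20 scores 1.0.
# Being 10x better than the best point (2 ms) still earns exactly 1.0.
capped = saturating_score(2, worst=200, best=20)

# A one-metric specialist versus a submission that is good everywhere.
specialist = {name: 0.4 for name in WEIGHTS}
specialist["tail_latency"] = capped
all_rounder = {name: 0.8 for name in WEIGHTS}

specialist_total = weighted_total(specialist, WEIGHTS)
all_rounder_total = weighted_total(all_rounder, WEIGHTS)
```

Under these made-up numbers the all-rounder ends up ahead of the specialist, because the cap stops one outstanding metric from carrying the average.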

You can read how scoring works in the repo's scoring breakdown.