Launching today
Sipp
Run AI in the browser 3x faster. Zero dependencies.
Sipp is a fast open-source AI inference library for the web. Run models directly in the browser up to 3x faster with zero dependencies. Start local, then seamlessly scale to desktop, servers, or bare-metal using a single, unified API for both edge and cloud endpoints.




Hi PH! 👋 I’m Jeremy, founder of Noumena Labs // Sipp :)
We built Sipp to be a simple, unified API for running AI models locally, through providers, or via self-hosted gateways. It’s one zero-dependency library that enables fast in-browser inference, reaching up to 3x the tok/s of popular alternatives in our benchmarks.
Our goal with Sipp is to make embedded and local AI more practical without sacrificing the utility of running larger models through provider APIs. Local AI opens up a lot of powerful use cases like continual monitoring, decision-making, chat and help bots, games, and so much more. Our mission is to help developers build things that start to feel possible when tokens are essentially “free.”
Here’s what makes Sipp different today:
🚀 Blazing-Fast WebGPU: Run models right in the browser with no installs and built-in model caching support. In our benchmarks, Sipp's time-to-first-token and decode speed were 3x to 5x faster than those of other browser runtimes.
🔀 One Unified API: Write your code once. Switch or split traffic seamlessly between local browser execution, cloud gateways, and remote providers without rewriting your application.
🌍 Start Local, Scale Anywhere: While our initial focus is the browser, the same client API is exposed through Node, Rust, Python, bare metal, or your own server infrastructure. We currently support CUDA, Vulkan, and Metal backends, and plan to add more as we flesh the library out.
🧃 Zero Dependencies: 100% open-source, type-safe, and built in Rust / C++.
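To make the "switch or split traffic" idea concrete, here is a minimal TypeScript sketch of weighted routing behind a single `generate` call. Every name in it (`Backend`, `Route`, `pickRoute`, `createClient`, `stub`) is hypothetical and chosen for illustration; this is not Sipp's actual API, and the stub backends only echo the prompt instead of running a model.

```typescript
// Hypothetical sketch: these names are illustrative and are NOT Sipp's API.
type Target = "local" | "gateway" | "provider";

interface Backend {
  target: Target;
  generate(prompt: string): Promise<string>;
}

interface Route {
  backend: Backend;
  weight: number; // relative share of traffic, expected to be >= 0
}

// Weighted random choice. `rand` is injectable so the choice can be tested.
function pickRoute(routes: Route[], rand: () => number = Math.random): Route {
  const total = routes.reduce((sum, r) => sum + r.weight, 0);
  if (routes.length === 0 || total <= 0) {
    throw new Error("at least one route with a positive weight is required");
  }
  let threshold = rand() * total;
  for (const route of routes) {
    threshold -= route.weight;
    if (threshold < 0) return route;
  }
  return routes[routes.length - 1]; // guards against floating-point rounding
}

// Application code calls one `generate`; the routing table decides where it runs.
function createClient(routes: Route[], rand: () => number = Math.random) {
  return {
    generate(prompt: string): Promise<string> {
      return pickRoute(routes, rand).backend.generate(prompt);
    },
  };
}

// Stub backends stand in for real runtimes: they only echo the prompt.
const stub = (target: Target): Backend => ({
  target,
  generate: async (prompt) => `[${target}] ${prompt}`,
});

// Roughly 80% of calls go to the in-browser stub, 20% to the provider stub.
const client = createClient([
  { backend: stub("local"), weight: 0.8 },
  { backend: stub("provider"), weight: 0.2 },
]);
```

With this shape, moving all traffic to a provider means editing the routing table, and the code that calls `client.generate(...)` stays the same.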
We have a live demo right on our site where you can pick a model and watch it run 100% on-device in your browser. We also provide a benchmarking tool, so you can run your own tests and directly compare results.
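If you want to sanity-check numbers yourself, here is one common way to derive time-to-first-token and decode tok/s from token arrival times. This is a sketch under my own assumptions and not necessarily how the Sipp benchmarking tool computes its results; `computeMetrics` and `DecodeMetrics` are hypothetical names, and timestamps are assumed to be in milliseconds (for example from `performance.now()`).

```typescript
// Hypothetical helper, not part of Sipp: derives two metrics from timestamps.
interface DecodeMetrics {
  ttftMs: number; // time from request start to the first token
  decodeTokPerSec: number; // tokens after the first, per second of decode time
}

function computeMetrics(startMs: number, tokenTimesMs: number[]): DecodeMetrics {
  if (tokenTimesMs.length === 0) {
    throw new Error("no tokens were generated");
  }
  const first = tokenTimesMs[0];
  const last = tokenTimesMs[tokenTimesMs.length - 1];
  const decodeMs = last - first; // excludes the wait for the first token
  const decodeTokens = tokenTimesMs.length - 1;
  return {
    ttftMs: first - startMs,
    // Reported as 0 when only one token arrived, since there is no decode interval.
    decodeTokPerSec: decodeMs > 0 ? (decodeTokens * 1000) / decodeMs : 0,
  };
}
```

Separating the two numbers matters when comparing runtimes, because a long wait for the first token would otherwise drag down the apparent decode speed.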
I’ll be hanging out here all day! Happy to go deep on the technical details, implementation, or code. And if you have specific use cases you’d like to explore, I’d love to hear about those too.