React Brai - The easiest way to run optmized SLMs on web

Cloud AI is a costly privacy nightmare. react-brai is a lightweight WebGPU React hook that runs optimized SLMs (like Llama-3B) directly on the user's device. By keeping compute 100% local, you get: Absolute Privacy: Data never leaves the browser (GDPR/HIPAA compliant). Zero Server Costs: Let the user's GPU handle inference. Just npm install, add the useLocalAI hook, and drop secure, offline intelligence into your web app!

Hey Product Hunt! 👋 I built react-brai because I was tired of the trade-offs between adding cool AI features to my React apps and dealing with the absolute nightmare of API costs, latency, and enterprise data privacy policies. If you are building tools for healthcare, finance, or just handling sensitive client emails, you can't just pipe that data to OpenAI. So, I wrapped the WebGPU/MLC engine into a clean React hook (useLocalAI). It allows you to download and run massive, optimized SLMs (like Llama 3B) directly in the browser cache. The Elephant in the Room: Yes, it requires a ~1.5GB initial model download to the user's browser cache. It's not for lightweight landing pages. It's designed for serious, secure, dashboard-level web apps where users need offline, private, and free inference after that initial load. I'm currently using it for automated JSON extraction and secure text summarization without spending a dime on server compute. Would love your feedback on the API design or hear what secure components you'd build with local WebGPU! Let me know if you have any questions.

React Brai - The easiest way to run optmized SLMs on web

Replies