
OpenCastor - The Universal Runtime for Embodied AI

by Craig Merry
OpenCastor is a universal runtime for embodied AI — a framework that gives every robot a brain, body, and behavior through one simple YAML config. No matter the model provider, no matter the camera, no matter the motors, OpenCastor handles:

• provider abstraction
• safety enforcement
• hardware drivers
• motor control pipelines
• graceful interrupts
• realtime constraints

It's open-source, Python-based, and designed so that you can start with zero cloud cost and scale only when you need it.
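To make the "one simple YAML config" idea concrete, here is a rough sketch of what such a file could look like. The key names below are illustrative assumptions for this post, not OpenCastor's actual schema — check the repo for the real format.

```yaml
# Hypothetical config sketch — key names are illustrative, not the real schema.
robot:
  name: scout-01

brain:
  provider: ollama          # or a cloud provider (Anthropic, Google, OpenAI)
  model: qwen2.5-vl:7b      # local vision-capable model, zero API cost

body:
  platform: raspberry-pi
  motors:
    driver: dynamixel
    port: /dev/ttyUSB0      # serial port for the servo bus

safety:
  min_obstacle_cm: 20       # reactive layer stops the robot below this range
```

The point of a single declarative file like this is that swapping the "brain" (local vs. cloud) or the "body" (Pi vs. Jetson vs. Arduino) is a one-line change rather than a code rewrite.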

Craig Merry (Maker):
Hey Product Hunt! 👋 I'm Craig, and I built OpenCastor to solve a major frustration in embodied AI: most robotics frameworks treat local, self-hosted models as an afterthought rather than a primary feature. Inspired by the flexibility of tools like OpenClaw, I wanted to build an architecture that leaves the door wide open for local models, allowing developers to eliminate recurring API costs and reduce physical latency.

OpenCastor is an open-source Python runtime that seamlessly connects your AI models to robot hardware using just a single YAML configuration file. You simply pick your "brain" (local models via Ollama, or cloud APIs like Claude, Gemini, and GPT-4o) and pick your "body" (Raspberry Pi, Jetson, Arduino, or Dynamixel servos).

The core of the framework is our "Tiered Brain" architecture, which ensures you only pay for expensive cloud compute when absolutely necessary:

• Layer 0 (Reactive): A deterministic, rule-based safety layer that operates in under 1 ms (or ~250 ms running YOLOv8 object detection on a Hailo-8 NPU). If an obstacle is too close, it stops the robot immediately.
• Layer 1 (Fast Brain): The primary perception-action loop that runs local, vision-capable models (like Qwen2.5-VL-7B via HuggingFace or any local Ollama instance) at ~500 ms latency for zero cost.
• Layer 2 (Planner): Complex cloud models that only fire every ~15 ticks, or when the Fast Brain signals high environmental uncertainty.

For developers looking to scale, OpenCastor natively supports multi-robot swarm dynamics. We've incorporated the RCAN specification (rcan.dev) to add a crucial safety layer for robot networking and swarm intelligence. This helps solve a major ongoing challenge in multi-robot systems: ensuring that individual units can communicate and interact safely in unknown environments without developers needing to implement complex safety and networking features a priori.
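The tiered dispatch described above can be sketched in a few lines of Python. This is my own simplified illustration of the control flow, not OpenCastor's actual API — the class names, thresholds, and sensor keys here are all hypothetical.

```python
# Hypothetical sketch of a "Tiered Brain" dispatch loop.
# All names (ReactiveLayer, FastBrain, Planner, sensor keys) are illustrative.

PLANNER_INTERVAL = 15  # the cloud planner fires every ~15 ticks


class ReactiveLayer:
    """Layer 0: deterministic safety rules, designed to run in <1 ms."""

    def check(self, sensors):
        # Hard stop if an obstacle is too close; no model involved.
        if sensors.get("min_obstacle_cm", 999) < 20:
            return {"action": "stop", "reason": "obstacle too close"}
        return None  # no override; defer to the higher layers


class FastBrain:
    """Layer 1: local vision-language model (~500 ms, zero API cost)."""

    def decide(self, sensors):
        # A real implementation would query a local VLM here.
        return {"action": "forward", "uncertainty": 0.2}


class Planner:
    """Layer 2: expensive cloud model, invoked only when needed."""

    def plan(self, sensors):
        return {"action": "replan"}


def tick(reactive, fast, planner, sensors, tick_count, uncertainty_threshold=0.7):
    # Layer 0 always runs first and can veto everything else.
    override = reactive.check(sensors)
    if override:
        return override

    decision = fast.decide(sensors)

    # Escalate to the cloud planner periodically, or when the Fast Brain
    # reports high environmental uncertainty.
    if tick_count % PLANNER_INTERVAL == 0 or decision["uncertainty"] > uncertainty_threshold:
        return planner.plan(sensors)
    return decision
```

The key property is that the expensive call sits at the bottom of the cascade: the reactive layer can short-circuit it entirely, and the local model handles the common case, so cloud compute is only touched on the periodic or high-uncertainty path.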
Because the reactive and local layers handle approximately 95% of the operational decisions, you get a fully functional robot brain with virtually zero API costs. The project is fully open-source under the Apache 2.0 license. I'd love to hear your thoughts and feedback! For those of you experimenting with local vision models, what models are you finding work best for spatial reasoning and action planning?