John Alvarez

StreamKit - Build and run Live video, speech-to-text, voice agent,

Real-time media processing engine.

Video compositing: live video inputs with text/image overlays using the built-in compositor (PiP, z-ordering, crop/zoom, rotation), with CPU and GPU backends.
Live websurfaces: render any web page, including WebGL, via the Servo browser engine.
Live transcription: ingest audio via MoQ, run Whisper or SenseVoice STT.
Voice agents: TTS-powered bots using Kokoro, Piper, or Matcha.
Audio mixing, real-time translation.
Content analysis: VAD for speech detection, keyword


Replies

John Alvarez
Hunter
Empower your digital infrastructure with a high-performance media engine designed for total control. Build and run sophisticated real-time pipelines, from live video compositing to AI-driven voice agents and speech-to-text, entirely on your own servers. This self-hosted solution is composable and observable, so you can monitor every stream. Build custom media logic while keeping full control over your data and pipelines.