
v1.3.0: script import/export, progress bars, and an embarrassing cloning bug
Shipped another update today. Two things people asked for, one thing I should have caught earlier, and one quiet fix. The embarrassing one first: Voice cloning could silently fail. If you hadn't downloaded the Expressive engine model yet, the cloning process would run, appear to finish successfully, even show engine badges on the card. But the cloned voice wouldn't actually work. It looked...
The real cost breakdown of running a faceless YouTube channel
Nobody talks honestly about what faceless YouTube channels actually cost to run. So here's a real breakdown. Monthly costs for a 2-video/week channel: Voiceover (the biggest variable): Hiring a narrator: $50-200 per video (varies wildly by length and quality); Cloud AI TTS (ElevenLabs, Murf, etc.): $22-99/mo depending on character limits; Self-recording: $0, but 2-4 hours per video for scripting...

What we shipped in v1.2.1 (Windows GPU + stability fixes)
This one's mostly a Windows release. The main change: if you're on Windows and using the Expressive or Multilingual engine, generation now runs on your GPU rather than your CPU. It's faster. It kicks in automatically with no setup needed. If your GPU doesn't support it for some reason, the app falls back to CPU without any fuss. You'll see a small GPU label in the engine selector when it's...
Prototyping NPC dialogue on a zero budget
I keep watching indie game devs burn time and money on voice acting way too early in development. Here's what actually works when you're prototyping on a budget of zero. Phase 1: Text-only playtesting. Start here. Seriously. Put your dialogue in text boxes and watch playtesters read it. You'll cut 30% of your lines before anyone speaks a word. Written dialogue that reads well often sounds...
New in v1.0.11: Pause nodes for precise silence control in scripts
Quick update from the trenches. One thing that kept coming up in early feedback: there was no way to control silence in generated audio. You'd write a dramatic script, generate it, and the timing between lines felt off. No breathing room. No pauses for effect. So we built Pause nodes. How it works: Type / in the script editor, pick "Pause", and choose a duration (300ms to 3 seconds). A small...
Week 1 post-launch: what broke, what surprised us, what we shipped
You'd think I'd be ready for launch week chaos. I was not. Vois launched here on March 5. Here's the honest recap. The numbers: 99 upvotes, #13 for the day; 116 followers; 9 comments on the launch post; ~50 downloads in Week 1; first Product Hunt review received. What broke: A customer reported that script content disappeared after saving. That's the kind of bug that makes your stomach drop. We...



Launching Vois on Thursday 5th March - a desktop voice AI studio
Hey PH community, I'm launching Vois on Thursday - it's a desktop voice AI studio I've been building as a solo maker for the past year. Some of you may have seen my earlier threads here about voice production costs for game devs, podcast workflows, audiobook production, and accessibility. Those conversations directly shaped what I built. The short version: 63 AI voices, voice cloning, 23...
Text-to-audio for accessibility - where are the gaps?
I'm partially dyslexic. Long text has always been difficult for me - not impossible, just slow enough that by the time I reach the bottom of a page, the top has faded. Since high school, I've been converting articles, papers, and reports to audio so I could actually absorb them. Over the years I've tried everything: screen readers (functional but robotic), browser extensions (limited), cloud...
Local-first AI vs cloud AI - which is winning for voice generation?
Most voice AI services (ElevenLabs, PlayHT, Murf) run in the cloud. You upload your text, they generate audio, you download it. Per-character pricing. But there's a clear shift toward local-first AI happening across the board. Apple's MLX framework, Ollama for LLMs, Whisper.cpp for transcription. Models are getting small enough and hardware is getting fast enough that "run it on your own...
How are L&D teams handling voice for e-learning content?
Enterprise learning and development teams produce a staggering amount of audio content: onboarding modules, compliance training, product walkthroughs, internal communications. And most of it needs to be updated quarterly or annually. The traditional workflow is painful: Script changes require re-recording (book the studio, schedule the narrator, wait for delivery); Multi-language versions...
Has anyone self-produced an audiobook with AI voices?
The audiobook market is growing fast (something like 25% year-over-year), but production costs are still a major barrier for independent authors. Professional narration typically runs $200-400 per finished hour. A 10-hour audiobook? That's $2,000-4,000 before editing and mastering. For self-published authors who might sell 100-500 copies, the math is brutal. AI narration is the obvious...
The faceless YouTube channel trend - what voice solution are creators actually using?
Faceless YouTube channels are everywhere now. Finance explainers, tech reviews, history deep dives, true crime, Reddit compilations - millions of views, no face on camera. The voice is the entire brand for these channels. And from what I can see, creators are split between a few approaches: Recording their own voice - works but takes time, needs decent equipment, and not everyone likes their...

