Live AI avatars from one photo — optimized for reproducibility and low runtime cost
PhotoCall turns one portrait photo into a reusable speaking avatar for AI products, websites, onboarding flows, tutors, support, interview practice, and product demos.
The important part is reproducibility: the same character can be reused across conversations, roles, scenes, and product workflows instead of generating a new random-looking avatar every time.
Unlike real-time video generation, PhotoCall does most of the video work once, during preprocessing, and then reuses the precomputed avatar states during the conversation.
That creates a clear trade-off:
runtime video cost can stay very low
no GPU is needed during conversations
the same avatar identity can be reused across use cases
avatar creation currently takes ~15 minutes
lip sync is not as flexible as fully generated video
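To make the "preprocess once, reuse at runtime" idea concrete, here is a minimal Python sketch. It is not PhotoCall's actual code: the state names (idle, listening, speaking), the clip paths, and the functions preprocess and clip_for are all hypothetical, and the expensive rendering step is stubbed out. It only illustrates why the conversation-time work can be a cheap lookup rather than video generation.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Avatar:
    avatar_id: str
    clips: Dict[str, str]  # hypothetical: state name -> path of a pre-rendered clip


def preprocess(photo_path: str, avatar_id: str) -> Avatar:
    """One-time step. A real pipeline would render video here; this stub
    only records where each state's clip would be stored."""
    states = ("idle", "listening", "speaking")  # assumed state set
    return Avatar(avatar_id, {s: f"{avatar_id}/{s}.mp4" for s in states})


def clip_for(avatar: Avatar, avatar_is_speaking: bool, user_is_talking: bool) -> str:
    """Runtime step: pick a precomputed clip. No generation, no GPU."""
    if avatar_is_speaking:
        state = "speaking"
    elif user_is_talking:
        state = "listening"
    else:
        state = "idle"
    return avatar.clips[state]


# The same Avatar object can be reused across any number of conversations.
tutor = preprocess("portrait.jpg", "tutor-01")
current_clip = clip_for(tutor, avatar_is_speaking=True, user_is_talking=False)
```

The sketch also shows where the lip-sync limitation comes from: playback can only choose among clips that already exist.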
The bet: if realistic avatars become cheap and reproducible enough to run like a normal product layer, many more AI product experiences become viable.
I’m still improving creation speed, lip sync, and playback stability, but the core pipeline is working and I hope to open early builder access soon.
Looking for honest builder feedback.