
MAI
Microsoft's top-tier model family
72 followers
Microsoft AI is pioneering the future of what AI can do and what technology can be.
This is the 4th launch from MAI.
Microsoft MAI-Voice-2
Launching today
Microsoft's most expressive text-to-speech (TTS) model yet: voice cloning from short samples, fine-grained emotional control, and consistent voice identity across 15 languages. Now live in Azure AI Foundry at $22 per million characters, with integrations rolling out in VS Code, Dynamics 365 Contact Center, and Teams. For builders shipping voice agents who need production-grade prosody without the OpenAI Realtime API price tag.





Free
Launch Team

Refocus
The consistent voice identity across 15 languages is what stands out to me here. I work on a voice companion that calls aging parents every day, and a lot of our families are immigrants whose parents are most at ease in their first language. A warm, familiar voice that holds up in Tagalog or Mandarin is often the difference between a call someone looks forward to and one they let ring out. Question for the team: how stable is the cloned identity and emotional control over a full 10-minute conversation, or does the prosody drift toward neutral as the session runs longer?