MAI-Transcribe-1 - Production ASR for noisy multilingual audio
MAI-Transcribe-1 is Microsoft’s new multilingual speech-to-text model built for real-world audio. It delivers best-in-class accuracy across 25 languages, strong robustness in noisy environments, faster batch transcription, and pricing aimed at production speech workflows.


Replies
I need to try ASR, and this looks perfect for me. Thanks, Zac, for hunting it! I feel like I'm going to love it.
Congrats on the launch! 👏
Also launching today — curious, what worked best for you to get your first users?
Multilingual ASR is a hard problem — especially for noisy audio. We deal with this at NexClip AI too, where accurate timestamps on every word are critical for topic-based video editing. Curious how MAI-Transcribe-1 handles word-level timestamp accuracy across languages?
I run Whisper in prod for a voice input thing — accents and background noise break it constantly. If this actually handles noisy multilingual audio better, that alone is worth switching. $0.36/hr is solid too. Gonna try it this week.