Hi everyone 👋 Excited to share FireRedASR, a new family of open-source, industrial-grade automatic speech recognition (ASR) models from the FireRed Team at Rednote (Xiaohongshu)! These models achieve state-of-the-art results on Mandarin ASR benchmarks.

Key highlights:

🏆 SOTA Performance: Outperforms previous models (like Seed-ASR) on public Mandarin datasets, with significantly lower Character Error Rate (CER).

✌️ Two Model Variants:
1. FireRedASR-LLM: Leverages an LLM for maximum accuracy.
2. FireRedASR-AED: Balances high accuracy with computational efficiency.

🌐 Excels in Mandarin, and also performs well on Chinese dialects and English.

🎤 Beyond Transcription: Shows strong performance even in singing lyrics recognition!

💻 Ready to Use: Includes pre-trained models and inference code.

This is a big deal for anyone working with Chinese speech data, whether for voice assistants, transcription services, video subtitling, or other applications. Being open-source and offering both a high-accuracy and a high-efficiency model makes it very versatile.
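For anyone new to the metric: CER is usually computed as the character-level edit distance (substitutions + deletions + insertions) between the reference transcript and the model output, divided by the number of reference characters. Here is a minimal sketch of that standard definition. The function names `edit_distance` and `cer` are just illustrative, and the whitespace stripping is an assumption of this sketch; it is not the FireRedASR evaluation script, which may normalize text differently.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (unit cost per edit)."""
    # prev[j] holds the distance between the first i-1 items of ref
    # and the first j items of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference items into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character Error Rate: edit distance / number of reference characters.

    Assumption for this sketch: spaces are dropped before scoring,
    which is a common convention for Mandarin text.
    """
    ref = list(reference.replace(" ", ""))
    hyp = list(hypothesis.replace(" ", ""))
    if not ref:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # One substituted character out of six reference characters.
    print(cer("今天天气很好", "今天天汽很好"))
```

Lower is better, and because insertions count as errors, CER can exceed 1.0 when the output is much longer than the reference.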

FireRedASR

Open-Source SOTA Speech Recognition from Rednote
