
Grok
The world’s smartest AI (according to Elon)
4.6 • 12 reviews • 1.9K followers
Grok is a free AI assistant designed by xAI to maximize truth and objectivity. Grok offers real-time search, image generation, trend analysis, and more.
This is the 8th launch from Grok.
Grok Voice API
Launching today
Grok now offers standalone Speech-to-Text and Text-to-Speech APIs for developers. The new voice stack covers real-time and batch transcription, multispeaker diarization, multichannel audio, text formatting, expressive TTS with speech tags, multilingual support, and simple usage-based pricing.

Hi everyone!
With the new transcription (Speech-to-Text) API now available, combined with their Voice Agent capabilities, it’s clear that @Grok is making a systematic push to capture the entire Voice AI ecosystem.
Looking specifically at the STT model, they have shipped a highly pragmatic feature set. It includes native WebSocket support for real-time streaming, built-in speaker diarization (a must-have for meetings), and intelligent text formatting that automatically handles numbers and currencies (it's cool and pretty useful in production!).
The pricing is also very aggressive: $0.10 per hour for batch and $0.20 per hour for streaming. xAI is once again putting some real price pressure on the market, isn't it?
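To put those rates in perspective, here is a rough back-of-the-envelope sketch. The two per-hour rates come from the numbers quoted above; the function name `estimate_cost` and the 1,000-hour monthly workload are hypothetical, chosen only for illustration, and the calculation assumes billing is strictly proportional to audio duration.

```python
# Rates quoted in the post above (USD per hour of audio).
BATCH_RATE_PER_HOUR = 0.10
STREAMING_RATE_PER_HOUR = 0.20


def estimate_cost(audio_hours: float, rate_per_hour: float) -> float:
    """Return the estimated cost in USD, rounded to cents.

    Assumes cost scales linearly with audio duration (an assumption,
    not a confirmed billing rule).
    """
    if audio_hours < 0:
        raise ValueError("audio_hours must be non-negative")
    return round(audio_hours * rate_per_hour, 2)


# Hypothetical workload: 1,000 hours of audio per month.
monthly_hours = 1000
batch_cost = estimate_cost(monthly_hours, BATCH_RATE_PER_HOUR)
streaming_cost = estimate_cost(monthly_hours, STREAMING_RATE_PER_HOUR)

print(f"batch: ${batch_cost:.2f}, streaming: ${streaming_cost:.2f}")
```

Under those assumptions, 1,000 hours a month comes to about $100 for batch and $200 for streaming, so the streaming premium is simply double the batch rate at any volume.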
@zaczuo How are you all handling noisy real-world audio? Does the streaming hold up, or is batch still king for cleaner results?
The multispeaker diarization built right into the STT is a nice touch; that's usually a painful separate step. How's the latency on the real-time streaming? Would love to see benchmarks vs Whisper and Deepgram.