Excellent on both fronts:
Timestamps: Word-level precision. Each word gets its own timestamp, which suits subtitles, searchable transcripts, and syncing text with audio.
Diarization: A standout feature. Scribe can identify and label up to 32 different speakers in a single audio file, which is crucial for meetings, interviews, and multi-participant calls.
Bonus feature: Audio event tagging. It also detects non-verbal sounds like laughter, applause, and background noise, adding context markers directly into the transcript.
One limitation: Out-of-the-box diarization works best on shorter files (the original limit was under 8 minutes), though workarounds exist for longer recordings.
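One possible workaround for longer recordings, sketched here as an assumption rather than a documented recommendation, is to split the audio into windows below the 8-minute mark, transcribe each window separately, and shift the returned word timestamps back onto the full recording's timeline. The helper names (plan_chunks, shift_words), the 2-second overlap, and the word dictionary shape with "start" and "end" keys are all hypothetical and only illustrate the idea.

```python
def plan_chunks(total_seconds, max_chunk_seconds=480.0, overlap_seconds=2.0):
    """Return (start, end) windows covering the recording.

    Each window is at most max_chunk_seconds long, and consecutive
    windows overlap by overlap_seconds so words at a boundary are not cut.
    The 480 s default mirrors the 8-minute figure mentioned above.
    """
    if overlap_seconds >= max_chunk_seconds:
        raise ValueError("overlap must be shorter than the chunk length")
    windows = []
    start = 0.0
    step = max_chunk_seconds - overlap_seconds
    while start < total_seconds:
        end = min(start + max_chunk_seconds, total_seconds)
        windows.append((start, end))
        if end >= total_seconds:
            break
        start += step
    return windows


def shift_words(words, offset_seconds):
    """Move chunk-relative word timestamps onto the full recording's timeline.

    `words` is assumed to be a list of dicts with "start" and "end" in seconds.
    """
    return [
        {**w, "start": w["start"] + offset_seconds, "end": w["end"] + offset_seconds}
        for w in words
    ]


if __name__ == "__main__":
    # A 1000-second recording needs three windows under these defaults.
    for window in plan_chunks(1000.0):
        print(window)
```

A caveat with this approach: speaker labels produced for separate chunks are not guaranteed to refer to the same person, so labels would still need to be reconciled across chunk boundaries, for example by comparing the overlapping regions.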
For production voice applications, the combination of accurate timestamps and reliable speaker identification is a major advantage.
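To make that combination concrete, here is a minimal sketch that turns word-level timestamps plus speaker labels into speaker-tagged SRT subtitles. It does not call any transcription API. The input shape (dicts with "text", "start", "end", and "speaker_id" keys), the function names, and the 8-words-per-cue limit are assumptions chosen for illustration, so adapt them to whatever the actual response looks like.

```python
def format_srt_time(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_cues(words, max_words=8):
    """Group consecutive words into cues.

    A new cue starts when the speaker changes or the current cue
    already holds max_words words.
    """
    cues = []
    current = None
    for w in words:
        start_new = (
            current is None
            or w["speaker_id"] != current["speaker"]
            or len(current["texts"]) >= max_words
        )
        if start_new:
            if current is not None:
                cues.append(current)
            current = {
                "speaker": w["speaker_id"],
                "start": w["start"],
                "end": w["end"],
                "texts": [w["text"]],
            }
        else:
            current["texts"].append(w["text"])
            current["end"] = w["end"]
    if current is not None:
        cues.append(current)
    return cues


def cues_to_srt(cues):
    """Render cues as SRT text, prefixing each line with its speaker label."""
    blocks = []
    for index, cue in enumerate(cues, start=1):
        timing = f"{format_srt_time(cue['start'])} --> {format_srt_time(cue['end'])}"
        text = f"[{cue['speaker']}] {' '.join(cue['texts'])}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical sample data in the assumed shape.
    sample = [
        {"text": "Hello", "start": 0.0, "end": 0.4, "speaker_id": "speaker_0"},
        {"text": "there", "start": 0.5, "end": 0.9, "speaker_id": "speaker_0"},
        {"text": "Hi", "start": 1.2, "end": 1.5, "speaker_id": "speaker_1"},
    ]
    print(cues_to_srt(words_to_cues(sample)))
```

Splitting on speaker changes keeps each subtitle attributable to one person, which is the main reason to have both timestamps and diarization in the same transcript.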