
Sun
Collaborative voice API for agents
197 followers
Sun is a voice-first AI model built specifically for real-time collaborative voice interaction, not just one-on-one chat. ChatGPT Realtime and Gemini Live were built for one user talking to an AI. Sun is built for collaboration: meetings, group calls, multi-agent debates, and classrooms. One API, multi-speaker awareness, 10× the context window.

The "when to interrupt" part is exactly where most voice infra falls down. I'm building voice AI for older adults, and barge-in plus handling slow or overlapping speech is the hardest piece: models either talk over people or freeze. Curious how Sun handles turn-taking when speakers have very different pacing, and whether latency holds up with 4-5 voices in the room?
@igorgurovich First and foremost, Sun has trigger words and a name option, so the model only wakes up to answer when required. It monitors the streaming transcription to see whether someone is speaking and waits until the utterance is complete (no more streamed text) before it decides to speak. Even then, people can still talk over it; that is covered by interruption handling, where the model either lets the person continue or resumes if the interruption was accidental. Only a combination of these really improves the experience.
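To make the "wait until the transcript stream goes quiet" idea concrete, here is a minimal sketch of that gating logic. It is not Sun's implementation or API: the class name TurnGate, its methods, and the 0.8 second quiet window are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TurnGate:
    """Hypothetical helper: decides when an agent may start speaking."""

    quiet_seconds: float = 0.8          # assumed quiet window, not a Sun default
    last_text_at: Optional[float] = None
    pending_request: bool = False

    def on_transcript(self, now: float, addressed_to_agent: bool) -> None:
        # Any streamed text means someone is still talking.
        self.last_text_at = now
        if addressed_to_agent:
            self.pending_request = True

    def may_speak(self, now: float) -> bool:
        # Speak only if the agent was asked and the stream has gone quiet.
        if not self.pending_request or self.last_text_at is None:
            return False
        return (now - self.last_text_at) >= self.quiet_seconds

    def on_agent_started(self) -> None:
        # The request is being answered, so clear it.
        self.pending_request = False


gate = TurnGate(quiet_seconds=0.8)
gate.on_transcript(now=0.0, addressed_to_agent=True)   # someone calls the agent
gate.on_transcript(now=0.5, addressed_to_agent=False)  # speech is still streaming
print(gate.may_speak(now=1.0))  # False: only 0.5 s since the last text
print(gate.may_speak(now=1.4))  # True: the stream has been quiet long enough
```

A slower speaker would simply need a larger quiet_seconds value; how Sun tunes this per speaker is not stated in the thread.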
How does it know when to interrupt and when not to? Also, is there a way to make it speak only when explicitly asked?
@ashishkingdom Yes, you can give the agent trigger words or names. The agent will then respond only when it is called by name and asked a question, not when it is passively mentioned in conversation. It also looks at voice activity to check whether someone is actively speaking, so it avoids interrupting them.
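The difference between being called and being passively mentioned can be approximated with a simple heuristic. The sketch below is an assumption for illustration only, not how Sun decides: it treats the name as a call when it appears at the start or end of an utterance, and it relies on the transcript ending a question with "?". The function name is_addressed and the example trigger words are hypothetical.

```python
import re

# Assumed example trigger words; a real list would come from the agent config.
TRIGGER_WORDS = frozenset({"sun", "sunny"})


def is_addressed(utterance: str, trigger_words=TRIGGER_WORDS) -> bool:
    """Return True if the utterance calls the agent by name and asks something."""
    words = re.findall(r"[a-z']+", utterance.lower())
    if not words:
        return False
    # Name at the very start or end reads as a call, not a passing mention.
    called = words[0] in trigger_words or words[-1] in trigger_words
    # Assumes the transcription service punctuates questions.
    asks_question = utterance.strip().endswith("?")
    return called and asks_question


print(is_addressed("Sun, what were last quarter's sales?"))  # True
print(is_addressed("I think Sun handled that well."))        # False
print(is_addressed("Did anyone try Sun yet?"))               # False
```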
So, is this like a speech engine for note takers like Otter or Fireflies, i.e. making them talk back? Fireflies does that now, but it takes forever to make it work.
@arjun_reghu Yes, absolutely. This helps with in-meeting intelligence (like a meeting assistant) and is faster than the Fireflies version. It can respond in voice in a very short time. For comparison, on a short answer, Fireflies starts speaking when the Sun agent has already finished its answer or is halfway through. Take a look at our playground, where you can try it out.
@aswanth_viswanathan8 Yes, you can add context mid-session using context.update. It is useful when you want the agent to have certain information, such as sales numbers or context about the ongoing discussion, partway through a discussion.
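As a rough illustration of what a mid-session update might look like on the client side, here is a sketch that builds such a message. Only the name context.update comes from the reply above; the JSON shape, the "type" and "context" field names, and the helper build_context_update are assumptions, so check the actual API reference before relying on any of it.

```python
import json


def build_context_update(text: str) -> str:
    """Serialize a hypothetical mid-session context message.

    The "type" and "context" fields are assumed for illustration;
    they are not taken from Sun's documentation.
    """
    if not text.strip():
        raise ValueError("context text must not be empty")
    return json.dumps({"type": "context.update", "context": text})


# Placeholder text; a real call would carry the facts the agent should know.
message = build_context_update("Notes the agent should know for this discussion.")
print(message)
```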
@aisha_m_a Yes, the trigger words list is for alternate names of the agent. The agent treats any of them as a wake-up call to answer. It also helps when the transcription service occasionally mistranscribes the name. For instance, John is sometimes written as Jon, and listing both covers that case.
The classroom use case is the one that caught my attention. An AI that can follow multiple speakers and participate at the right moment could be genuinely useful during group discussions.
Have you tested it with actual teachers or students yet? I'd be curious to hear how it performs in a real classroom setting.
Congrats on the launch!
@jared_salois Yes, this is built for settings where multiple speakers are present, such as a group call or meeting. It is designed to be polite: it avoids cutting people off and allows interruptions. We have tested the interruption and multi-speaker handling as well. If people in the call keep interrupting the agent, it politely waits for them to finish speaking.
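The polite interruption behavior described in this thread (let the person continue, or resume after an accidental overlap) could be sketched as a small decision function. This is a guess at one possible rule, not Sun's logic: the duration threshold, the Action enum, and on_overlap are all hypothetical.

```python
from enum import Enum


class Action(Enum):
    YIELD = "yield"    # stop talking and let the person continue
    RESUME = "resume"  # treat the overlap as accidental and keep going


def on_overlap(overlap_seconds: float, min_real_interrupt: float = 0.6) -> Action:
    """Pick a reaction to someone speaking over the agent.

    Assumption: a short overlap (a cough, a "mm-hm") is accidental, while a
    longer one is a real interruption. The 0.6 s threshold is illustrative.
    """
    if overlap_seconds >= min_real_interrupt:
        return Action.YIELD
    return Action.RESUME


print(on_overlap(0.2))  # Action.RESUME
print(on_overlap(1.5))  # Action.YIELD
```

After yielding, the agent would wait for the speaker to finish before trying again, which matches the "wait for them to complete speaking" behavior described above.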