March 6, 2025
Open source, native audio turn detection
Most voice agents today do turn detection by waiting for speech pauses of a specific, short length. That's not how humans do turn detection when we talk to each other!
I've been working with some friends on a new turn detection model. If you're interested in this problem or in learning more about ML engineering, come hack on a small model with us!
More details below.
GitHub repo[1]
Model on Hugging Face[2]
The high-level project goal is to build a state-of-the-art turn detection model that:
➡️ Anyone can use,
➡️ Is easy to deploy in production,
➡️ Is easy to fine-tune for specific application…
Some people found this last night before we had a chance to post about it!
I think this model fills an important gap in the voice AI tech stack right now. I'm excited to see how far we can get if lots of people contribute data, fine-tunes, and code improvements.
https://t.co/TgQFAFvfU6