Open source, native audio turn detection 🎉🎉🎉

March 6, 2025

Most voice agents today do turn detection by waiting for a speech pause of a fixed, short length. That's not how humans do turn detection when we talk to each other!
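
To make the pause-based approach concrete, here's a rough sketch of it. This is an illustration only, not how any particular voice agent implements it. It assumes you already have a per-frame speech/no-speech flag from some voice activity detector; the function name `end_of_turn_frame` and the 20 ms frame size and 600 ms pause threshold are made-up values for the example.

```python
from typing import Iterable, Optional


def end_of_turn_frame(
    is_speech: Iterable[bool],
    frame_ms: int = 20,   # assumed frame duration (hypothetical)
    pause_ms: int = 600,  # assumed silence threshold (hypothetical)
) -> Optional[int]:
    """Return the index of the frame where the turn is declared over.

    The turn ends once the speaker has said something and then stayed
    silent for at least `pause_ms`. Returns None if that never happens.
    """
    # Number of consecutive silent frames needed (ceiling division).
    frames_needed = -(-pause_ms // frame_ms)
    heard_speech = False
    silent_run = 0
    for i, speech in enumerate(is_speech):
        if speech:
            heard_speech = True
            silent_run = 0  # any speech resets the pause timer
        elif heard_speech:
            silent_run += 1
            if silent_run >= frames_needed:
                return i
    return None


# Speech followed by a long silence: the turn ends after 30 silent frames.
frames = [False] * 5 + [True] * 10 + [False] * 40
print(end_of_turn_frame(frames))

# A short mid-sentence pause with an aggressive 200 ms threshold:
# the detector cuts the speaker off before they resume.
thinking_pause = [True] * 5 + [False] * 10 + [True] * 5
print(end_of_turn_frame(thinking_pause, pause_ms=200))
```

The second call shows the weakness: a fixed timer only measures how long the silence has lasted, so it can't tell a thinking pause from a finished turn.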

I've been working with some friends on a new turn detection model. If you're interested in this problem or in learning more about ML engineering, come hack on a small model with us!

More details below.

GitHub repo^[1]

Model on Hugging Face^[2]

The high-level project goal is to build a state-of-the-art turn detection model that:

➡️ Anyone can use,
➡️ Is easy to deploy in production,
➡️ Is easy to fine-tune for specific application…

Some people found this last night before we had a chance to post about it!

I think this model fills an important gap in the voice AI tech stack right now. I'm excited about where we can get if lots of people contribute data, fine-tunes, and code improvements.

https://t.co/TgQFAFvfU6