← kwindla hultman kramer

Open source, native audio turn detection 🎉🎉🎉

March 6, 2025

Most voice agents today do turn detection by waiting for speech pauses of a specific, short length. That's not how humans do turn detection when we talk to each other!
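For reference, here's roughly what that pause-based approach looks like. This is a minimal sketch, not code from any particular voice agent: it assumes you already have per-frame speech/no-speech flags from a voice activity detector, and the function name, the 20 ms frame size, and the 800 ms pause threshold are all illustrative choices.

```python
from typing import Iterable, Optional


def end_of_turn_frame(
    is_speech: Iterable[bool],
    frame_ms: int = 20,   # assumed frame duration (illustrative)
    pause_ms: int = 800,  # assumed fixed silence threshold (illustrative)
) -> Optional[int]:
    """Return the index of the frame where the turn is declared over, or None.

    The turn ends as soon as the speaker has been silent for `pause_ms`
    after saying anything at all. Nothing about *what* was said, or how
    it was said, is taken into account.
    """
    frames_needed = pause_ms // frame_ms
    silent_run = 0
    heard_speech = False
    for i, speech in enumerate(is_speech):
        if speech:
            heard_speech = True
            silent_run = 0
        elif heard_speech:
            silent_run += 1
            if silent_run >= frames_needed:
                return i
    return None


# A speaker who pauses mid-sentence for 800 ms and then keeps talking
# still gets cut off at the pause:
frames = [True] * 5 + [False] * 40 + [True] * 5
cut_off_at = end_of_turn_frame(frames)
```

The weakness is visible in the usage example: a fixed timer can't tell a thinking pause from a finished thought, so the agent either interrupts or has to wait longer than it should.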

I've been working with some friends on a new turn detection model. If you're interested in this problem or in learning more about ML engineering, come hack on a small model with us!

More details below.

GitHub repo[1]

Model on Hugging Face[2]

The high-level project goal is to build a state-of-the-art turn detection model that:

➡️ Anyone can use,
➡️ Is easy to deploy in production,
➡️ Is easy to fine-tune for specific application…

Some people found this last night before we had a chance to post about it!

I think this model fills an important gap in the voice AI tech stack right now. I'm excited about where we can get if lots of people contribute data, fine-tunes, and code improvements.

https://t.co/TgQFAFvfU6

  1. https://github.com/pipecat-ai/smart-turn ↩
  2. https://huggingface.co/pipecat-ai/smart-turn ↩