Kwindla Hultman Kramer

.@maxipesfix forked the open source audio Smart Turn model and added video!

January 17, 2026

Smart Turn is a "turn detection" model, used in a conversational agent to decide when the agent should respond. The model, training data, and training code are all completely open source. Enabling this kind of extension and collaboration is exactly why we made everything open source when we built the first version of Smart Turn.

Maxim's blog post is super useful to read if you're interested in training multimodal models. It describes the design choices and technical details (3D ResNet, late fusion, two-stage training, GPU inference in ~100 ms). And all the code is available in the GitHub repo. Really great work.

Blog[1]
GitHub[2]

  1. https://lnkd.in/gMakVfZz
  2. https://lnkd.in/g9pBiret