kwindla hultman kramer

How to build the world's fastest voice AI bot

June 27, 2024

How to build the world's fastest voice AI bot:

- Self-host speech-to-text, LLM inference, and text-to-speech all together in the same container/cluster.
- Route audio over the internet using WebRTC and edge networking.
- Configure timings for voice activity detection, phrase endpointing, and other parts of the pipeline to optimize for latency. (There are trade-offs to doing this!)
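To make the trade-off in that last point concrete, here is a minimal Python sketch of phrase endpointing. It is not the code from the demo: `endpoint_ms`, `FRAME_MS`, the 20 ms frame size, and the example utterance are all assumptions made up for illustration, and a real pipeline would get its speech/non-speech decisions from a VAD model rather than a hand-written list.

```python
from typing import Optional, Sequence

FRAME_MS = 20  # assumed audio frame duration, for illustration only


def endpoint_ms(frames: Sequence[bool], stop_silence_ms: int) -> Optional[int]:
    """Return the time (ms from start) at which the phrase is declared finished.

    frames[i] is True if frame i was classified as speech by the VAD.
    The phrase ends once some speech has been heard and `stop_silence_ms`
    of continuous silence has followed it. Returns None if that never happens.
    """
    needed = -(-stop_silence_ms // FRAME_MS)  # ceiling division: silent frames required
    heard_speech = False
    silent_run = 0
    for i, is_speech in enumerate(frames):
        if is_speech:
            heard_speech = True
            silent_run = 0  # any speech resets the silence counter
        elif heard_speech:
            silent_run += 1
            if silent_run >= needed:
                return (i + 1) * FRAME_MS
    return None


# Hypothetical utterance: 400 ms of speech, a 300 ms mid-sentence pause,
# 400 ms more speech (ending at 1100 ms), then 1000 ms of silence.
utterance = [True] * 20 + [False] * 15 + [True] * 20 + [False] * 50

fast = endpoint_ms(utterance, stop_silence_ms=200)
safe = endpoint_ms(utterance, stop_silence_ms=800)
print(f"200 ms threshold: phrase ends at {fast} ms")
print(f"800 ms threshold: phrase ends at {safe} ms")
```

With the 200 ms threshold, the bot decides the user is done at 600 ms, in the middle of the pause, and would start talking over them. With the 800 ms threshold it waits through the pause, but only reacts at 1900 ms, a full 800 ms after the user actually stopped speaking. That gap is pure added latency, which is why this setting is a trade-off rather than a free win.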

Here's a Llama 3 voice bot with voice-to-voice response times of ~500 ms.

We used @DeepgramAI's STT and TTS for this bot, and everything is hosted on @cerebriumai's serverless GPU infrastructure.

Live demo and link to source code here[1]
Technical write-up here[2]

HN discussion[3]

  1. https://fastvoiceagent.cerebrium.ai/
  2. https://www.daily.co/blog/the-worlds-fastest-voice-bot/
  3. https://news.ycombinator.com/item?id=40805010#40806795