kwindla hultman kramer

I ❤️ many things about this voice agent project from Ben Shababo

November 5, 2025

I ❤️ many things about this voice agent project from Ben Shababo ...

- The latency - faster than 1s voice-to-voice measured on the client!
- The code is all open source *and* uses all open weights models.
- The fast generation with text, code snippets, and links all together in the app output while voice output happens in parallel.
- The demonstration of how easy it is to add "services" (APIs/models/tools/processors) to Pipecat.
- The nice use of @trychroma's vector database, enabling very fast retrieval of content.

Ben's blog post explains everything very clearly, including how he approached minimizing networking and inference latency.

Check out the video, the blog post, and the code if you're interested in voice AI or (as Ben says in the post) if you're interested more generally in designing and scaling realtime AI applications of any kind.

Modal (@modal)

Build a conversational voice bot with 1 second voice-to-voice latency with Modal, @pipecat_ai, and open models.

Modal works seamlessly with WebRTC, WebSockets, and tunneling to squash latency to an absolute minimum.

Video from @modal's post