kwindla hultman kramer

Thinking today about how far voice AI has come over the past year

December 6, 2024

On Dec 5 last year, we released Talk to Santa Cat: iOS and Android voice AI apps that let you (or maybe a child in your life) talk to a "cat who lives in Santa's workshop."

As far as I know, Santa Cat was the first talk-to-an-AI-Santa experience.

The cat theme was a result of iterating on Santa voices and keyframe animations ... and feeling like every attempt to ship "Santa" fell into the uncanny valley. Just weird enough to be too weird. Not quite weird enough to be interesting.

But an 8-bit Santa's Cat with a squeaky voice — pretty fun!

And GPT-4 has always been great at performing gentle sassiness and puns, so we leaned into that with the Santa Cat prompting.

(Props to @petehawkes for the design work, including all the many iterations required to get to the sweet spot.)

This was state-of-the-art voice AI at the time. And every kid who tried it loved it.

But compared to voice agents today, everything in the Santa Cat app feels so much slower and clunkier.

The core components are not necessarily different from what you'd use to build a production conversational voice app today. The voice was @ElevenLabs. The LLM was @OpenAI GPT-4 (there was no 4o yet). The speech-to-text was @DeepgramAI. The orchestration layer was an early version of @pipecat_ai.
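For anyone who hasn't built one of these, here is a rough sketch of the shape of that stack: speech-to-text, then the LLM, then text-to-speech, with a thin orchestration layer tying one turn together. This is not the Santa Cat code and not the Pipecat API. `VoiceTurnPipeline` and the `fake_*` stages are made-up names, and the stages are stand-ins so the sketch runs without calling any real service.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceTurnPipeline:
    """Hypothetical orchestration of one turn: STT -> LLM -> TTS."""
    transcribe: Callable[[bytes], str]   # speech-to-text stage
    respond: Callable[[str], str]        # LLM stage
    synthesize: Callable[[str], bytes]   # text-to-speech stage

    def run_turn(self, user_audio: bytes) -> bytes:
        text = self.transcribe(user_audio)   # what the user said
        reply = self.respond(text)           # what the bot will say
        return self.synthesize(reply)        # audio to play back


# Stand-in stages (assumptions for illustration, no network calls).
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(text: str) -> str:
    return f"Meow! You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoiceTurnPipeline(fake_stt, fake_llm, fake_tts)
reply_audio = pipeline.run_turn(b"hello santa cat")
```

In a real app each stage streams, and the per-stage latencies add up, which is a big part of why the whole thing felt slow a year ago.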

However:
- No interruption handling.
- VAD (voice activity detection) was significantly worse.
- Latency was just bad enough that after a bunch of user testing with kids we decided the app needed audio cues to signal “bot thinks you stopped speaking” and “bot stopped speaking and expects you to talk now.”
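To make the VAD and audio cue points concrete, here is a minimal sketch of the kind of turn detection involved. It assumes a simple per-frame energy threshold plus a run of silent frames to decide the user has stopped. That is an illustration only, not what the app or any particular VAD model actually does, and `turn_events`, `speech_threshold`, and `silence_frames_to_end` are hypothetical names.

```python
from typing import Iterable, List


def turn_events(frame_energies: Iterable[float],
                speech_threshold: float = 0.5,
                silence_frames_to_end: int = 3) -> List[str]:
    """Emit 'user_started' / 'user_stopped' events from per-frame energies."""
    events: List[str] = []
    speaking = False
    silent_run = 0
    for energy in frame_energies:
        if energy >= speech_threshold:
            silent_run = 0  # any speech frame resets the silence counter
            if not speaking:
                speaking = True
                events.append("user_started")
        elif speaking:
            silent_run += 1
            if silent_run >= silence_frames_to_end:
                # The bot now "thinks you stopped speaking": this is where
                # an app would play its audio cue and start responding.
                speaking = False
                silent_run = 0
                events.append("user_stopped")
    return events


# Speech, a short pause (shorter than the cutoff), more speech, then silence.
energies = [0.1, 0.8, 0.9, 0.2, 0.7, 0.1, 0.1, 0.1, 0.1]
events = turn_events(energies)
```

The tradeoff lives in `silence_frames_to_end`: set it too low and the bot cuts people off mid-thought, set it too high and every turn feels laggy.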

All the models have gotten so, so, so much better over the past year! Faster. More capabilities. More predictable behavior across a wider range of use cases. More choices/competition in every category.

The app is still in the app store. You can download it if you want to try a blast from the (not too distant) voice AI past ...

Here's the landing page we threw together for the launch, with links to download the apps[1].

YouTube video of the app in action[2]

  1. https://www.santacat.ai/
  2. https://www.youtube.com/watch?v=kczVAnsOqJ0