kwindla hultman kramer

The new Sonic-3 voice model from @cartesia_ai launched today

October 28, 2025

The big additions are increased emotional range, emotion steering, and laughter tags.

For example: <emotion value="curious" />

I wrote some quick demo code that prompts Gemini Flash with a description of the emotion tags. You can hear the results in the video and see the emotion tags in the Pipecat developer console.

Here's the code I ran to record the video:
https://t.co/6qCTnXGnUq

Sonic-3 docs: https://t.co/a4DvJbjuEx
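
To give a rough idea of what "prompting the LLM to know about the emotion tags" can look like, here is a minimal sketch. It is not the prompt from the demo. The function name `build_system_prompt`, the prompt wording, and every emotion value other than "curious" are hypothetical; check the Sonic-3 docs for the values the model actually supports.

```python
# Hypothetical list. Only "curious" appears in this post; the other
# values are placeholders and may not match what Sonic-3 supports.
EXAMPLE_EMOTIONS = ["curious", "excited", "sad"]


def build_system_prompt(emotions: list[str]) -> str:
    """Build a system instruction that tells the LLM about emotion tags."""
    if not emotions:
        raise ValueError("at least one emotion value is required")
    allowed = ", ".join(emotions)
    # Use the first value to show the model the exact tag syntax.
    example_tag = f'<emotion value="{emotions[0]}" />'
    return (
        "You are a friendly voice assistant. Your output is converted to speech.\n"
        "To steer the tone of the voice, you may put an emotion tag in front of "
        f"a sentence, for example: {example_tag}\n"
        f"Allowed emotion values: {allowed}.\n"
        "Use tags sparingly, and never read them aloud or mention them."
    )


if __name__ == "__main__":
    print(build_system_prompt(EXAMPLE_EMOTIONS))
```

The returned string would be passed as the system instruction of whichever LLM service you use.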

Some notes:

- If you shipped this to production with a UI that shows transcripts to the user, you'd add a simple frame processor that strips the emotion and laughter tags out of the LLM text stream after they're parsed by the Pipecat `CartesiaTTSService`.

- My Gemini Flash prompting was very simple. You could definitely do some fun stuff here if you had a particular style valence in mind.

- Cartesia is training different voices to have different "dynamic range." (My wording, not theirs.) For a customer support agent, you'd want to keep the output relatively stable even for different emotional cues. But for a game NPC, you might use a voice with a lot of emotiveness!

- You can deploy this simple code to Pipecat Cloud for scaling/production, hook it up to Twilio, etc. Pipecat Cloud getting started: https://t.co/IipjlVtwPy
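
The tag-stripping step from the first note can be sketched in plain Python. This is only the text-cleaning logic, under a few assumptions: the emotion tag has the `<emotion value="..." />` form shown earlier, laughter is written as a literal `[laughter]` marker (an assumption, check the Sonic-3 docs), and text arrives in chunks that may split a tag in two. The class name `StreamingTagStripper` is hypothetical. In a real pipeline this logic would live inside a Pipecat frame processor placed after `CartesiaTTSService`; that wiring is omitted here.

```python
import re

# Emotion tags as shown in this post, plus an ASSUMED [laughter] marker.
TAG_PATTERN = re.compile(r'<emotion\s+value="[^"]*"\s*/>|\[laughter\]')

_CLOSERS = {"<": ">", "[": "]"}
_MAX_PENDING = 64  # stop holding text back if no closer shows up


class StreamingTagStripper:
    """Remove tags from a chunked text stream, even when a tag is
    split across two chunks."""

    def __init__(self) -> None:
        self._pending = ""

    def feed(self, chunk: str) -> str:
        text = self._pending + chunk
        self._pending = ""
        # Simplification: only the last "<" or "[" is checked. If it has
        # no closer yet, hold that tail back until the next chunk.
        start = max(text.rfind("<"), text.rfind("["))
        if start != -1:
            tail = text[start:]
            if _CLOSERS[tail[0]] not in tail and len(tail) <= _MAX_PENDING:
                self._pending = tail
                text = text[:start]
        return TAG_PATTERN.sub("", text)

    def flush(self) -> str:
        """Call at end of stream to release any held-back text."""
        text, self._pending = self._pending, ""
        return TAG_PATTERN.sub("", text)


if __name__ == "__main__":
    stripper = StreamingTagStripper()
    for piece in ['Oh, <emo', 'tion value="curious" />what is that?']:
        print(repr(stripper.feed(piece)))
    print(repr(stripper.flush()))
```

The stripper does not normalize whitespace, so a tag surrounded by spaces leaves a double space behind; collapse those in the UI layer if it matters.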

Cartesia's Sonic-3 launch and fundraising announcement[3]

Congratulations to the team on all the great work!

  1. https://gist.github.com/kwindla/4420060c747d2d78797b169a96e61f6a
  2. https://docs.cartesia.ai/build-with-cartesia/sonic-3/volume-speed-emotion
  3. https://x.com/krandiash/status/1983202316397453676