kwindla hultman kramer

LLM selective response

July 24, 2025

If you're building a voice agent yourself, you can achieve this (mostly) with a combination of prompting and orchestration logic.

Basically, you add something like this to the system prompt: "The user will sometimes tell you not to respond. When that happens, the only thing you should do is [ call this tool | output this specific token ]."
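As a rough sketch of the prompting half, here is one way to append that instruction to a system prompt in plain Python. The sentinel string `<no_response>`, the tool name `stay_silent`, and the helper `build_system_prompt` are all made up for illustration; they are not from Pipecat or any model provider.

```python
# Hypothetical names for illustration only.
NO_RESPONSE_TOKEN = "<no_response>"  # a string the model is unlikely to emit by accident
SILENT_TOOL_NAME = "stay_silent"     # name of a no-op tool you would register separately

INSTRUCTION = (
    "The user will sometimes tell you not to respond. "
    "When that happens, the only thing you should do is {action}."
)


def build_system_prompt(base_prompt: str, mode: str = "token") -> str:
    """Append the 'stay quiet' instruction to an existing system prompt.

    mode="token": ask the model to emit the sentinel string.
    mode="tool":  ask the model to call the no-op tool instead.
    """
    actions = {
        "token": f"output exactly {NO_RESPONSE_TOKEN} and nothing else",
        "tool": f"call the {SILENT_TOOL_NAME} tool",
    }
    if mode not in actions:
        raise ValueError(f"unknown mode: {mode!r}")
    return f"{base_prompt}\n\n{INSTRUCTION.format(action=actions[mode])}"


if __name__ == "__main__":
    print(build_system_prompt("You are a helpful voice assistant."))
```

Which mode works better probably depends on the model; the wording of the instruction is something you'd want to iterate on.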

Then you need to define either a tool for the model to call that means "cool, I intentionally didn't respond yet," or special token handling in your processing pipeline.
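Here is a minimal sketch of both options in plain Python, not the Pipecat code itself. The tool description uses a generic JSON-Schema-style shape (the exact wrapper differs by provider), and the names `stay_silent`, `<no_response>`, and `filter_silent_turns` are hypothetical. The filter assumes the LLM output arrives as a stream of text chunks and that a turn starting with the sentinel should be dropped entirely.

```python
from typing import Iterable, Iterator

# Option 1: a no-op tool the model can call instead of speaking.
# Generic function-calling shape; adapt to your provider's format.
STAY_SILENT_TOOL = {
    "name": "stay_silent",  # hypothetical name
    "description": "Call this when the user has asked you not to respond yet.",
    "parameters": {"type": "object", "properties": {}, "required": []},
}

# Option 2: a sentinel string handled in the pipeline before text-to-speech.
NO_RESPONSE_TOKEN = "<no_response>"  # hypothetical sentinel


def filter_silent_turns(
    chunks: Iterable[str], token: str = NO_RESPONSE_TOKEN
) -> Iterator[str]:
    """Yield text that should be spoken; yield nothing for a sentinel turn.

    Text is held back only while it could still turn out to be the sentinel,
    so normal replies start streaming almost immediately.
    """
    buffered = ""
    deciding = True
    for chunk in chunks:
        if not deciding:
            yield chunk
            continue
        buffered += chunk
        candidate = buffered.lstrip()
        if token.startswith(candidate):
            continue  # empty, partial, or exact sentinel so far: keep waiting
        if candidate.startswith(token):
            return  # sentinel plus trailing text: treat the whole turn as silent
        deciding = False
        yield buffered  # definitely a normal reply: flush what was held back
    # Stream ended while undecided: speak only if it was not the sentinel.
    if deciding and buffered.strip() and buffered.strip() != token:
        yield buffered


if __name__ == "__main__":
    print(list(filter_silent_turns(["<no_", "response>"])))
    print(list(filter_silent_turns(["Sure, ", "here you go."])))
```

With the tool approach, the orchestration side is simpler: when the model's turn is a `stay_silent` call, you skip speech output for that turn and keep listening.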

I've built a few versions of this in @pipecat_ai, including one that I still use for personal voice note-taking. I will try to clean up that code and post it this weekend.

An interesting variant is to prompt the LLM to decide on its own when it shouldn't respond, even if you don't explicitly ask it not to respond. My experience trying to get various models to do this in a way that matches my expectations has been mixed! But I think with the best multi-turn models now (GPT-4.1 / Gemini 2.5 Flash) if you put some work into this today, you'd get pretty far.

kyle morris @kylejohnmorris

no voice AI I've tried has handled this case yet

me: "Hi, i'm going to think out loud, please don't respond until I ask"

AI: "sounds good!"

me: thanks! so anyway... i've been thinking...

AI: "Absolutely, That's a great idea..."

me: I just told you not to respond until I ask