I wrote parts of the function calling implementations and improvements that…

August 22, 2024

I wrote parts of the function calling implementations and improvements that just shipped in @pipecat_ai 0.40 and gee whiz I have a lot of thoughts about LLM tool use.

1. Claude 3.5 Sonnet, GPT-4o, and Llama 3.1 are all very good at function calling now. This makes whole categories of use cases that were "almost kind of possible" six months ago really achievable now. Kudos to @AnthropicAI, @OpenAI, and @AIatMeta, truly! 🚀

2. Tool use is particularly challenging for voice-to-voice applications.

For voice, you always want to be "streaming" the LLM responses. But function calling and streaming don't mix well.

Sonnet has the nicest affordance here, in some ways, because the model tells you in natural language what tool it's going to use (and why, usually) before it emits the function call block. You can just stream Sonnet's natural language response as you would stream any response, and then catch the function call block at the end and dispatch to your function.

But ... sometimes you might not want to speak the natural language response if you know a function call request is going to happen. Sadly, there's no way to distinguish between a response that's just text vs a response that will include a function call block at the end. It's also very hard to prompt Sonnet to be minimally chatty in its pre-function call natural language block.
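Here's a rough sketch of the stream-then-dispatch pattern described above: speak text as it arrives, buffer the function call arguments, and dispatch once the function call block closes. The event dictionaries are only loosely modeled on Anthropic-style streaming events, and `handle_stream`, `speak`, and `get_weather` are hypothetical names made up for illustration. This is not Pipecat's implementation.

```python
import json

def handle_stream(events, speak, tools):
    """Speak text deltas immediately; run a tool once its block is complete."""
    tool_name = None   # set while we are inside a tool-use block
    json_parts = []    # partial JSON fragments of the tool arguments
    results = []
    for event in events:
        kind = event["type"]
        if kind == "content_block_start":
            block = event["content_block"]
            if block["type"] == "tool_use":
                tool_name = block["name"]
                json_parts = []
        elif kind == "content_block_delta":
            delta = event["delta"]
            if delta["type"] == "text_delta":
                speak(delta["text"])  # stream natural language right away
            elif delta["type"] == "input_json_delta":
                json_parts.append(delta["partial_json"])
        elif kind == "content_block_stop" and tool_name is not None:
            # Arguments are only valid JSON once the whole block has arrived.
            args = json.loads("".join(json_parts) or "{}")
            results.append(tools[tool_name](**args))
            tool_name = None
    return results

# Simulated stream (assumed event shapes, not captured from a real API call).
fake_events = [
    {"type": "content_block_start", "content_block": {"type": "text"}},
    {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "Let me check "}},
    {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "the weather."}},
    {"type": "content_block_stop"},
    {"type": "content_block_start", "content_block": {"type": "tool_use", "name": "get_weather"}},
    {"type": "content_block_delta", "delta": {"type": "input_json_delta", "partial_json": '{"city": '}},
    {"type": "content_block_delta", "delta": {"type": "input_json_delta", "partial_json": '"Paris"}'}},
    {"type": "content_block_stop"},
]

spoken = []
results = handle_stream(
    fake_events,
    spoken.append,
    {"get_weather": lambda city: f"sunny in {city}"},
)
```

Note that by the time the tool-use block shows up, the text has already been handed to `speak`. That is exactly the problem: nothing earlier in the stream says a function call is coming.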

3. As with all things LLM-related, prompting is an art. Specifying a `tools` block is necessary (for Sonnet and GPT-4o), but not sufficient.

4. Interestingly, Llama 3.1 leans into the importance of prompting and just says "put it all in the prompt, here are some best practices."

On the one hand, this makes sense given how LLMs are trained and how you need to approach AI engineering to build production-quality systems. On the other hand, doing things this way feels different enough that it's a bit jarring.
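To make "put it all in the prompt" concrete, here's a minimal sketch: the function definitions are serialized into the system prompt, and the model's reply is checked to see whether it is a JSON function call. The prompt wording, the `TOOLS` list, and the helpers `build_system_prompt` and `parse_tool_call` are all assumptions for illustration, not Meta's recommended template. Check the Llama 3.1 docs for the real best practices.

```python
import json

# Hypothetical tool definition used only for this example.
TOOLS = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {"city": {"type": "string", "description": "City name"}},
    }
]

def build_system_prompt(tools):
    """Describe the available functions and the expected reply format in plain text."""
    return (
        "You can call the following functions. "
        "To call one, reply with ONLY a JSON object of the form "
        '{"name": <function name>, "parameters": <arguments object>}.\n\n'
        + json.dumps(tools, indent=2)
    )

def parse_tool_call(reply):
    """Return (name, parameters) if the reply is a function call, else None."""
    try:
        data = json.loads(reply.strip())
    except json.JSONDecodeError:
        return None  # ordinary natural language reply
    if isinstance(data, dict) and "name" in data and "parameters" in data:
        return data["name"], data["parameters"]
    return None
```

With no separate `tools` parameter to lean on, the parsing and validation land in your own code, which is part of why this approach feels different.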

5. Related to (2) and (3) and (4), there are enough differences in how the major LLMs do function calling that switching between them is not trivial. @chadbailey59 has more thoughts on this than I do, and has some nice abstractions in some of his code. The Pipecat function calling support is designed to try to support these kinds of abstractions. More work to do here, though.
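One small piece of the switching cost is that each provider wants the tool definition in a different shape. A sketch of the kind of adapter layer that helps, assuming a made-up provider-neutral format with `name`, `description`, and `schema` keys (the neutral format and the function names `to_openai` and `to_anthropic` are hypothetical, not Pipecat's API or the abstractions mentioned above):

```python
# Hypothetical provider-neutral tool definition.
WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def to_openai(tool):
    """Wrap the definition in the shape OpenAI chat completions expects for tools."""
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool["description"],
            "parameters": tool["schema"],
        },
    }

def to_anthropic(tool):
    """Anthropic's tools list uses a flat object with an input_schema key."""
    return {
        "name": tool["name"],
        "description": tool["description"],
        "input_schema": tool["schema"],
    }

openai_tools = [to_openai(WEATHER_TOOL)]
anthropic_tools = [to_anthropic(WEATHER_TOOL)]
```

Converting definitions is the easy part. The differences in streaming behavior and prompting from (2), (3), and (4) are where most of the work is.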

Pipecat AI (@pipecat_ai)

Lots of new things in 0.40.

✨ Function calling and prompt caching for @AnthropicAI Claude 3.5 Sonnet
✨ Llama 3.1 function calling support in the @togethercompute service
✨ A complete implementation of the RTVI standard
✨ Studypal, a new application example from the team at