November 12, 2024
This is true for function calling (tool use), too.
Function names influence tool call behavior quite a lot. https://t.co/vanyOWZz73
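To make the point concrete, here is a minimal sketch of two tool definitions that are identical except for the function name. The name/description/parameters layout follows a common JSON-schema-style convention rather than any specific provider's API, and both tool names and the weather example are hypothetical.

```python
def make_tool(name: str) -> dict:
    """Build a hypothetical tool definition; only the name varies."""
    return {
        "name": name,
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }


# Same description and parameters, different names (both names are made up).
vague_tool = make_tool("helper_2")
descriptive_tool = make_tool("get_current_weather")

# Confirm the name is the only variable between the two definitions.
differing_keys = [k for k in vague_tool if vague_tool[k] != descriptive_tool[k]]
print(differing_keys)
```

Holding everything else fixed like this is one way to check whether the name alone changes how often the model calls the tool.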
Changing a single field name in our LLM response schema improved accuracy from 4.5% to 95% on GSM8K.
The fix was simple: going from final_choice to final_answer. Turns out our model was returning a multiple-choice index instead of the actual answer.
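A rough sketch of what that rename could look like, assuming a JSON-schema-style response format scored by exact match. The `reasoning` field, the helper names, and the two sample responses are illustrative assumptions, not the original schema or real model output.

```python
import json


def build_schema(answer_field: str) -> dict:
    """Hypothetical response schema; only the answer field's name varies."""
    return {
        "type": "object",
        "properties": {
            "reasoning": {"type": "string"},  # assumed field, not from the post
            answer_field: {"type": "string"},
        },
        "required": ["reasoning", answer_field],
    }


def is_correct(raw_response: str, answer_field: str, expected: str) -> bool:
    """Exact-match scoring against the expected numeric answer."""
    payload = json.loads(raw_response)
    return str(payload[answer_field]).strip() == expected


before = build_schema("final_choice")
after = build_schema("final_answer")

# Made-up responses showing the failure mode described above:
# "final_choice" invites a multiple-choice label, "final_answer" the value itself.
choice_style = '{"reasoning": "16 - 3 - 4 = 9, 9 * 2 = 18", "final_choice": "B"}'
answer_style = '{"reasoning": "16 - 3 - 4 = 9, 9 * 2 = 18", "final_answer": "18"}'

print(is_correct(choice_style, "final_choice", "18"))
print(is_correct(answer_style, "final_answer", "18"))
```

The scorer is unchanged between the two runs. Only the field name differs, which is what makes this kind of bug easy to miss.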
If you're working with