Introducing the @aiDotEngineer World's Fair multi-turn conversation benchmark

September 12, 2025

I wrote a post for the @AITinkerers newsletter about:
- why multi-turn is hard for LLMs,
- the particular challenges of audio/voice, and
- a benchmark I use to think about LLM performance for…

Full post here: https://post-training.aitinkerers.org/p/your-conversation-is-out-of-distribution