June 18, 2025

.@shresbm and I gave a talk at @aiDotEngineer World's Fair about building real-world voice agents that leverage the most advanced features of today's models, APIs, and frameworks.

The themes of the talk were:
- What moving parts are involved in building production voice agents today.
- What's hard and what's easy right now.
- How functionality is distributed between application code, libraries/frameworks, APIs, and models, and how that's changing over time.

During the talk we did a live demo using the latest Gemini 2.5 Flash Preview model and Live API. The demo showed:
- The model following a complex system prompt and doing multi-step reasoning
- Cross-session memory. (Thank you, @supabase.)
- Web search
- The model dynamically formatting text for on-screen display and calling a function to trigger text display updates
- One-shot code generation for dynamic UI

Here's the video that Shrestha recorded in case we had network issues or other tech problems and couldn't do the demo live. (But we did do the demo live!)

Link to code in the thread.

GitHub repo^[1]

If you're doing the voice AI course with us, Shrestha will be hosting an office hours session about the Gemini models and APIs on Wednesday at 9am PT.

[1] https://github.com/kwindla/aiewf-gemini-preview-todo