December 21, 2024
It was really fun seeing @swyx show his voice-driven programming demo live, after watching it as a video a few weeks ago.
Seeing the demo live prompted another round of thoughts about Sean's experiments and how they point towards one possible branch of a future UX for programming. (Maybe this is a little bit like the difference between listening to recorded music and seeing a band play live!)
This time around, I had two reactions.
First, Marvin Minsky used to talk about his programming process as:
1. Start with an empty REPL.
2. Imagine that the program you have in mind is completely finished.
3. Type the command that runs that program.
4. You will get an error. (The first error will be, "that function doesn't exist.")
5. Fix the error.
6. Repeat from (3) until your program works.
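Here's a toy Python sketch of that loop, just to make the shape concrete. Everything in it is made up for illustration: the name `error_driven_loop`, the `fixes` table, and the `greet`/`shout` functions. In real life the "fix the error" step is you (or an LLM) writing code, not a dictionary lookup, so treat this as a sketch of the control flow only.

```python
import re


def error_driven_loop(command, namespace, fixes, max_rounds=10):
    """Run `command`, applying one fix per "doesn't exist" error until it works.

    `fixes` is a hypothetical stand-in for the programmer: it maps a missing
    name to the source code that defines it.
    """
    history = []  # names we had to define, in the order the errors appeared
    for _ in range(max_rounds):
        try:
            # Step 3: type the command that runs the (imagined) finished program.
            return eval(command, namespace), history
        except NameError as err:
            # Step 4: the error tells us which function doesn't exist yet.
            match = re.search(r"name '(\w+)' is not defined", str(err))
            if match is None or match.group(1) not in fixes:
                raise  # no known fix, so surface the error
            missing = match.group(1)
            # Step 5: fix the error by defining the missing piece.
            exec(fixes[missing], namespace)
            history.append(missing)
        # Step 6: loop back and run the command again.
    raise RuntimeError("gave up after too many rounds")


# Hypothetical "fixes", written only as each error demands them.
fixes = {
    "greet": "def greet(name):\n    return shout(f'hello, {name}')",
    "shout": "def shout(text):\n    return text.upper() + '!'",
}

# Step 1: an empty namespace. Step 2: pretend greet() already exists.
result, history = error_driven_loop("greet('world')", {}, fixes)
```

After this runs, `history` records that `greet` had to be defined first and `shout` second, which is the point of the process: the errors tell you what to write next.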
What Sean has built feels like an instantiation of Marvin's process.
But with bigger building blocks. Because the LLM can generate a lot of code per turn.
And with higher-bandwidth input. Because you can talk more fluidly than you can type. And the LLM can understand and accommodate backtracking, correcting yourself, adding onto a previous idea, etc.
Second, for more than a year, I've been hacking on and using an evolving series of personal, voice-driven, LLM-powered programming "assistants." But Sean has pushed beyond what I've built. I was clearly not exploring the full range of possibilities here! It's nice when somebody knocks you out of a rut and changes your perspective.
Now on to @swyx talking voice agents with a shoutout to @kwindla @ninacali4

Here's the video of the original talk[1]