← kwindla hultman kramer

You had me at lower latency!

September 26, 2025

Really nice work from the @GroqInc team. I like the gpt-oss-20B model for classification tasks, tool routing, and cleaning up voice transcription input before sending it to Claude Code or Codex! https://t.co/cmcbwXRYrX

Hatice Ozen@ozenhati

PSA: Prompt caching is now live for openai/gpt-oss-20b on @GroqInc.

→ 50% discount for cached tokens ($0.05/million cached input tokens)
→ Lower latency
→ Automatic prefix matching

We'll roll this out to more models to enable inexpensive, fast vibe coding for you all. ⚡️
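
To make the pricing and the prefix-matching idea concrete, here is a rough sketch. The cached price ($0.05 per million tokens) and the 50% discount come from the post; the uncached price of $0.10 per million is only what those two numbers imply, not a quoted rate. The function names and the token lists are hypothetical, and `shared_prefix_len` only illustrates why a stable prompt prefix helps. It is not a description of how Groq's cache is actually implemented.

```python
# Cached input price stated in the post (USD per million tokens).
CACHED_PRICE_PER_M = 0.05
# Assumption: a 50% discount implies the uncached price is double the cached one.
UNCACHED_PRICE_PER_M = CACHED_PRICE_PER_M / 0.5


def input_cost(total_tokens: int, cached_tokens: int) -> float:
    """Estimated input cost in USD when `cached_tokens` of the prompt hit the cache."""
    if not 0 <= cached_tokens <= total_tokens:
        raise ValueError("cached_tokens must be between 0 and total_tokens")
    uncached_tokens = total_tokens - cached_tokens
    return (
        uncached_tokens * UNCACHED_PRICE_PER_M
        + cached_tokens * CACHED_PRICE_PER_M
    ) / 1_000_000


def shared_prefix_len(previous: list[int], current: list[int]) -> int:
    """Number of leading tokens two prompts have in common (illustrative only)."""
    count = 0
    for a, b in zip(previous, current):
        if a != b:
            break
        count += 1
    return count


if __name__ == "__main__":
    # Hypothetical token IDs: same system prompt, different user turn at the end.
    first_request = [101, 7, 7, 42, 900]
    second_request = [101, 7, 7, 42, 311, 12]
    reusable = shared_prefix_len(first_request, second_request)
    print(f"reusable prefix tokens: {reusable}")
    print(f"estimated cost: ${input_cost(len(second_request), reusable):.8f}")
```

The practical takeaway under these assumptions: keep the static parts of a prompt (system instructions, tool definitions) at the front and the changing parts at the end, so more of each request can match an earlier prefix.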
