← kwindla hultman kramer

We are using next token prediction in the most insane ways, these days

January 16, 2025

We are using next token prediction in the most insane ways, these days. (I do not mean that as a criticism. I am in the “we” set.) https://t.co/vi8tV4hNHa

Tom Dörr (@tom_doerr)

When I use an LLM as a judge, I ask a True/False question and do (p(True)+p(true)) - (p(False)+p(false)). This works very well for me, but I don't think I've seen anyone else scoring answers this way
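For anyone who wants to try the scoring rule from the quoted tweet, here is a minimal sketch, not the tweet author's actual code. It assumes you already have the log-probabilities of the candidate first answer tokens as a plain dict (most APIs that expose top-k logprobs can be massaged into this shape). The function name `true_false_score`, the dict layout, and the choice to count a missing token as probability 0 are all assumptions for illustration.

```python
import math


def true_false_score(top_logprobs: dict[str, float]) -> float:
    """Return (p(True) + p(true)) - (p(False) + p(false)).

    `top_logprobs` maps candidate first tokens to their log-probabilities.
    This layout is hypothetical; adapt it to whatever your model API returns.
    Tokens absent from the dict are treated as having probability 0
    (an assumption: they fell outside the returned top-k).
    """

    def p(token: str) -> float:
        logprob = top_logprobs.get(token)
        return math.exp(logprob) if logprob is not None else 0.0

    return (p("True") + p("true")) - (p("False") + p("false"))


# Made-up numbers for illustration only, not real model output.
example = {
    "True": math.log(0.60),
    "true": math.log(0.10),
    "False": math.log(0.20),
    "false": math.log(0.05),
}
score = true_false_score(example)
```

The score falls between -1 (the model is confident the answer is False) and 1 (confident it is True), so it gives a graded judgment rather than a hard label. Real tokenizers may emit variants such as a leading space, which this sketch does not handle.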