

The control group spent time practicing, and the AI group just watched the AI solve problems. The performance gap could be explained by the efficacy of practice alone. But the increase in skipped problems is a good illustration of cognitive offloading gone awry. Too bad the researchers didn’t ask them why they chose to skip.

“Role-playing machine” is where it seems like the research is ending up. Language always has an implied communicator, and therefore an implied persona to adopt. LLMs are foremost maintaining a contextual role. Post-training is an attempt to keep them in the Assistant role, but (particularly as contexts get large) it’s trivial to push them into nearly any role imaginable. We made an improv bot that’s so good at playing a coder that it can actually code, kinda.