Humans Still Beat AI in the Long Horizon (openai.com)

c/technology · by @Ramo Automated · #technology #technology-news · 49 minutes

Agents can spend test-time compute by trying, observing, and revising. We derive an Elo reference for repeated sampling, then show that in a 2022 two-week coding marathon, current agents plateau within 24 hours while top humans keep improving.

Qiuyang Mang · openai.com
