mit-tech-review
Published March 31, 2026 at 12:01 PM
AI benchmarks are broken. Here’s what we need instead.
Publisher summary · verbatim
For decades, artificial intelligence has been evaluated through the question of whether machines outperform humans. From chess to advanced math, from coding to essay writing, the performance of AI models and applications is tested against that of individual humans completing tasks. This framing is s