# A Robust Definition for Human-Level AGI

Like many others, I've been amazed by the capabilities demonstrated by recent systems such as OpenAI's o1 and o3. As someone deeply interested in technology and AI ([View My Projects](https://www.lindahl.works/#projects)), I've spent the past few years trying to create tests that AI cannot pass. That endeavor has become increasingly difficult - and the difficulty itself is telling.

The challenge of creating such tests is intimately connected to how we define human-level AGI. In effect, we've been defining it through our attempts to build benchmarks that AI can't solve.

> **Human-Level AGI Definition:**
> A system has reached human-level AGI when it becomes impossible for humans to create any benchmark on which humans reliably outperform the system.

Today I launched a website called "[h-matched](https://h-matched.vercel.app/)" that tracks major AI benchmarks and how long it took AI systems to reach human-level performance on each one. The data reveals something fascinating: we're approaching a point where it's becoming extremely difficult to create *any* test on which humans outperform the best AI systems.