# A Robust Definition for Human-Level AGI

Like many others, I've been amazed by the capabilities demonstrated by recent systems such as OpenAI's o1 and o3. As someone deeply interested in technology and AI ([View My Projects](https://www.lindahl.works/#projects)), I've spent the past few years trying to create tests that AI cannot pass. That endeavor has become increasingly difficult - and the difficulty itself is telling.

The challenge of creating such tests is intimately connected to how we define human-level AGI. In effect, we've been defining it through our attempts to build benchmarks that AI can't solve.

> **Human-Level AGI Definition:**
> A system has reached human-level AGI when it becomes impossible for humans to create any benchmark on which humans reliably outperform the system.

Today I launched a website called "[h-matched](https://h-matched.vercel.app/)" that tracks major AI benchmarks and how long it took AI systems to reach human-level performance on each one. The data reveals something fascinating: we're approaching a point where it's becoming extremely difficult to create *any* test on which humans outperform the best AI systems.