Andrej Karpathy predicts AGI is over a decade away, emphasizing the 'decade of agents' due to significant remaining work in AI scaffolding, generalization, and overcoming current reinforcement learning limitations.
Takeaways
• AGI is more than 10 years away; the current era is the 'decade of agents' because of the extensive development still required.
• Current LLMs learn by 'ghost-like' memorization, needing to evolve towards 'animal-like' generalization for true AGI.
• Reinforcement learning has inefficiencies, necessitating new learning paradigms like 'system prompt learning' and a focus on a 'cognitive core' for generalization.
Andrej Karpathy, a prominent AI figure, believes AGI is at least 10 years away, countering common optimistic timelines. He clarifies that the current period is the 'decade of agents,' not just a 'year,' due to the extensive integration and foundational work still required to make AI agents truly valuable and widespread. Karpathy highlights the need for AI learning paradigms to evolve beyond memorization and addresses the inefficiencies and challenges within current reinforcement learning approaches.
AGI Timelines & Agents
• 00:00:45 Andrej Karpathy asserts that AGI is 10-plus years away, defining 2025-2035 as the 'decade of agents' rather than a single 'year of agents.' This distinction reflects the significant amount of 'scaffolding' and foundational work needed for agents to become truly usable, valuable, and ubiquitous in the economy. He sees himself as a moderate between AI pessimists and optimists, acknowledging rapid progress in LLMs but also the substantial integration, physical world interaction, and societal challenges yet to be addressed before AI can reliably outperform humans in arbitrary jobs.
Animal vs. Ghost Learning
• 00:05:59 Karpathy distinguishes between animal-like and ghost-like learning, suggesting that current LLMs learn more like 'ghosts' by predicting the next token over the internet, a form of intelligence accumulation distinct from biological evolution. Animals, including humans, come 'pre-packaged with a ton of intelligence by evolution,' enabling complex behaviors, such as a newborn zebra walking almost immediately, that cannot be replicated by learning algorithms alone. He argues that LLMs primarily rely on memorization, and that true generalization, which is crucial for AGI, requires moving towards more 'animal-like' learning mechanisms.
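The memorization-versus-generalization point can be made concrete with a toy model. The sketch below (an assumption for illustration, not anything Karpathy presents) trains a bigram model purely by counting which token follows which in the training text: it recalls what it has seen and has nothing to offer for tokens it has not.

```python
# Toy illustration of next-token prediction as pure memorization:
# a bigram model that only counts observed (token -> next token) pairs.
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count successors for each token in the training sequence."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, token):
    """Return the most frequently seen successor -- statistical recall."""
    if token not in counts:
        return None  # never seen in training: nothing to generalize from
    return counts[token].most_common(1)[0][0]

tokens = "the zebra walks and the zebra runs".split()
model = train_bigram(tokens)
print(predict_next(model, "the"))      # 'zebra' (seen twice in training)
print(predict_next(model, "giraffe"))  # None (out of distribution)
```

Real LLMs are vastly more capable than this, but the failure mode Karpathy describes is the same in kind: performance degrades sharply the further a prompt moves from patterns seen in training.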
Critique of Reinforcement Learning
• 00:08:45 Karpathy expresses skepticism about current reinforcement learning (RL) methods, citing their inefficiency due to a poor 'signal per flop' ratio: models extract very little learning signal per unit of compute. He highlights the problem of outcome-based rewards, where entire thought processes, including erroneous intermediate steps, might be wrongly reinforced if they lead to a correct final answer. Even process supervision faces challenges, as correct intermediate steps might be penalized if the ultimate outcome is wrong, suggesting a need for alternative learning paradigms beyond standard RL.
System Prompt Learning & Cognitive Core
• 00:11:13 Karpathy proposes 'system prompt learning' as a new learning paradigm where models improve by internalizing problem-solving strategies and general knowledge, akin to humans taking notes for themselves. This concept aims to store global problem-solving knowledge and strategies rather than just user-specific facts, overcoming the limitations of finite context windows. He also introduces the 'cognitive core,' advocating for smaller, highly capable models that prioritize generalization over encyclopedic knowledge, actively stripping away excessive memorization to foster true reasoning abilities.
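One way to picture 'system prompt learning' is a model that improves between tasks by editing its own persistent instructions rather than its weights. The class below is a minimal hypothetical sketch of that loop (all names are invented for illustration; the talk describes the idea, not an implementation).

```python
# Hypothetical sketch of 'system prompt learning': lessons distilled from
# solved problems accumulate in a persistent system prompt, like a human
# keeping notes of strategies that worked, instead of updating weights.

class SystemPromptLearner:
    def __init__(self, base_prompt: str):
        self.base_prompt = base_prompt
        self.strategies: list[str] = []  # global problem-solving notes

    def record_strategy(self, lesson: str) -> None:
        """Store a reusable lesson (assumed distilled from a solved task)."""
        if lesson not in self.strategies:  # keep notes deduplicated
            self.strategies.append(lesson)

    def build_prompt(self) -> str:
        """Compose the system prompt the model would see on the next task."""
        notes = "\n".join(f"- {s}" for s in self.strategies)
        return f"{self.base_prompt}\n\nLearned strategies:\n{notes}"

learner = SystemPromptLearner("You are a careful problem solver.")
learner.record_strategy("Check units before reporting a numeric answer.")
learner.record_strategy("Restate the problem before attempting a solution.")
print(learner.build_prompt())
```

Note the contrast with ordinary memory features: the stored items here are general strategies meant to transfer across tasks and users, not user-specific facts, which is the distinction the section draws.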