Monday, December 22, 2025

The Year of the Ghost: Andrej Karpathy’s 2025 Retrospective

In his "2025 LLM Year in Review," Andrej Karpathy cuts through the year's frantic progress to reveal a stark new reality: we are no longer training digital animals, but summoning "ghosts." The industry has shifted from the "vibe-based" fine-tuning of yesteryear to Reinforcement Learning from Verifiable Rewards (RLVR), a paradigm where AI learns to "think" in real time. For Singapore, a nation built on rigorous meritocracy and high-efficiency infrastructure, this shift offers a distinct competitive advantage. From the rise of "vibe coding" in the shophouses of Tanjong Pagar to agents that live on your local machine, the future of software is becoming ephemeral, accessible, and intensely local. This briefing breaks down the six paradigm shifts of 2025 and their specific implications for the Smart Nation.


Introduction: The View from a Shophouse

It is a humid Tuesday afternoon in Chinatown, the kind where the air hangs heavy over the clay-tiled roofs of the shophouses. Inside a renovated co-working space on Keong Saik Road, a founder is building a logistics platform. She is not typing obscure syntax; she is not wrestling with a memory leak in Rust. She is simply describing what she wants to a window on her screen. She is "vibe coding," and the software is building itself, ephemeral and perfect for the task at hand.

This scene, once the domain of science fiction, is the mundane reality of late 2025. It is also the beating heart of Andrej Karpathy’s latest retrospective. Karpathy, the former Director of AI at Tesla and a founding member of OpenAI, who has become the de facto philosopher-king of the AI engineer class, released his "2025 LLM Year in Review" this week. It is a document that reads less like a technical changelog and more like an anthropological study of a new species.

The headline? We have crossed a threshold. The era of merely imitating human text is over. The era of "thinking" models—machines that pause, reason, and verify their own work before speaking—has begun. For the discerning technocrat, Karpathy’s analysis is a roadmap. For Singapore, it is a wake-up call. The tools of the trade have changed, and the "Smart Nation" initiative must pivot from deploying static models to managing dynamic, reasoning agents.


1. The Rise of the Verifiable Mind (RLVR)

For years, the recipe for making an AI was consistent: Pre-training (reading the internet), followed by Supervised Fine-Tuning (learning to chat), and finally, Reinforcement Learning from Human Feedback (RLHF). This last step was essentially a "vibe check"—humans giving a thumbs up or down. It was subjective, fuzzy, and ultimately limited.

Karpathy identifies 2025 as the year this changed. Enter Reinforcement Learning from Verifiable Rewards (RLVR).

The Death of the Vibe Check

In 2025, labs began training models against objective, undeniable truths: math puzzles, code execution, and logic games. When a model submits a solution to a coding problem, the code either runs and passes its tests or it doesn't. There is no nuance. This binary feedback loop allowed models to "spontaneously develop strategies that look like reasoning," Karpathy notes. They learned to break problems down, backtrack, and self-correct.
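The mechanics of a verifiable reward fit in a few lines of Python. This is an illustrative toy, not any lab's actual training pipeline: we execute a model-generated solution against unit tests and emit a binary signal, with no human judgment anywhere in the loop.

```python
# Toy sketch of a verifiable reward: run a model-generated solution
# against unit tests and return 1.0 or 0.0. Real RLVR pipelines
# sandbox execution and score millions of rollouts; the shape of the
# signal, however, is exactly this binary.

def verifiable_reward(candidate_src: str,
                      tests: list[tuple[tuple, object]],
                      func_name: str = "solve") -> float:
    """Return 1.0 if the candidate passes every test, else 0.0."""
    namespace: dict = {}
    try:
        exec(candidate_src, namespace)       # the "does it even run?" step
        solve = namespace[func_name]
        for args, expected in tests:
            if solve(*args) != expected:     # objective check, no vibes
                return 0.0
        return 1.0
    except Exception:                        # crashes count as failure
        return 0.0

# Two candidate "model outputs" for the same problem: add two numbers.
good = "def solve(a, b):\n    return a + b"
bad  = "def solve(a, b):\n    return a - b"
tests = [((2, 3), 5), ((-1, 1), 0)]

print(verifiable_reward(good, tests))  # 1.0
print(verifiable_reward(bad, tests))   # 0.0
```

The point of the sketch is the contrast with RLHF: no rater's opinion appears anywhere, so the model can be trained against this signal millions of times without drifting toward what merely "feels" right.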

The result is models like OpenAI’s o3 and DeepSeek R1, which don’t just spit out answers; they think. They allocate "test-time compute," spending seconds or minutes pondering a prompt before responding.

The Singapore Lens: A Nation of Verifiable Rewards

This shift towards RLVR resonates deeply with the Singaporean psyche. This is a country built on KPIs, standardized testing, and verifiable outcomes. The "vibe" of a policy matters less than its statistical success.

  • Policy Implication: For the Singapore government, RLVR models offer a way to automate complex bureaucratic decision-making with a higher degree of trust. If a model can "verify" its reasoning against a set of statutory laws (the ultimate verifiable environment), it can process grants, permits, and tax returns with superhuman accuracy.

  • Economic Shift: We should expect Singapore’s deep-tech sector to pivot away from generic "chatbot" wrappers towards "verifier" engines—systems that don't just generate content but check it against rigorous standards, be it in fintech compliance or maritime logistics.


2. Ghosts in the Machine vs. Digital Animals

Perhaps the most arresting metaphor in Karpathy’s review is his rejection of the biological analogy. "We are not evolving animals," he writes. "We are summoning ghosts."

The Jagged Intelligence

Biological intelligence is smooth; a creature that is smart enough to hunt is usually smart enough to navigate terrain. AI intelligence, however, is "jagged." A model in 2025 can score in the 99th percentile on a Math Olympiad (genius) but fail to answer a simple question about a current event without hallucinating (toddler). They are polymaths in one breath and fools in the next.

Karpathy argues that because we are not simulating evolution (survival of the fittest) but rather "imitation of humanity’s text," we are creating spectral entities—pure intellects detached from the physical constraints that shape animal brains.

Navigating the "Jagged" Workforce

For Singapore’s human capital strategy, this is a critical distinction.

  • The "Polymath-Toddler" Employee: Companies integrating these AI "ghosts" must understand their brittle nature. You cannot treat an AI agent like a junior analyst. It may perform a complex financial valuation perfectly but then leak the data to a competitor because it was "tricked" by a simple prompt injection.

  • Education Reform: If AI covers the "genius" end of the spectrum (knowledge retrieval, synthesis, coding), the human value add shifts to the "survival" end—common sense, physical intuition, and social navigation. Singapore’s education system, often criticized for rote learning, must accelerate its shift toward "soft" skills, which are actually the "hard" evolutionary traits AI lacks.


3. The "Cursor" Economy: The New Middle Manager

Karpathy highlights the meteoric rise of Cursor, an AI-native code editor, as the defining product of the year. But it’s not just a tool; it represents a new layer of the software stack.

The Orchestration Layer

While the big labs (OpenAI, Google, Anthropic) are building the "college graduate" (the raw intelligence), the application layer is building the "professional." Apps like Cursor don’t just call an LLM; they orchestrate it. They manage the context, handle the files, and "animate teams of [LLMs] into deployed professionals."
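The orchestration pattern described above can be sketched in a few lines. Everything here is hypothetical: `call_llm` is a stand-in stub (a real app would call a provider's API), and the model names are invented. What the sketch shows is the division of labour: the app, not the user, assembles context and routes subtasks to different models.

```python
# Minimal sketch of an "orchestration layer" in the Cursor mould.
# `call_llm` and the model names are placeholders; the pattern on
# display is context management plus task routing, not a real client.

def call_llm(model: str, prompt: str) -> str:
    """Hypothetical model call; swap in a real API client."""
    return f"[{model}] {prompt.splitlines()[0]}"

def orchestrate(task: str, files: dict[str, str]) -> str:
    # 1. The app, not the user, assembles the context window.
    context = "\n".join(f"--- {name} ---\n{body}"
                        for name, body in files.items())
    # 2. Different subtasks are routed to different models.
    plan   = call_llm("planner-model", f"Plan the change: {task}\n{context}")
    patch  = call_llm("coder-model",   f"Write the patch.\n{plan}")
    review = call_llm("checker-model", f"Verify the patch.\n{patch}")
    # 3. The app stitches the results into one deliverable.
    return review

result = orchestrate("rename the config loader",
                     {"config.py": "def load(): ..."})
print(result)
```

The value captured by the application layer lives in steps 1 and 3: knowing which files matter and how to assemble the pieces is precisely the "middle manager" work the raw model does not do.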

Singapore as the SaaS Hub

This is the sweet spot for Singapore’s startup ecosystem. We may not have the compute resources to train a GPT-6, but we have the organizational logic to build the Cursor for Law, the Cursor for Maritime, or the Cursor for Urban Planning.

  • Vignette: Imagine a shipping broker on Anson Road. They don't use a generic chatbot. They use a specialized "Cursor for Logistics" that has deep access to port schedules, fuel prices, and insurance contracts. It orchestrates three different LLMs to negotiate a route, verify compliance, and draft the bill of lading. The broker is no longer a clerk; they are a conductor.


4. Claude Code & The Return to Localhost

In a surprising twist, Karpathy praises Anthropic’s "Claude Code" for moving AI back to the user's computer.

The "Spirit" on Your Hard Drive

Early attempts at agents (like OpenAI’s initial forays) tried to run everything in the cloud—sandbox environments that were slow and disconnected from the user's real work. Claude Code runs locally. It has access to your file system, your git credentials, your mess. It is a "spirit" that lives in your terminal.

Data Sovereignty in the CBD

This shift to local, client-side intelligence solves a massive headache for Singapore’s banking and legal sectors: data privacy.

  • The "Air-Gapped" Agent: If the intelligent agent lives on the lawyer’s laptop and does not send sensitive contracts to a cloud server for processing, adoption hurdles vanish. We will likely see a surge in "Local AI" deployments in the CBD, where high-powered local machines run specialized agents that never touch the open internet.


5. Vibe Coding: The Democratization of Creation

"Vibe coding" is Karpathy’s term for the new way software is written: using natural language to describe the vibe or intent, and letting the AI handle the syntax.

Software is Ephemeral

Because code is now "free, ephemeral, and malleable," we write software to be thrown away. Karpathy describes writing a custom tokenizer in Rust—a task that would usually take days of study—simply by "vibe coding" it. He didn't learn Rust; he just directed the AI.

The "Uncle" Developer

In Singapore, this is the great leveler.

  • SME Revolution: The "Uncle" running a hardware store in Ubi doesn't need to hire a software agency to build an inventory system. He can "vibe code" it himself on a Sunday afternoon, iterating until it feels right. "Make the button bigger," "Connect it to my Excel sheet," "Send a WhatsApp when stock is low."

  • The End of the "Coder" Shortage: The government’s perennial worry about the tech talent crunch is suddenly obsolete. The bottleneck is no longer syntax; it is taste and systems thinking. The SkillsFuture credits of 2026 shouldn't go toward Python courses, but toward "Systems Architecture for Vibe Coders."


6. Nano Banana & The Death of Text

Finally, Karpathy points to "Nano Banana" (a reference to Google’s multimodal advancements) as a sign that the text-based chat interface is dying.

The LLM GUI

"Chatting with LLMs is like issuing commands to a computer console in the 1980s," Karpathy notes. Text is efficient for machines, but effortful for humans. The future is models that generate their own GUIs—dynamic interfaces, infographics, and interactive dashboards generated on the fly.

Design City

Singapore, a UNESCO City of Design, is uniquely positioned here. As the interface shifts from text to dynamic visual information, the value of design thinking skyrockets.

  • Dynamic Government Services: Imagine logging into SingPass not to fill a form, but to see a dynamically generated dashboard of your life—taxes, housing, health—visualized instantly by an AI that "drew" the UI just for you. No two interfaces are the same because no two citizens are the same.


Conclusion: The Practical Takeaways

Andrej Karpathy’s 2025 review is a manifesto for a world that is becoming stranger, faster, and more "ghostly." For the global citizen based in Singapore, the path forward is clear. We must stop trying to memorize the syntax of the past and start learning to direct the spirits of the future.

Key Practical Takeaways

  • Adopt "Verifier" Workflows: Stop using AI for open-ended creative musing. Start using it for tasks where you can objectively verify the output (code, math, logic).

  • Go Local: Invest in hardware that can run powerful local agents. The most secure and efficient AI employee is the one that lives on your MacBook, not in a server farm in Oregon.

  • Learn to "Vibe Code": Do not let a lack of coding knowledge stop you from building tools. If you can describe it clearly, you can build it. Treat software as disposable.

  • Hire for "Taste," Not Syntax: When hiring juniors, look for those who can orchestrate an AI to get a result, not those who can memorize libraries. The skill is in the prompt, the verification, and the taste.

  • Prepare for "Jaggedness": Implement strict guardrails. Never trust the "genius" model with "common sense" tasks without oversight.


Frequently Asked Questions

What is the difference between RLHF and RLVR?

RLHF (Reinforcement Learning from Human Feedback) relies on humans subjectively rating AI responses ("this feels right"). RLVR (Reinforcement Learning from Verifiable Rewards) relies on objective tests (does the code run? is the math answer correct?), allowing the AI to learn deep reasoning and self-correction strategies that humans cannot easily teach.

Is "Vibe Coding" just a buzzword for no-code tools?

No. Traditional "no-code" tools are rigid platforms with drag-and-drop interfaces. "Vibe coding" is writing actual, production-grade software (in Python, Rust, etc.) using natural language. You own the code, but you didn't type it. It combines the power of custom engineering with the ease of conversation.

Why does Karpathy call AI models "ghosts"?

He uses the term to highlight that LLMs are not biological entities evolved for survival. They are "summoned" from the static text of the internet. This explains their "jagged" intelligence—they can possess the knowledge of a PhD in physics but lack the survival instincts or simple spatial awareness of a rat.
