In a landmark discussion, Lex Fridman sits down with two of the most influential voices in modern machine learning: Sebastian Raschka, LLM Researcher and author, and Nathan Lambert, Post-training Lead at AI2. Together, they dissect the rapid evolution of artificial intelligence from the “DeepSeek Moment” of 2025 to the agentic reality of 2026. – AI State of the Union 2026.
The “DeepSeek Moment” and the Global Shift
Lex: We often look at AI through the lens of specific breakthroughs. Looking back from 2026, what was the “DeepSeek Moment”?
Nathan: It happened in January 2025. The Chinese company DeepSeek released R1, which surprised everyone by matching state-of-the-art performance with significantly less compute and cost. It shifted the narrative from “who has the most GPUs” to “who has the most efficient training algorithms.”
Sebastian: I agree. It won the hearts of the open-source community. Today, in 2026, we see that ideas aren’t proprietary anymore—researchers move between labs constantly. The real differentiator now is the sheer budget for hardware and the culture of the organizations building them.
THE STATE OF AI IN 2026
Deep Dive: LLMs, Scaling Laws, RLVR, and the global race for AGI with Sebastian Raschka & Nathan Lambert.
The Global Landscape
The “DeepSeek Moment”
- January 2025 marked a shift: Chinese models proving SOTA performance with significantly less compute.
- Winning is temporary. Access to ideas is fluid; the differentiator is now budget and hardware.
Model Superstars
- Claude Opus 4.5: dominating coding and voice with extended reasoning features.
- Gemini 3: leveraging massive structural advantages and huge context windows.
- GPT-5 / 5.2: pushing thinking modes and complex agentic routing.
Education Roadmap
The best way to understand is to build it yourself.
The Technical Breakthroughs
Architecture Tweaks
- MoE (Mixture of Experts): expands knowledge without proportional compute per token.
- MLA (Multi-head Latent Attention): optimizes KV cache size for larger context windows.
The New Training Paradigm
- RLVR: Reinforcement Learning with Verifiable Rewards using executable answers.
- Test-Time Compute: using more compute during runtime reasoning to solve harder problems.
The Programming Evolution
- Traditional (Cursor Style): high-control pair programming where humans remain the primary architects.
- Agentic (Claude Code Style): programming with English where the AI manages files and commands.
The Architecture of 2026: Beyond the Transformer?
While the industry remains rooted in the Transformer, 2026 is defined by specialized “tweaks” that have unlocked massive context windows and unprecedented reasoning capabilities.
“We’ve moved beyond simple RLHF. The big breakthrough is RLVR—Reinforcement Learning with Verifiable Rewards.”
— Nathan Lambert
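The core idea behind RLVR can be made concrete with a toy sketch: instead of a learned preference model scoring an answer, the reward comes from actually executing the model's output and checking it against ground truth. The function and variable names below are hypothetical illustrations, not from any RL library or from the labs discussed here.

```python
# Toy "verifiable reward": execute a candidate program emitted by a
# model and grade it against known test cases. Reward is binary.
def verifiable_reward(candidate_code: str, test_cases) -> float:
    """Return 1.0 if the candidate's solve() passes every test, else 0.0."""
    namespace = {}
    try:
        exec(candidate_code, namespace)          # run the model's answer
        solve = namespace["solve"]
        for args, expected in test_cases:
            if solve(*args) != expected:
                return 0.0                       # wrong output: no reward
    except Exception:
        return 0.0                               # crashes earn no reward
    return 1.0

tests = [((2, 3), 5), ((0, 0), 0)]
good = "def solve(a, b):\n    return a + b"
bad = "def solve(a, b):\n    return a - b"
print(verifiable_reward(good, tests), verifiable_reward(bad, tests))  # 1.0 0.0
```

Because the signal is computed, not judged, it cannot be flattered or gamed the way a human-preference reward model can, which is why executable domains like code and math led this training paradigm.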
Lex: Sebastian, you’ve written about building LLMs from scratch. How much has the architecture actually changed since GPT-2?
Sebastian: Fundamentally, not as much as you’d think. It’s still the Transformer. But we’ve added “tweaks” that have massive scaling impacts:
- MoE (Mixture of Experts): This allows us to make models larger without increasing the cost of every single forward pass.
- MLA (Multi-head Latent Attention): This was huge for 2025 and 2026. It optimizes the KV Cache, making it cheaper to handle massive context windows.
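The MoE trick above can be sketched in a few lines: a router scores each token against every expert, but only the top-k experts are actually evaluated, so total parameters grow with the number of experts while compute per token stays roughly constant. This is a minimal illustration with made-up dimensions, not any specific model's implementation.

```python
# Minimal Mixture-of-Experts routing sketch for a single token.
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 8, 4, 2

# Each "expert" is reduced to one weight matrix for illustration.
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts))

def moe_forward(x):
    """Route token vector x to its top-k experts and mix their outputs."""
    logits = x @ router                       # router score per expert
    top = np.argsort(logits)[-top_k:]         # indices of the k best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                  # softmax over selected experts
    # Only k of the n_experts matrices are ever multiplied for this token;
    # the remaining experts contribute parameters but no compute.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(d_model)
out = moe_forward(token)
print(out.shape)  # (8,)
```

Here 4 experts exist but only 2 run per token; scale those numbers up and you get the "more knowledge without more compute per forward pass" effect Sebastian describes.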
Programming with English
The transition from writing syntax to “Programming with English” has fundamentally altered the day-to-day life of developers. The focus has shifted from the how to the what.
Lex: Nathan, you seem more bullish on the agentic side.
Nathan: Absolutely. Tools like Claude Code represent “Programming with English.” You don’t micromanage the lines; you guide the design at a macro level. The AI manages the repo, runs the CLI, and handles Git. It’s a different skill set—research taste and system design are now more important than syntax.
Sebastian: I’m still a bit of a control freak; I like to see the diffs. But I use it to automate the mundane tasks—fixing broken links, boilerplate, and refactoring. We have to find a “Goldilocks zone” where we use AI to be productive but still invest in our own mental frameworks.
The Shift in Technical Paradigms
| Feature | Pre-2025 Era | 2026 Era |
| --- | --- | --- |
| Primary Goal | Parameter Count / GPU Hoarding | Algorithmic Efficiency / Compute per Token |
| Training | Human-led RLHF | RLVR (Verifiable Rewards) |
| Coding | Manual Syntax & Copilots | Agentic “English” Programming |
| Open Source | Following the Giants | Frontier-level Open Weights (DeepSeek, Qwen) |
AGI and the Legacy of Compute
Lex: Let’s talk about the “Singularity” or AGI. Where are we?
Nathan: There’s a document—AI 2027—that predicted a “Superhuman Coder” by next year. I think it’s a bit aggressive, because AI capability is “jagged”: superhuman on some tasks, surprisingly weak on others. But 2031 seems like a reasonable mean prediction for a fully autonomous AI researcher.
Lex: 100 years from now, what will historians say?
Sebastian: They won’t remember the name “Transformer” or specific GPU models. They will look at this as the era where Compute became the primary engine of civilization, much like the steam engine was for the Industrial Revolution.
Nathan: I hope they see it as the time we democratized knowledge. Making the sum of human wisdom accessible to everyone, everywhere, for the cost of a few tokens.