If you want to actually understand what’s happening inside ChatGPT (or any modern neural network) — not just use it, but get a real mental model of what the math is doing — Grant Sanderson’s Deep Learning series on 3Blue1Brown is the single best place I’ve found. Seven chapters, each one beautifully animated, walking from “what is a neural network, structurally” all the way to “how might an LLM actually store the fact that Michael Jordan plays basketball.”

I’ve found the early chapters (1–4) timeless — they teach the foundations any practitioner needs. Chapters 5–7 are newer (2024) and address what makes today’s LLMs work: transformers, attention, and the under-discussed role of MLPs in storing factual knowledge.

Watch them in order. Take notes. It’s worth it.

— Mark

But what is a Neural Network? (Deep Learning, Chapter 1)
3Blue1Brown, Oct 2017. The single best 18 minutes you can spend understanding what a neural network actually is; start here before you read another article about LLMs.
Gradient descent, how neural networks learn (Deep Learning, Chapter 2)
3Blue1Brown, Oct 2017. Once you've watched Chapter 1, this is how the magic actually happens: gradient descent is the engine that turns "random weights" into "learned model".
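If you want to feel the idea in code before watching, here's a toy sketch of my own (the setup and numbers are mine, not from the video): a single weight fit by repeatedly stepping downhill on the squared error.

```python
import numpy as np

# Toy problem: fit y = w*x to data generated with true w = 3,
# using plain gradient descent on the mean squared error.
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 3.0 * x

w = 0.0    # the "random weights" (here just zero)
lr = 0.1   # learning rate: how big a step to take downhill
for _ in range(200):
    grad = np.mean(2 * (w * x - y) * x)  # d/dw of mean((w*x - y)^2)
    w -= lr * grad                       # step against the gradient

print(round(w, 3))  # prints 3.0
```

The loop is the whole algorithm; everything in real training (minibatches, momentum, Adam) is refinement of that one downhill step.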
What is backpropagation really doing? (Deep Learning, Chapter 3)
3Blue1Brown, Nov 2017. Backpropagation feels like wizardry until you've watched this; afterwards it feels like the only sensible way it could possibly work.
Backpropagation calculus (Deep Learning, Chapter 4)
3Blue1Brown, Nov 2017. The calculus companion to Chapter 3; if you remember any chain rule, this is where backprop stops being magic and starts being algebra.
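To see that "it's just the chain rule" claim concretely, here's a tiny scalar network of my own invention, with the backward pass written out by hand and checked against a numerical derivative:

```python
import numpy as np

# Tiny scalar "network": loss = (sigmoid(w2 * sigmoid(w1 * x)) - y)^2.
# Backprop is the chain rule applied layer by layer, reusing values
# cached during the forward pass.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x, y = 1.5, 0.0
w1, w2 = 0.8, -0.5

# Forward pass, keeping intermediates.
a1 = sigmoid(w1 * x)
a2 = sigmoid(w2 * a1)
loss = (a2 - y) ** 2

# Backward pass: outermost derivative first, multiplying inward.
dloss_da2 = 2 * (a2 - y)
da2_dz2 = a2 * (1 - a2)          # sigmoid'(z) = sigmoid(z)(1 - sigmoid(z))
dloss_dw2 = dloss_da2 * da2_dz2 * a1
dloss_dw1 = dloss_da2 * da2_dz2 * w2 * a1 * (1 - a1) * x

# Sanity check against a finite-difference derivative.
eps = 1e-6
num = ((sigmoid(w2 * sigmoid((w1 + eps) * x)) - y) ** 2 - loss) / eps
print(abs(dloss_dw1 - num) < 1e-4)  # prints True
```

That final check is exactly the kind of thing worth doing once by hand; after that, you'll trust the autodiff libraries for the rest of your life.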
Transformers, the tech behind LLMs (Deep Learning, Chapter 5)
3Blue1Brown, Apr 2024. If you've ever wanted to actually understand what's happening inside ChatGPT instead of just using it, this is where the picture starts to come together.
Attention in transformers, step-by-step (Deep Learning, Chapter 6)
3Blue1Brown, Apr 2024. Attention is the single mechanism that makes modern LLMs work; this video walks through it slowly enough that the elegance of the design clicks.
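For the impatient, the core of the mechanism fits in a dozen lines. This is a minimal single-head scaled dot-product attention, softmax(QK^T / sqrt(d)) V; the shapes and random values are made up for illustration:

```python
import numpy as np

def attention(Q, K, V):
    # Scores: how strongly each query "matches" each key.
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    # Softmax over keys (shifted by the row max for numerical stability).
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output is a weighted mix of the value vectors.
    return weights @ V

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))  # 4 tokens, embedding dimension 8
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out = attention(Q, K, V)
print(out.shape)  # prints (4, 8)
```

The video's real payoff is the *why*: why queries, keys, and values are separate learned projections, and why this lets tokens pass information to each other.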
How might LLMs store facts (Deep Learning, Chapter 7)
3Blue1Brown, Aug 2024. The MLPs inside transformers are the under-discussed half of an LLM; Grant shows how a model might actually remember that "Michael Jordan plays basketball".
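A crude cartoon of that picture in code (entirely my own toy construction, with invented random vectors): one hidden neuron acts as a "key" that fires on a "Michael Jordan" direction, and its output row adds a "plays basketball" direction to the representation.

```python
import numpy as np

# Toy MLP-as-key-value-store: W_in detects a concept direction,
# ReLU gates it, W_out writes the associated fact direction.
d = 16
rng = np.random.default_rng(1)
jordan = rng.normal(size=d); jordan /= np.linalg.norm(jordan)
basketball = rng.normal(size=d); basketball /= np.linalg.norm(basketball)

W_in = jordan[None, :]       # key: responds to the "Michael Jordan" direction
W_out = basketball[:, None]  # value: writes the "basketball" direction

def mlp(x):
    h = np.maximum(W_in @ x, 0.0)  # ReLU: fires only when the key matches
    return W_out @ h               # adds the stored fact to the output

print(np.dot(mlp(jordan), basketball) > 0.5)  # prints True (fact recalled)
print(np.linalg.norm(mlp(-jordan)) < 1e-9)    # prints True (no firing otherwise)
```

Real models are messier, with features spread across many neurons in superposition, which is exactly the wrinkle the chapter ends on.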