Build GPT-2 from Scratch: Architecture, Tokenization, and Training

A first-principles guide to building GPT-2, from tokenization and embeddings to transformer training.

March 28, 2026 · 2 min · 283 words · Brice

Deep Learning Explained with Mathematics: From Linear Regression to LLMs

Build deep learning intuition from first principles, progressing from linear models and backpropagation to transformers, diffusion, and LLMs.

March 8, 2026 · 15 min · 3138 words · Brice