CS231n: Deep Learning for Computer Vision

Stanford - Spring 2026

What You'll Learn

Deep Learning Fundamentals

Data-driven machine learning
K-Nearest Neighbors (KNN)
Linear classifiers
Softmax loss
Regularization techniques
Gradient Descent & Stochastic Gradient Descent (SGD)
Optimization algorithms: Momentum, AdaGrad, Adam
Learning rate scheduling

Neural Networks

Multi-Layer Perceptrons (MLPs)
Backpropagation
Computing gradients efficiently
Training deep neural networks

Computer Vision with CNNs

Convolutional Neural Networks (CNNs)
Convolution and pooling operations
Feature extraction from images
Transfer learning
Batch Normalization
Famous architectures:
- AlexNet
- VGGNet
- ResNet
- GoogLeNet

Sequence Models & NLP

Recurrent Neural Networks (RNNs)
LSTM and GRU
Language Modeling
Image Captioning
Sequence-to-Sequence Models

Transformers & Attention

Self-Attention Mechanism
Transformer Architecture
Vision Transformers (ViT)
Modern foundation models

Advanced Computer Vision

Object Detection
- YOLO and R-CNN family
- DETR (Transformer-based Detection)
Image Segmentation
- Semantic Segmentation
- Instance Segmentation
- Panoptic Segmentation

Video Understanding

Video Classification
3D CNNs
Two-Stream Networks
Multimodal Video Analysis

Large-Scale AI Training

Distributed Training
Model Parallelism
Data Parallelism
Activation Checkpointing
GPU Utilization Optimization

Self-Supervised Learning

Contrastive Learning
Pretext Tasks
Learning without labels
DINO and modern SSL methods

Generative AI

Variational Autoencoders (VAEs)
Generative Adversarial Networks (GANs)
Autoregressive Models
Diffusion Models (used in modern image generators)

3D Vision

3D Shape Representations
Shape Reconstruction
Neural Implicit Representations
Scene Understanding

Vision + Language

Connecting images and text
Multimodal AI systems
Foundations of systems such as image captioning models and visual assistants

World Models & Future AI

World Modeling
Environment Understanding
Agent-based AI concepts

Responsible AI

Human-Centered AI
AI ethics
Building AI systems that work well with humans

Practical Skills

Python for Deep Learning
NumPy
PyTorch
Training and debugging neural networks
Building an end-to-end computer vision project
Research paper reading and implementation