Convolutional Neural Networks (CNNs)

This interactive demo shows how Convolutional Neural Networks process and classify handwritten digits from the MNIST dataset. You'll explore the internal workings of a real CNN, watching as data flows through convolutional layers, pooling layers, and fully connected layers.

The demo uses actual PyTorch-trained weights, letting you see how feature maps evolve from random noise to meaningful patterns as the network learns through training epochs.

CNN Architecture Components:

Convolutional Layers: Apply filters to detect features like edges and shapes
Pooling Layers: Reduce spatial dimensions while preserving important information
Feature Maps: Visual representations of what each layer detects
ReLU Activation: Introduces non-linearity by setting negative values to zero
Softmax Output: Converts final layer to probability distribution over 10 digits
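
To make these building blocks concrete, here is a minimal PyTorch sketch of the same operations on a dummy input. The hand-crafted edge filter and random scores are purely illustrative; in the demo the filter weights are learned during training.

```python
import torch
import torch.nn.functional as F

# A dummy 1x1x28x28 "image" (batch, channels, height, width).
x = torch.rand(1, 1, 28, 28)

# A hand-crafted 3x3 vertical-edge filter, for illustration only;
# the demo's filters are learned, not fixed.
edge_filter = torch.tensor([[[[-1., 0., 1.],
                              [-1., 0., 1.],
                              [-1., 0., 1.]]]])

feat = F.conv2d(x, edge_filter, padding=1)   # convolution: respond to vertical edges
feat = F.relu(feat)                          # ReLU: zero out negative responses
feat = F.max_pool2d(feat, kernel_size=2)     # pooling: halve spatial size (28 -> 14)

logits = torch.rand(1, 10)                   # stand-in for final-layer scores
probs = F.softmax(logits, dim=1)             # softmax: probabilities over the 10 digits

print(feat.shape)          # torch.Size([1, 1, 14, 14])
print(probs.sum().item())  # ~1.0 -- probabilities sum to one
```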

Key Concepts:
Hierarchical Learning: Early layers detect simple features, deeper layers combine them into complex patterns
Translation Invariance: CNNs can recognize patterns regardless of position in the image
Parameter Sharing: The same filter weights are reused at every position in the image, which keeps the parameter count small and reduces overfitting (see the quick count after this list)
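
Parameter sharing is easy to quantify: a 3×3 filter has the same nine weights no matter where it is applied, so a conv layer's size does not grow with the image. A quick count, with shapes loosely based on this demo's first conv layer and a hypothetical dense layer producing the same number of outputs:

```python
import torch.nn as nn

conv = nn.Conv2d(in_channels=1, out_channels=4, kernel_size=3)  # like the demo's Conv Layer 1
dense = nn.Linear(28 * 28, 4 * 26 * 26)                         # dense layer with the same output size

conv_params = sum(p.numel() for p in conv.parameters())
dense_params = sum(p.numel() for p in dense.parameters())
print(conv_params)   # 40      (4 filters * 3*3 weights + 4 biases)
print(dense_params)  # 2122640 (every output value gets its own weights)
```
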
How to Use:

• Use the numpad to select MNIST digits and see real CNN processing
• Click "Draw Your Own" to draw custom digits for classification
• Watch feature maps update as data flows through each layer
• Start at epoch 0 (random weights) and step through training
• Click "Train Epoch" to advance one training epoch at a time
• Observe how feature maps evolve as the network learns
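
"Train Epoch" corresponds to one full pass over the training set. A minimal sketch of what one such pass looks like in PyTorch; the loader, batch size, and optimizer here are generic assumptions, not the demo's exact code:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

def train_one_epoch(model, optimizer, device="cpu"):
    # Standard MNIST loader; the demo's own preprocessing may differ.
    loader = DataLoader(
        datasets.MNIST("data", train=True, download=True,
                       transform=transforms.ToTensor()),
        batch_size=64, shuffle=True)
    criterion = nn.CrossEntropyLoss()
    model.train()
    total_loss = 0.0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)  # model is expected to return raw scores (logits)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    return total_loss / len(loader)

# usage: train_one_epoch(model, torch.optim.Adam(model.parameters()))
```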

CNN Architecture:
Input Layer: 28×28 grayscale digit images
Conv Layer 1: 4 filters (3×3) → ReLU activation
Max Pool 1: 2×2 pooling → reduces spatial size
Conv Layer 2: 8 filters (3×3) → ReLU activation
Max Pool 2: 2×2 pooling → further size reduction
Flatten: Convert 2D feature maps to 1D vector
Dense Layer: 128 neurons → ReLU activation
Output Layer: 10 neurons → Softmax (digit probabilities)
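
As a rough guide, the layers listed above translate into a PyTorch module along these lines. Padding choices are an assumption; with no padding the spatial sizes run 28 → 26 (Conv1) → 13 (Pool1) → 11 (Conv2) → 5 (Pool2), so the flattened vector has 8 × 5 × 5 = 200 values.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DemoCNN(nn.Module):
    """Sketch of the architecture described above; details such as padding
    are assumptions, not the demo's exact implementation."""

    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 4, kernel_size=3)   # 28x28 -> 26x26, 4 feature maps
        self.conv2 = nn.Conv2d(4, 8, kernel_size=3)   # 13x13 -> 11x11, 8 feature maps
        self.fc1 = nn.Linear(8 * 5 * 5, 128)          # flattened maps -> 128 neurons
        self.fc2 = nn.Linear(128, 10)                 # 10 digit classes

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)    # Conv1 + ReLU + Pool1
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)    # Conv2 + ReLU + Pool2
        x = torch.flatten(x, 1)                       # 2D feature maps -> 1D vector
        x = F.relu(self.fc1(x))                       # Dense layer
        return self.fc2(x)                            # raw scores; softmax applied for display

model = DemoCNN()
probs = F.softmax(model(torch.rand(1, 1, 28, 28)), dim=1)
print(probs.shape)                                    # torch.Size([1, 10]), rows sum to 1
```
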
Understanding Feature Maps:

Early Layers (Conv1): Look for edge detection and basic shapes
Deeper Layers (Conv2): Watch for more complex patterns and digit components
Brightness = Activation: Brighter pixels indicate stronger feature activation
Training Progress: Feature maps become more structured as training progresses
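
One way to reproduce the feature-map panels outside the demo is to capture intermediate activations with forward hooks and plot them, brighter pixels meaning larger activations. A sketch using a stand-in for the demo's convolutional stack (4 then 8 filters, as listed above); with untrained weights the maps will look like the noisy epoch-0 panels:

```python
import matplotlib.pyplot as plt
import torch
import torch.nn as nn

# Stand-in for the demo's convolutional stack; weights here are random.
model = nn.Sequential(
    nn.Conv2d(1, 4, 3), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(4, 8, 3), nn.ReLU(), nn.MaxPool2d(2),
)

activations = {}

def save_activation(name):
    def hook(module, args, output):
        activations[name] = output.detach()
    return hook

model[0].register_forward_hook(save_activation("conv1"))
model[3].register_forward_hook(save_activation("conv2"))

model(torch.rand(1, 1, 28, 28))        # one forward pass records both layers' outputs

maps = activations["conv1"][0]         # 4 maps of 26x26 from the first conv layer
fig, axes = plt.subplots(1, len(maps), figsize=(8, 2))
for ax, fmap in zip(axes, maps):
    ax.imshow(fmap, cmap="gray")       # brighter pixel = stronger activation
    ax.axis("off")
plt.show()
```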

Observing Learning:
Epoch 0: Random weights produce noisy, meaningless feature maps
Early Epochs (1-5): Basic patterns start to emerge
Later Epochs (10+): Clear, structured features that respond to specific digit characteristics
Final Epochs: Highly specialized features for accurate digit classification
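
The status panel starts at a loss of about 2.303 and 10% accuracy, which is exactly the random-guessing baseline: with ten equally likely classes, accuracy is about 1 in 10 and the cross-entropy loss is -ln(1/10) = ln 10.

```python
import math
print(math.log(10))  # 2.302585... -- expected loss when every digit is predicted with probability 0.1
```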

Drawing Tips:
• Draw digits clearly in the center of the canvas
• Use bold strokes; thin lines may not be detected well
• Try different digit styles to see how the network responds
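
Centering and bold strokes matter because MNIST digits are size-normalized and centered in a 28×28 grid, and the network has only seen that distribution. A sketch of the kind of preprocessing a drawn canvas typically needs before classification; the resizing and scaling choices here are assumptions, not the demo's exact pipeline:

```python
import numpy as np
from PIL import Image
import torch

def preprocess(canvas: np.ndarray) -> torch.Tensor:
    """canvas: HxW array, white strokes on a black background, values in [0, 255]."""
    img = Image.fromarray(canvas.astype(np.uint8)).resize((28, 28))  # match MNIST resolution
    x = torch.from_numpy(np.array(img, dtype=np.float32)) / 255.0    # scale to [0, 1], as ToTensor() does
    return x.unsqueeze(0).unsqueeze(0)                               # shape (1, 1, 28, 28) for the CNN
```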

[Interactive demo panels: the input image and current prediction, a CNN architecture diagram, feature maps after Conv1 (4 filters), Pool1, Conv2 (8 filters), and Pool2, the FC1 activations (128 units), output probabilities for digits 0-9, and a training status line (epoch 0/20, loss 2.303, accuracy 10.0% before training).]