Backpropagation Algorithm
Backpropagation is the key algorithm that enables neural networks to learn from data. It efficiently computes gradients by propagating errors backward through the network, allowing us to update weights to minimize prediction errors.
In civil engineering applications, backpropagation is used to train networks for tasks such as:
• Structural analysis: Learn relationships between loads and structural responses
• Material modeling: Predict material properties from composition and processing parameters
• Geotechnical prediction: Estimate soil behavior from test measurements
• Infrastructure monitoring: Detect anomalies from sensor data patterns
This demo shows how gradients flow backward through layers and how weights are updated to reduce prediction error.
How to Use This Demo:
• Set input values and target output to define the learning task
• Adjust learning rate to control the step size for weight updates
• Click "Step Optimization" to perform one iteration of backpropagation
• Use "Reset Weights" to start with new random weights
• Hover over table entries to highlight corresponding network connections
• Watch the loss decrease over multiple optimization steps
Network Configuration:
• Architecture: 2 inputs → 3 hidden (ReLU) → 1 output (linear)
• Loss function: Mean Squared Error (MSE)
• Optimization: Gradient descent with adjustable learning rate (a minimal parameter-setup sketch follows this list)
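To make this configuration concrete, the parameter shapes for the 2 → 3 → 1 network can be set up as in the sketch below. This is a minimal NumPy sketch; the variable names, Gaussian initialization scale, and fixed seed are illustrative assumptions, not the demo's actual initialization.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible (assumption)

# 2 inputs -> 3 hidden units (ReLU) -> 1 output (linear)
W1 = rng.normal(scale=0.5, size=(3, 2))  # hidden weights: (hidden units, inputs)
b1 = np.zeros(3)                         # hidden biases
W2 = rng.normal(scale=0.5, size=(1, 3))  # output weights: (outputs, hidden units)
b2 = np.zeros(1)                         # output bias

print(W1.shape, b1.shape, W2.shape, b2.shape)  # (3, 2) (3,) (1, 3) (1,)
```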
Network Visualization:
• Blue nodes: Input layer
• Purple nodes: Hidden layer with ReLU activation
• Green node: Output layer with linear activation
• Node values: Show current activations after forward pass
• Connection thickness: Represents weight magnitudes
Parameter Tables:
• Weights table: Current weight values for all connections
• Gradients table: Computed gradients for each weight (used for updates)
• Color coding: Positive values in green, negative in red
• Hover highlighting: Shows which connection each table entry represents
Loss History Graph:
• Tracks MSE loss over optimization steps
• Shows learning progress as loss decreases
• Illustrates effect of learning rate on convergence
Mathematical Foundation:
Backpropagation uses the chain rule to compute gradients layer by layer:
$$\frac{\partial L}{\partial w_{ij}} = \frac{\partial L}{\partial a_j} \cdot \frac{\partial a_j}{\partial z_j} \cdot \frac{\partial z_j}{\partial w_{ij}}$$
where $L$ is the loss, $w_{ij}$ is the weight from neuron $i$ to neuron $j$, $z_j$ is the pre-activation of neuron $j$, and $a_j$ is its post-activation.
Forward Pass:
• Hidden layer: $z_h = W_1 x + b_1$, $a_h = \text{ReLU}(z_h)$
• Output layer: $z_o = W_2 a_h + b_2$, $y = z_o$ (linear)
• Loss: $L = \frac{1}{2}(y - t)^2$, where $t$ is the target (a NumPy sketch of this forward pass follows the list)
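A minimal NumPy sketch of this forward pass is shown below. It reuses the parameter shapes from the earlier setup sketch; the example input and target values are arbitrary illustrations, not the demo's defaults.

```python
import numpy as np

def forward(x, t, W1, b1, W2, b2):
    """Forward pass through the 2-3-1 network; returns the loss and cached values."""
    z_h = W1 @ x + b1                  # hidden pre-activations
    a_h = np.maximum(z_h, 0.0)         # ReLU activation
    y = W2 @ a_h + b2                  # linear output (y = z_o)
    loss = 0.5 * ((y - t) ** 2).sum()  # L = 1/2 (y - t)^2
    return loss, (z_h, a_h, y)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(scale=0.5, size=(3, 2)), np.zeros(3)
W2, b2 = rng.normal(scale=0.5, size=(1, 3)), np.zeros(1)
x, t = np.array([1.0, 0.5]), np.array([2.0])  # illustrative input and target
loss, _ = forward(x, t, W1, b1, W2, b2)
print(f"loss = {loss:.4f}")
```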
Backward Pass:
• Output gradients: $\frac{\partial L}{\partial W_2} = (y-t)\, a_h^\top$ (and $\frac{\partial L}{\partial b_2} = y - t$)
• Hidden gradients: $\frac{\partial L}{\partial W_1} = \frac{\partial L}{\partial a_h} \cdot \frac{\partial a_h}{\partial z_h} \cdot x^\top$, with $\frac{\partial L}{\partial a_h} = W_2^\top (y-t)$ and $\frac{\partial a_h}{\partial z_h} = \text{ReLU}'(z_h)$
• Weight updates: $w \leftarrow w - \eta \frac{\partial L}{\partial w}$ for every weight and bias (a complete one-step sketch follows this list)
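The sketch below strings the forward pass, backward pass, and weight update together into one step, using the same assumed shapes and example values as above; bias gradients, which the bullet list only summarizes, follow the same pattern.

```python
import numpy as np

def backprop_step(x, t, W1, b1, W2, b2, lr=0.05):
    """One forward pass, one backward pass, and one gradient-descent update (in place)."""
    # Forward pass
    z_h = W1 @ x + b1
    a_h = np.maximum(z_h, 0.0)
    y = W2 @ a_h + b2
    loss = 0.5 * ((y - t) ** 2).sum()

    # Backward pass (chain rule, layer by layer)
    dL_dy = y - t                    # dL/dy for L = 1/2 (y - t)^2
    dL_dW2 = np.outer(dL_dy, a_h)    # (y - t) a_h^T
    dL_db2 = dL_dy
    dL_da_h = W2.T @ dL_dy           # error signal reaching the hidden activations
    dL_dz_h = dL_da_h * (z_h > 0.0)  # ReLU'(z_h): 1 where z_h > 0, else 0
    dL_dW1 = np.outer(dL_dz_h, x)
    dL_db1 = dL_dz_h

    # Gradient-descent update: w <- w - lr * dL/dw
    W1 -= lr * dL_dW1; b1 -= lr * dL_db1
    W2 -= lr * dL_dW2; b2 -= lr * dL_db2
    return loss

rng = np.random.default_rng(0)
W1, b1 = rng.normal(scale=0.5, size=(3, 2)), np.zeros(3)
W2, b2 = rng.normal(scale=0.5, size=(1, 3)), np.zeros(1)
x, t = np.array([1.0, 0.5]), np.array([2.0])
for step in range(5):
    print(f"step {step}: loss = {backprop_step(x, t, W1, b1, W2, b2):.4f}")
```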
Learning Rate Guidelines:
• Start with learning rates around 0.01-0.1
• Too high: Loss may oscillate or diverge
• Too low: Very slow convergence
• Observe the loss curve to assess whether the rate is appropriate (a small comparison sketch follows this list)
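One way to see these effects is the small comparison loop below, which repeats the gradient-descent step from the previous sketch with a very small, a moderate, and a deliberately large learning rate on a single fixed example. The specific rates, step count, and example values are illustrative assumptions.

```python
import numpy as np

def train(lr, steps=25, seed=0):
    """Train the 2-3-1 network on one fixed example; return the loss after `steps` updates."""
    rng = np.random.default_rng(seed)
    W1, b1 = rng.normal(scale=0.5, size=(3, 2)), np.zeros(3)
    W2, b2 = rng.normal(scale=0.5, size=(1, 3)), np.zeros(1)
    x, t = np.array([1.0, 0.5]), np.array([2.0])
    loss = np.inf
    for _ in range(steps):
        z_h = W1 @ x + b1
        a_h = np.maximum(z_h, 0.0)
        y = W2 @ a_h + b2
        loss = 0.5 * ((y - t) ** 2).sum()
        dL_dy = y - t
        dL_dz_h = (W2.T @ dL_dy) * (z_h > 0.0)
        W2 -= lr * np.outer(dL_dy, a_h); b2 -= lr * dL_dy
        W1 -= lr * np.outer(dL_dz_h, x); b1 -= lr * dL_dz_h
    return loss

for lr in (0.001, 0.05, 2.0):  # too low, moderate, too high
    print(f"lr = {lr}: loss after 25 steps = {train(lr):.4g}")
```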
Training Observations:
• Watch how gradients propagate backward through layers
• Notice that output-layer gradients are typically larger in magnitude than those in the hidden layer
• Hidden layer gradients depend on both forward activations and backward error signals
• ReLU sets the gradient to zero for units with negative pre-activations, which can leave them as "dead neurons" (illustrated in the snippet after this list)
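The last point can be seen directly in a few lines: when a hidden unit's pre-activation is negative, the ReLU derivative is zero and the backward error signal is blocked at that unit. The numbers below are purely illustrative.

```python
import numpy as np

z_h = np.array([0.8, -0.3, 1.2])       # hidden pre-activations; the second unit is negative
upstream = np.array([0.5, 0.5, 0.5])   # error signal arriving from the output layer
relu_grad = (z_h > 0.0).astype(float)  # ReLU'(z_h): 1 where z_h > 0, else 0
print(upstream * relu_grad)            # [0.5 0.  0.5] -- the inactive unit receives no update
```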
Practical Insights:
• Multiple steps are usually needed to reach a good solution
• Different input-target pairs will produce different gradient patterns
• Weight initialization affects convergence speed and final solution
• Real applications use mini-batches and more advanced optimizers such as Adam or RMSprop (a minimal Adam sketch follows this list)
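For reference, the sketch below shows a single Adam-style update for one weight matrix, written in the standard textbook form with its usual default hyperparameters. This is not how the demo itself updates weights, and the example values are arbitrary.

```python
import numpy as np

def adam_update(w, grad, m, v, step, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: moving averages of the gradient and its square, bias-corrected."""
    m = beta1 * m + (1 - beta1) * grad           # first moment (mean of gradients)
    v = beta2 * v + (1 - beta2) * grad ** 2      # second moment (mean of squared gradients)
    m_hat = m / (1 - beta1 ** step)              # bias correction for early steps
    v_hat = v / (1 - beta2 ** step)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)  # per-parameter adaptive step size
    return w, m, v

# Usage: keep m and v alongside each weight matrix, starting from zeros, and count steps from 1
w = np.array([[0.2, -0.4, 0.1]])
m, v = np.zeros_like(w), np.zeros_like(w)
grad = np.array([[0.1, -0.3, 0.0]])
w, m, v = adam_update(w, grad, m, v, step=1)
print(w)
```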
Engineering Applications:
• Start simple: Small networks are often sufficient for many civil engineering problems
• Monitor training: Always track loss to ensure learning is progressing
• Validation data: Use separate data to check generalization
• Feature engineering: Good input features are often more important than complex architectures