Backpropagation Algorithm
Backpropagation is the key algorithm that enables neural networks to learn from data. It efficiently computes gradients by propagating errors backward through the network, allowing us to update weights to minimize prediction errors.

In civil engineering applications, backpropagation trains networks for tasks such as:
• Structural analysis: Learn relationships between loads and structural responses
• Material modeling: Predict material properties from composition and processing parameters
• Geotechnical prediction: Estimate soil behavior from test measurements
• Infrastructure monitoring: Detect anomalies from sensor data patterns

This demo shows how gradients flow backward through layers and how weights are updated to reduce prediction error.
How to Use This Demo:
• Set input values and target output to define the learning task
• Adjust learning rate to control the step size for weight updates
• Click "Step Optimization" to perform one iteration of backpropagation
• Use "Reset Weights" to start with new random weights
• Hover over table entries to highlight corresponding network connections
• Watch the loss decrease over multiple optimization steps

Network Configuration:
• Architecture: 2 inputs → 3 hidden (ReLU) → 1 output (linear)
• Loss function: Mean Squared Error (MSE)
• Optimization: Gradient descent with adjustable learning rate
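As a reference for the sketches that follow, here is a minimal NumPy setup matching this configuration. The variable names, random initialization, and array shapes are assumptions chosen for illustration; they are not the demo's actual source code.

```python
import numpy as np

rng = np.random.default_rng(0)

# 2 inputs -> 3 hidden (ReLU) -> 1 output (linear)
W1 = rng.normal(0.0, 1.0, size=(3, 2))   # input-to-hidden weights
b1 = np.zeros(3)                          # hidden biases
W2 = rng.normal(0.0, 1.0, size=(1, 3))   # hidden-to-output weights
b2 = np.zeros(1)                          # output bias
```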

Network Visualization:
• Blue nodes: Input layer
• Purple nodes: Hidden layer with ReLU activation
• Green node: Output layer with linear activation
• Node values: Show current activations after forward pass
• Connection thickness: Represents weight magnitudes

Parameter Tables:
• Weights table: Current weight values for all connections
• Gradients table: Computed gradients for each weight (used for updates)
• Color coding: Positive values in green, negative in red
• Hover highlighting: Shows which connection each table entry represents

Loss History Graph:
• Tracks MSE loss over optimization steps
• Shows learning progress as loss decreases
• Illustrates effect of learning rate on convergence

Mathematical Foundation:
Backpropagation uses the chain rule to compute gradients layer by layer:

$$\frac{\partial L}{\partial w_{ij}} = \frac{\partial L}{\partial a_j} \cdot \frac{\partial a_j}{\partial z_j} \cdot \frac{\partial z_j}{\partial w_{ij}}$$

Where $L$ is the loss, $w_{ij}$ is a weight, $z_j$ is pre-activation, and $a_j$ is post-activation.
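Applied to this network, the rule is often grouped into an error term $\delta_j = \partial L / \partial z_j$ for each neuron, so every weight gradient is the product of the activation entering the connection and the error leaving it:

$$\frac{\partial L}{\partial w_{ij}} = \delta_j \, a_i, \qquad \delta_j = \frac{\partial L}{\partial a_j} \cdot \frac{\partial a_j}{\partial z_j}$$

where $a_i$ is the activation feeding into connection $w_{ij}$.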

Forward Pass:
• Hidden layer: $z_h = W_1 x + b_1$, $a_h = \text{ReLU}(z_h)$
• Output layer: $z_o = W_2 a_h + b_2$, $y = z_o$ (linear)
• Loss: $L = \frac{1}{2}(y - t)^2$ where $t$ is target
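A sketch of this forward pass, continuing the NumPy setup above (a single-sample version with illustrative names, not the demo's implementation):

```python
def forward(x, t, W1, b1, W2, b2):
    """Forward pass: return output, loss, and intermediates needed for backprop."""
    z_h = W1 @ x + b1               # hidden pre-activations
    a_h = np.maximum(z_h, 0.0)      # ReLU activation
    y = (W2 @ a_h + b2)[0]          # linear output (scalar)
    loss = 0.5 * (y - t) ** 2       # squared-error loss for one sample
    return y, loss, z_h, a_h
```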

Backward Pass:
• Output gradients: $\frac{\partial L}{\partial W_2} = (y-t) \cdot a_h$
• Hidden gradients: $\frac{\partial L}{\partial W_1} = \frac{\partial L}{\partial a_h} \cdot \frac{\partial a_h}{\partial z_h} \cdot x$, where $\frac{\partial L}{\partial a_h} = (y - t)\,W_2$ (see the sketch after this list)
• Weight updates: $w \leftarrow w - \eta \frac{\partial L}{\partial w}$
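A matching sketch of the backward pass and gradient-descent update, again a minimal single-sample version under the same assumptions:

```python
def backward(x, t, y, z_h, a_h, W2):
    """Backward pass: gradients of L = 0.5*(y - t)^2 for the 2-3-1 network."""
    dL_dy = y - t                            # error at the output
    dW2 = dL_dy * a_h.reshape(1, -1)         # output-layer weight gradients
    db2 = np.array([dL_dy])                  # output bias gradient
    dL_da_h = dL_dy * W2.ravel()             # error propagated to hidden activations
    dL_dz_h = dL_da_h * (z_h > 0.0)          # ReLU derivative: 1 where z_h > 0, else 0
    dW1 = np.outer(dL_dz_h, x)               # hidden-layer weight gradients
    db1 = dL_dz_h                            # hidden bias gradients
    return dW1, db1, dW2, db2

def step(params, grads, lr):
    """One gradient-descent update: w <- w - lr * dL/dw."""
    return [p - lr * g for p, g in zip(params, grads)]
```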

Learning Rate Guidelines:
• Start with learning rates around 0.01-0.1
• Too high: Loss may oscillate or diverge
• Too low: Very slow convergence
• Observe loss curve to assess if rate is appropriate
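One way to see these effects is to run the sketches above for a few steps and compare loss histories; the input, target, step count, and rates below are arbitrary choices for illustration:

```python
x, t = np.array([1.0, 0.5]), 1.0   # illustrative input/target pair

for lr in (0.001, 0.01, 0.1):
    params = [W1.copy(), b1.copy(), W2.copy(), b2.copy()]
    history = []
    for _ in range(50):
        y, loss, z_h, a_h = forward(x, t, *params)
        history.append(loss)
        grads = backward(x, t, y, z_h, a_h, params[2])
        params = step(params, grads, lr)
    print(f"lr={lr}: loss {history[0]:.4f} -> {history[-1]:.4f} after 50 steps")
```

Comparing the printed start and end losses shows how the step size trades off convergence speed against stability, which is the same behavior the loss history graph traces step by step.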

Training Observations:
• Watch how gradients propagate backward through layers
• Notice that output layer gradients are typically larger
• Hidden layer gradients depend on both forward activations and backward error signals
• ReLU activation sets gradients to zero for negative inputs (dead neurons)
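The last point can be checked directly from the ReLU derivative used in the backward-pass sketch above: inputs at or below zero pass no gradient.

```python
z = np.array([-1.2, 0.0, 0.7])
relu_grad = (z > 0.0).astype(float)   # array([0., 0., 1.]): no gradient flows where z <= 0
```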

Practical Insights:
• Multiple steps usually needed to reach good solutions
• Different input-target pairs will produce different gradient patterns
• Weight initialization affects convergence speed and final solution
• Real applications use mini-batches and advanced optimizers (Adam, RMSprop)

Engineering Applications:
• Start simple: Small networks often sufficient for many civil engineering problems
• Monitor training: Always track loss to ensure learning is progressing
• Validation data: Use separate data to check generalization
• Feature engineering: Good input features often more important than complex architectures

[Network diagram (step 0): inputs 1.00 and 0.50, output -0.21, loss 2.4455.]
Network Parameters (example values at step 0)

Weights

Input → Hidden Weights      From I1    From I2
→ H1                         -0.551     -0.823
→ H2                         -0.683     -0.616
→ H3                          0.349     -0.247

Hidden → Output Weights
H1 →                         -1.005
H2 →                         -0.733
H3 →                         -0.939

Gradients (all zero before the first backpropagation step)

Input → Hidden Gradients    From I1    From I2
→ H1                          0.000      0.000
→ H2                          0.000      0.000
→ H3                          0.000      0.000

Hidden → Output Gradients
H1 →                          0.000
H2 →                          0.000
H3 →                          0.000