Gradient descent moves downhill on the loss surface, and regularisation reshapes that surface to keep coefficients in check. L2 (ridge) adds circular penalty contours that gently pull the coefficients toward the origin, while L1 (lasso) adds diamond-shaped contours with sharp corners on the axes. These distinct geometries change the path the optimiser follows: ridge shrinks parameters smoothly, whereas lasso can snap a coefficient to exactly zero when the trajectory reaches a corner of the diamond. In civil engineering models, this geometric perspective clarifies why regularisation stabilises predictions: the penalty makes the fitted coefficients less sensitive to noisy measurements.
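
The contrast between smooth shrinkage and snapping to zero can be seen in a single update step. The sketch below is illustrative only: the function names `ridge_step` and `lasso_step` are hypothetical, and it assumes the L1 case uses a proximal (soft-thresholding) update, which may differ from the rule used by the interactive demo. The data-loss gradient is set to zero so that only the penalty acts.

```python
import numpy as np

def ridge_step(w, grad_loss, lam, lr):
    """One gradient step on loss + lam * ||w||_2^2 (assumed penalty scaling)."""
    return w - lr * (grad_loss + 2.0 * lam * w)

def lasso_step(w, grad_loss, lam, lr):
    """One proximal step on loss + lam * ||w||_1 via soft-thresholding."""
    z = w - lr * grad_loss                      # plain gradient step on the data loss
    return np.sign(z) * np.maximum(np.abs(z) - lr * lam, 0.0)

w = np.array([0.05, 1.0])       # one small and one large coefficient
flat_grad = np.zeros(2)         # pretend the data loss is flat, to isolate the penalty

ridge_w = ridge_step(w, flat_grad, lam=1.0, lr=0.1)
lasso_w = lasso_step(w, flat_grad, lam=1.0, lr=0.1)
print("ridge:", ridge_w)        # both coefficients scaled by the same factor
print("lasso:", lasso_w)        # the small coefficient is clipped to zero
```

Ridge multiplies every coefficient by the same factor, so nothing reaches zero in a finite number of steps. The soft-threshold subtracts a fixed amount from each magnitude, so a coefficient that is already small lands exactly on the axis.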
                    
                    
Use the tabs below to switch between ridge (L2) and lasso (L1) regularisation. Adjust the λ slider to set the penalty strength: values near zero mimic unregularised training, while larger values tighten the constraint. Press Train to animate gradient descent from the initial point. Watch the contour plot: coloured contours show the training loss, grey shapes mark the regularisation penalty, arrows trace each optimisation step, and the parameter readout updates after every iteration. Compare the train and total loss displays to see how much the penalty term contributes to the overall objective.
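
The relationship between the train and total loss displays can be written out as a short sketch. It assumes the training loss is the mean squared error and that the total loss is the training loss plus λ times the penalty. The exact scaling used by the demo is not stated, and `losses` is a hypothetical helper name.

```python
import numpy as np

def losses(X, y, w, lam, penalty="l2"):
    """Return (train_loss, total_loss) for a linear model with coefficients w."""
    residual = X @ w - y
    train = float(np.mean(residual ** 2))       # MSE on the training data
    if penalty == "l2":
        reg = float(np.sum(w ** 2))             # ridge: squared Euclidean norm
    elif penalty == "l1":
        reg = float(np.sum(np.abs(w)))          # lasso: sum of absolute values
    else:
        raise ValueError("penalty must be 'l1' or 'l2'")
    return train, train + lam * reg

# Tiny made-up dataset that w fits exactly, so the train loss is zero
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([2.0, -1.0, 1.0])
w = np.array([2.0, -1.0])

train_l2, total_l2 = losses(X, y, w, lam=0.5, penalty="l2")
train_l1, total_l1 = losses(X, y, w, lam=0.5, penalty="l1")
print(train_l2, total_l2)
print(train_l1, total_l1)
```

Because the coefficients fit this toy data perfectly, the whole gap between the two numbers comes from the penalty term, which is what the two loss displays let you compare.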
                    
                    
Legend
• Loss contours: green → yellow → red indicate increasing training loss.
• Regularisation contours: grey circles for L2, grey diamonds for L1.
• Trail dots: small magenta markers show each gradient step; the large blue dot is the starting point and the large red dot is the final position.
• Arrows: magenta arrows connect successive steps and reveal the descent direction.
• Optimum marker: the green point marks the unregularised optimum (the solution at λ = 0) for reference.
                    
                    
Experiment with λ to see how much regularisation you need. λ = 0 recovers pure least squares, and the trajectory heads for the MSE minimum. Small λ values nudge the coefficients toward the origin without drastically changing the path. Large λ values dominate the update: ridge pulls both coefficients inward together, while lasso may set one coefficient to exactly zero, producing a sparse solution. Compare the final coefficient values between tabs to understand when an L1 or L2 penalty is preferable for civil engineering feature sets.
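
The same experiment can be approximated offline. The sketch below sweeps λ on a small synthetic two-feature problem. The data, the starting point, the learning rate and the `train` helper are all assumptions made for illustration, and the L1 case again uses a proximal (soft-thresholding) step rather than whatever update the demo implements.

```python
import numpy as np

def train(X, y, lam, penalty, w0, lr=0.1, steps=200):
    """Minimise MSE + lam * penalty by (proximal) gradient descent."""
    w = np.array(w0, dtype=float)
    n = len(y)
    for _ in range(steps):
        grad = (2.0 / n) * X.T @ (X @ w - y)    # gradient of the MSE term
        if penalty == "l2":
            w = w - lr * (grad + 2.0 * lam * w) # ridge: penalty gradient is 2*lam*w
        else:
            z = w - lr * grad                   # lasso: step, then soft-threshold
            w = np.sign(z) * np.maximum(np.abs(z) - lr * lam, 0.0)
    return w

# Synthetic data with uncorrelated features and true coefficients (2.0, 0.3)
X = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
y = X @ np.array([2.0, 0.3])
start = [-1.0, 1.5]

results = {}
for lam in (0.0, 0.1, 1.0):
    for penalty in ("l2", "l1"):
        results[(lam, penalty)] = train(X, y, lam, penalty, start)
        print(f"lam={lam:<4} {penalty}: {np.round(results[(lam, penalty)], 3)}")
```

On this toy problem both penalties recover the least-squares solution at λ = 0. At λ = 1, ridge scales both coefficients down to roughly (1.0, 0.15), while lasso keeps the strong coefficient near 1.5 and drives the weak one to exactly zero.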