Logistic Regression in 2D uses the sigmoid function to map the output of a decision function to a probability between 0 and 1:
$$P(y=1 \mid \mathbf{x}) = \sigma(f(\mathbf{x})) = \frac{1}{1 + e^{-f(\mathbf{x})}}$$
where $f(\mathbf{x})$ is the decision function that depends on the boundary type.
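The sigmoid mapping can be sketched in a few lines of Python (the helper name `sigmoid` is illustrative, not from the demo):

```python
import numpy as np

def sigmoid(z):
    """Map a real-valued decision score z to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Large positive scores map near 1, large negative scores near 0,
# and a score of exactly 0 maps to 0.5 (the decision boundary).
print(sigmoid(0.0))    # 0.5
print(sigmoid(5.0))    # close to 1
print(sigmoid(-5.0))   # close to 0
```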
Linear Decision Boundary: The simplest case uses a linear decision function:
$$z = f(\mathbf{x}) = \theta_0 + \theta_1x_1 + \theta_2x_2$$
The decision boundary occurs where $z = 0$ (where probability equals 0.5), which forms a straight line in 2D space. Points on one side of this line are classified as class 1, points on the other side as class 0. The parameters $\theta_1$ and $\theta_2$ control the orientation (slope) of the boundary, while the bias $\theta_0$ controls its position (intercept).
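A minimal sketch of this classification rule (the parameter values below are hypothetical, chosen so the boundary is the line $x_1 + x_2 = 3$):

```python
def linear_decision(x1, x2, theta0, theta1, theta2):
    """Linear decision function; the boundary is the set of points with z = 0."""
    return theta0 + theta1 * x1 + theta2 * x2

# Illustrative parameters: boundary is the line x1 + x2 = 3
theta0, theta1, theta2 = -3.0, 1.0, 1.0

z = linear_decision(2.0, 2.0, theta0, theta1, theta2)  # 1.0 -> above the line
label = 1 if z > 0 else 0                              # class 1

z0 = linear_decision(0.0, 0.0, theta0, theta1, theta2) # -3.0 -> below the line
label0 = 1 if z0 > 0 else 0                            # class 0
```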
Quadratic Decision Boundary: For non-linearly separable data, we can use quadratic terms to create curved boundaries:
$$z = \theta_0 + \theta_1x_1 + \theta_2x_2 + \theta_{11}x_1^2 + \theta_{22}x_2^2 + \theta_{12}x_1x_2$$
This allows the decision boundary to form ellipses, parabolas, or hyperbolas depending on the learned parameters. The quadratic terms $\theta_{11}x_1^2$ and $\theta_{22}x_2^2$ create curvature along each axis, while the interaction term $\theta_{12}x_1x_2$ allows for rotated or skewed boundaries.
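As an illustration, setting $\theta_0 = -1$, $\theta_{11} = \theta_{22} = 1$, and the remaining parameters to zero yields a circular boundary $x_1^2 + x_2^2 = 1$ (these values are chosen for the example, not learned):

```python
def quadratic_decision(x1, x2, t0, t1, t2, t11, t22, t12):
    """Quadratic decision function: linear, squared, and interaction terms."""
    return t0 + t1*x1 + t2*x2 + t11*x1**2 + t22*x2**2 + t12*x1*x2

# Circular boundary x1^2 + x2^2 = 1: points inside the unit circle
# get z < 0 (class 0), points outside get z > 0 (class 1).
z_in = quadratic_decision(0.5, 0.0, -1, 0, 0, 1, 1, 0)   # -0.75 -> class 0
z_out = quadratic_decision(2.0, 0.0, -1, 0, 0, 1, 1, 0)  #  3.0  -> class 1
```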
Training and Optimization: As with 1D classification, we train using cross-entropy loss, where $h(\mathbf{x}) = \sigma(f(\mathbf{x}))$ denotes the model's predicted probability:
$$J(\boldsymbol{\theta}) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log(h(\mathbf{x}^{(i)})) + (1-y^{(i)})\log(1-h(\mathbf{x}^{(i)}))\right]$$
and optimize using gradient descent. The "Find Optimal Solution" button in this demo runs this optimization for the selected boundary type.
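The training loop can be sketched as batch gradient descent on the cross-entropy loss. This is a minimal illustration, not the demo's actual implementation; for a quadratic boundary you would append $x_1^2$, $x_2^2$, and $x_1x_2$ columns to the feature matrix:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, iters=2000):
    """Batch gradient descent on cross-entropy loss.
    X: (m, n) feature matrix, with a leading column of 1s for the bias theta_0."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        h = sigmoid(X @ theta)        # predicted probabilities h(x)
        grad = X.T @ (h - y) / m      # gradient of the cross-entropy loss
        theta -= lr * grad
    return theta

# Toy linearly separable data in 2D
X_raw = np.array([[0., 0.], [0., 1.], [1., 0.], [2., 2.], [3., 2.], [2., 3.]])
y = np.array([0, 0, 0, 1, 1, 1])
X = np.hstack([np.ones((len(X_raw), 1)), X_raw])   # prepend bias column
theta = fit_logistic(X, y)
preds = (sigmoid(X @ theta) > 0.5).astype(int)     # all training points classified correctly
```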
Practical guidance: Start with linear boundaries for linearly separable data—they are simpler, more interpretable, and less prone to overfitting. Use quadratic (or higher-order) boundaries when data exhibits curved separation patterns that cannot be captured by a straight line. Always validate on separate test data to ensure the model generalizes beyond the training set. For engineering applications where interpretability matters, prefer simpler (linear) boundaries when performance is comparable.
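This linear-versus-quadratic comparison on held-out data can be sketched with scikit-learn (synthetic data and helper choices below are assumptions for illustration): a class-0 blob surrounded by a class-1 ring is not linearly separable, so the quadratic model should clearly win on the test split.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
# Class 0: blob at the origin; class 1: ring of radius 2 around it.
X0 = rng.normal(0.0, 0.5, size=(100, 2))
angles = rng.uniform(0, 2 * np.pi, 100)
X1 = np.c_[2 * np.cos(angles), 2 * np.sin(angles)] + rng.normal(0.0, 0.2, size=(100, 2))
X = np.vstack([X0, X1])
y = np.r_[np.zeros(100), np.ones(100)]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

linear = LogisticRegression().fit(X_tr, y_tr)
quadratic = make_pipeline(PolynomialFeatures(degree=2), LogisticRegression()).fit(X_tr, y_tr)

# Test accuracy: the quadratic boundary can wrap around the blob;
# a straight line cannot, so it stays near chance level.
print(linear.score(X_te, y_te), quadratic.score(X_te, y_te))
```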
Developed by Kevin Yu & Panagiotis Angeloudis