Logistic Regression in 2D uses the sigmoid function to map any decision function to a probability between 0 and 1:

$$P(y = 1 \mid x_1, x_2) = \sigma(f(x_1, x_2)) = \frac{1}{1 + e^{-f(x_1, x_2)}}$$

where $f(x_1, x_2)$ is the decision function evaluated at the point $(x_1, x_2)$ and $\sigma(z) = \frac{1}{1 + e^{-z}}$ is the sigmoid function.
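As a concrete illustration, here is a minimal Python sketch of this mapping (the function name `sigmoid` and the sample scores are our own, not taken from the demo):

```python
import numpy as np

def sigmoid(z):
    """Map any real-valued score z to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Large positive scores map near 1, large negative scores near 0.
print(sigmoid(np.array([-5.0, 0.0, 5.0])))  # ~[0.0067, 0.5, 0.9933]
```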
Linear Decision Boundary: The simplest case uses a linear decision function:

$$f(x_1, x_2) = w_1 x_1 + w_2 x_2 + b$$

The decision boundary occurs where $f(x_1, x_2) = 0$, i.e. where $P(y = 1 \mid x_1, x_2) = 0.5$; in 2D this is a straight line.
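A minimal sketch of the linear case (the weight and bias values here are illustrative, not learned):

```python
import math

def linear_decision(x1, x2, w1, w2, b):
    """Linear decision function f(x1, x2) = w1*x1 + w2*x2 + b."""
    return w1 * x1 + w2 * x2 + b

# sigmoid(0) = 0.5, so the boundary is exactly where P(y=1) = 0.5.
f = linear_decision(1.0, 2.0, w1=0.5, w2=-0.3, b=0.1)  # illustrative weights
p = 1.0 / (1.0 + math.exp(-f))                         # P(y=1 | x1, x2)
print(f, p)  # f = 0.0, p = 0.5: this point lies on the boundary line
```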
Quadratic Decision Boundary: For non-linearly separable data, we can use quadratic terms to create curved boundaries:

$$f(x_1, x_2) = w_1 x_1 + w_2 x_2 + w_3 x_1^2 + w_4 x_2^2 + w_5 x_1 x_2 + b$$

This allows the decision boundary to form ellipses, parabolas, or hyperbolas depending on the learned parameters. The quadratic terms $x_1^2$, $x_2^2$, and $x_1 x_2$ give the boundary its curvature, while the linear terms and bias control its position and orientation.
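A sketch of the corresponding feature expansion, assuming the quadratic form above (the helper names `quadratic_features` and `quadratic_decision` are hypothetical):

```python
import numpy as np

def quadratic_features(x1, x2):
    """Expand (x1, x2) into the feature vector [x1, x2, x1^2, x2^2, x1*x2]."""
    return np.array([x1, x2, x1**2, x2**2, x1 * x2])

def quadratic_decision(x1, x2, w, b):
    """f(x1, x2) = w . quadratic_features(x1, x2) + b."""
    return float(np.dot(w, quadratic_features(x1, x2)) + b)

# With w = [0, 0, 1, 1, 0] and b = -1, f = x1^2 + x2^2 - 1,
# so the decision boundary f = 0 is the unit circle.
w = np.array([0.0, 0.0, 1.0, 1.0, 0.0])
print(quadratic_decision(1.0, 0.0, w, b=-1.0))  #  0.0 -> on the boundary
print(quadratic_decision(0.0, 0.0, w, b=-1.0))  # -1.0 -> inside (P < 0.5)
```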
Training and Optimization: As with 1D classification, we train using cross-entropy loss:

$$L = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log \hat{p}_i + (1 - y_i) \log(1 - \hat{p}_i) \right]$$

where $\hat{p}_i = \sigma(f(x_1^{(i)}, x_2^{(i)}))$ is the predicted probability for the $i$-th training point,
and optimize using gradient descent. The "Find Optimal Solution" button in this demo uses gradient descent to minimize cross-entropy loss for the selected boundary type.
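For reference, a minimal gradient-descent sketch for this loss, assuming mean cross-entropy and a fixed learning rate (this is not the demo's actual implementation; `train_logistic`, `lr`, and `steps` are our own names and defaults):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic(X, y, lr=0.1, steps=2000):
    """Fit weights w and bias b by gradient descent on cross-entropy.

    X: (N, D) feature matrix, y: (N,) array of labels in {0, 1}.
    """
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(steps):
        p = sigmoid(X @ w + b)           # predicted probabilities
        grad_w = X.T @ (p - y) / n       # dL/dw for the mean cross-entropy
        grad_b = float(np.mean(p - y))   # dL/db
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```

The same loop fits either boundary type: pass the raw inputs as the columns of X for a linear boundary, or the expanded quadratic features for a curved one.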
Practical guidance: Start with linear boundaries for linearly separable data—they are simpler, more interpretable, and less prone to overfitting. Use quadratic (or higher-order) boundaries when data exhibits curved separation patterns that cannot be captured by a straight line. Always validate on separate test data to ensure the model generalizes beyond the training set. For engineering applications where interpretability matters, prefer simpler (linear) boundaries when performance is comparable.
Developed by Kevin Yu & Panagiotis Angeloudis