| Epoch | Errors | $\beta_0$ (bias) | $\beta_1$ | $\beta_2$ | Updates |
|---|---|---|---|---|---|
The Perceptron Algorithm is a supervised learning algorithm for binary linear classifiers. Given input $\mathbf{x} = (x_1, x_2)$ and weights $\boldsymbol{\beta} = (\beta_1, \beta_2)$ with bias $\beta_0$, the perceptron computes a weighted sum:
$$\mu = \beta_0 + \beta_1 x_1 + \beta_2 x_2 = \beta_0 + \boldsymbol{\beta}^T\mathbf{x}$$
The activation function applies a step function to produce binary output:
$$h(\mu) = \begin{cases} 1 & \text{if } \mu \geq 0 \\ 0 & \text{if } \mu < 0 \end{cases}$$
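The weighted sum and step activation can be sketched as a small function; the weight values in the usage line below are illustrative, not from the notes:

```python
import numpy as np

def predict(x, beta, beta0):
    """Perceptron forward pass: weighted sum, then step activation."""
    mu = beta0 + np.dot(beta, x)   # mu = beta0 + beta^T x
    return 1 if mu >= 0 else 0     # h(mu): 1 if mu >= 0, else 0

# Example with assumed weights: mu = 0.1 + 0.5*1.0 - 0.25*2.0 = 0.1 >= 0
print(predict(np.array([1.0, 2.0]), np.array([0.5, -0.25]), 0.1))  # -> 1
```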
Geometrically, the perceptron defines a decision boundary (a hyperplane) that separates the input space: points on one side are classified as class 1, points on the other as class 0. The equation $\beta_0 + \beta_1 x_1 + \beta_2 x_2 = 0$ defines this boundary line.
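To make the boundary concrete, here is a sketch with hypothetical weights $\beta_0 = -1$, $\boldsymbol{\beta} = (1, 1)$, so the boundary is the line $x_1 + x_2 = 1$:

```python
import numpy as np

# Hypothetical weights: decision boundary is x1 + x2 - 1 = 0
beta0, beta = -1.0, np.array([1.0, 1.0])

def classify(x):
    return 1 if beta0 + beta @ x >= 0 else 0

print(classify(np.array([1.0, 1.0])))  # above the line -> 1
print(classify(np.array([0.0, 0.0])))  # below the line -> 0
```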
The Learning Rule: For each training example $(\mathbf{x}^{(i)}, y^{(i)})$, the perceptron computes the prediction $\hat{y}^{(i)} = h(\mu^{(i)})$ and the error $e^{(i)} = y^{(i)} - \hat{y}^{(i)}$. If the prediction is correct ($e = 0$), no update is made. If incorrect ($e = \pm 1$), the weights are updated:
$$\beta_0 \leftarrow \beta_0 + \eta \cdot e^{(i)}$$
$$\boldsymbol{\beta} \leftarrow \boldsymbol{\beta} + \eta \cdot e^{(i)} \cdot \mathbf{x}^{(i)}$$
where $\eta > 0$ is the learning rate controlling the step size. Geometrically, the bias update shifts the decision boundary and the weight update rotates it, in each case moving it toward correctly classifying the misclassified point.
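A single update step, following the rule above, might look like this (the example values $\eta = 0.5$, $\mathbf{x} = (2, 1)$, $y = 0$, and zero initial weights are assumed for illustration):

```python
import numpy as np

eta = 0.5                              # learning rate (assumed value)
beta0, beta = 0.0, np.zeros(2)         # start from zero weights

x, y = np.array([2.0, 1.0]), 0         # one training example
mu = beta0 + beta @ x                  # mu = 0
y_hat = 1 if mu >= 0 else 0            # step function -> 1 (misclassified)
e = y - y_hat                          # e = 0 - 1 = -1

beta0 += eta * e                       # beta0 <- 0 + 0.5 * (-1) = -0.5
beta  += eta * e * x                   # beta  <- (-1.0, -0.5)
print(beta0, beta)
```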
Training proceeds in epochs: each epoch processes all training examples once. The algorithm converges when an entire epoch produces zero errors. The Perceptron Convergence Theorem guarantees convergence in finite steps if the data is linearly separable. For non-separable data, the perceptron will never converge and continues updating indefinitely.
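Putting the pieces together, the epoch loop can be sketched on a small linearly separable dataset; the logical AND function and the values $\eta = 1$ and 100 maximum epochs are assumptions for this toy example, not from the notes:

```python
import numpy as np

# Toy linearly separable data: the logical AND function (assumed example)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1])

eta = 1.0                                # learning rate (assumed value)
beta0, beta = 0.0, np.zeros(2)

for epoch in range(100):                 # cap epochs as a safety net
    errors = 0
    for xi, yi in zip(X, y):
        y_hat = 1 if beta0 + beta @ xi >= 0 else 0
        e = yi - y_hat
        if e != 0:                       # update only on mistakes
            beta0 += eta * e
            beta  += eta * e * xi
            errors += 1
    if errors == 0:                      # convergence: a full clean epoch
        break

preds = [1 if beta0 + beta @ xi >= 0 else 0 for xi in X]
print(preds)  # -> [0, 0, 0, 1]
```

Because the loop only exits early after an epoch with zero errors, the final predictions match the labels exactly, as the convergence theorem guarantees for separable data.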
Developed by Kevin Yu & Panagiotis Angeloudis