Step 2 of 16
Why linear regression will not work
A straight regression line can output any value, including numbers below 0 or above 1. Those numbers cannot be interpreted as probabilities.
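As a rough illustration of the point above, the sketch below fits a least-squares line to 0/1 labels and evaluates it at two inputs. The data values are made up for this example (x could be hours studied), and the names x, y, line, low and high are assumptions, not part of the lesson.

```python
import numpy as np

# Hypothetical data: y is fail (0) or pass (1).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

# Fit a least-squares straight line y ≈ slope * x + intercept.
slope, intercept = np.polyfit(x, y, deg=1)

def line(x_new):
    """Evaluate the fitted straight line at x_new."""
    return slope * x_new + intercept

# Evaluate the line at one small and one large input.
low, high = line(0.0), line(10.0)
print(low, high)
```

With this data the line gives a negative value at x = 0 and a value above 1 at x = 10, so neither output can be read as a probability.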
Check understanding
Why is plain linear regression unsuitable here?
- It can predict values outside 0–1
- It is too slow
- It cannot use gradients
Step 9 of 16
Training the model
After many gradient updates, the sigmoid curve fits the data: for low x the predicted probability is near 0 (fail), and for high x it is near 1 (pass).
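A minimal sketch of such a training loop, assuming plain gradient descent on the average log loss. The data, the learning rate of 0.1 and the 5000 iterations are illustrative choices made for this example, not values taken from the lesson.

```python
import numpy as np

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical data: low x fails (0), high x passes (1).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

w, b = 0.0, 0.0        # start from a flat sigmoid (p = 0.5 everywhere)
learning_rate = 0.1    # assumed step size

for _ in range(5000):
    p = sigmoid(w * x + b)           # current predicted probabilities
    grad_w = np.mean((p - y) * x)    # gradient of the average log loss w.r.t. w
    grad_b = np.mean(p - y)          # gradient of the average log loss w.r.t. b
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

p_low = sigmoid(w * 1.0 + b)   # predicted probability at the lowest x
p_high = sigmoid(w * 6.0 + b)  # predicted probability at the highest x
print(p_low, p_high)
```

After the loop, p_low should sit below 0.5 and p_high above it, which matches the fail/pass pattern described above.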
Step 12 of 16
Probability to class
To classify, we threshold the probability: if p ≥ 0.5, predict class 1; otherwise predict class 0.
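The thresholding rule can be sketched in a few lines. The function name to_class and the sample probabilities are hypothetical, chosen only to show what happens on either side of 0.5 and exactly at it.

```python
import numpy as np

def to_class(p, threshold=0.5):
    """Return 1 where p >= threshold, otherwise 0."""
    return (np.asarray(p) >= threshold).astype(int)

# Hypothetical model outputs.
probs = np.array([0.10, 0.49, 0.50, 0.93])
classes = to_class(probs)
print(classes)
```

A probability of exactly 0.5 maps to class 1 because the rule uses ≥ rather than >.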
Check understanding
How do we convert probability to a class?
- Threshold at 0.5
- Always choose 1
- Always choose 0
Step 14 of 16
Poor-fit dataset
On an XOR-style dataset, the labels alternate between 0 and 1 along x. A single sigmoid produces only one decision boundary, so it misclassifies at least one of the alternating sections.
Step 15 of 16
Why it fails
Because logistic regression has a single decision boundary, it cannot follow multiple flips between 0 and 1. The misclassifications remain however long the model is trained.
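One way to check this without training anything is to try every possible single boundary directly. The sketch below assumes a made-up dataset of eight points whose labels flip every two points. For one input x, a thresholded sigmoid behaves like a single cut with a direction, so the code scores every cut position in both directions and keeps the best accuracy.

```python
import numpy as np

# Hypothetical XOR-style data: labels flip every two points along x.
x = np.arange(1.0, 9.0)
y = np.array([0, 0, 1, 1, 0, 0, 1, 1])

# Candidate boundaries: one below all points, one between each
# neighbouring pair, and one above all points.
cuts = np.concatenate(([x[0] - 0.5], (x[:-1] + x[1:]) / 2, [x[-1] + 0.5]))

best_accuracy = 0.0
for cut in cuts:
    rising = (x > cut).astype(int)        # predict 1 to the right of the cut
    for preds in (rising, 1 - rising):    # also try predicting 1 to the left
        accuracy = float(np.mean(preds == y))
        best_accuracy = max(best_accuracy, accuracy)

print(best_accuracy)
```

On this data, no single boundary gets more than 6 of the 8 points right, so training cannot remove the remaining errors.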
Check understanding
Why does one sigmoid fail on XOR?
- Only one boundary
- Too many parameters
- No gradient available