Step 2 of 8
The dataset — two clusters
Our dataset has two clear clusters of points. Each point is labelled as either Class A or Class B, forming distinct groups in the space.
Check understanding
What pattern do you see in the dataset?
- Random scatter
- Two distinct clusters
- Single group
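A minimal sketch of such a dataset, using illustrative hand-picked coordinates (the actual points in the visualisation may differ):

```python
# Two well-separated clusters of 2-D points (coordinates are illustrative).
cluster_a = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (1.2, 1.8)]
cluster_b = [(6.0, 6.0), (6.5, 5.5), (7.0, 6.5), (6.2, 7.0)]

# Attach a class label to each point: (point, label) pairs.
dataset = [(p, "A") for p in cluster_a] + [(p, "B") for p in cluster_b]
print(len(dataset))  # 8 labelled points
```

The `(point, label)` pair format keeps each training example and its class together, which the later steps rely on.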
Step 3 of 8
Classifying with k = 1
With k = 1, the algorithm looks at only the single nearest neighbour. The new point gets classified as whatever class that nearest neighbour belongs to.
Check understanding
How many neighbours does k = 1 consider?
- All neighbours
- Just one
- Three neighbours
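The k = 1 rule can be sketched in a few lines; the dataset coordinates here are illustrative, not the tutorial's actual points:

```python
import math

def nearest_neighbour_class(dataset, query):
    """k = 1: return the class of the single closest training point."""
    point, label = min(dataset, key=lambda item: math.dist(item[0], query))
    return label

dataset = [((1.0, 1.0), "A"), ((2.0, 1.5), "A"),
           ((6.0, 6.0), "B"), ((6.5, 5.5), "B")]
print(nearest_neighbour_class(dataset, (1.4, 1.2)))  # A — closest point is in cluster A
```

With k = 1 there is no voting at all: whichever training point is closest decides the answer outright.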
Step 4 of 8
Classifying with k = 3
With k = 3, the algorithm examines the three nearest neighbours and uses majority voting. If two are Class A and one is Class B, the prediction is Class A.
Check understanding
How does k = 3 make its prediction?
- Uses the closest point only
- Majority vote of 3 neighbours
- Averages all points
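Majority voting over the k closest points can be sketched like this (again with illustrative coordinates, including one Class B point sitting near the query):

```python
import math
from collections import Counter

def knn_predict(dataset, query, k=3):
    """Sort by distance, keep the k closest, and majority-vote their labels."""
    neighbours = sorted(dataset, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

dataset = [((1.0, 1.0), "A"), ((1.5, 2.0), "A"),
           ((2.8, 2.2), "B"), ((6.0, 6.0), "B"), ((6.5, 5.5), "B")]
print(knn_predict(dataset, (2.5, 2.0), k=3))  # A — two of the three nearest are Class A
```

Here the single closest point is Class B, but the two next-closest are Class A, so the 2-to-1 vote goes to A — exactly the situation described above.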
Step 5 of 8
Classifying with k = 7
With k = 7, the algorithm considers the seven nearest neighbours. This larger k value makes the classification more stable and less sensitive to individual outlier points.
Check understanding
What advantage does a larger k provide?
- Faster computation
- More stable predictions
- Worse accuracy
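The stability claim can be demonstrated with a made-up outlier: a lone Class B point planted inside cluster A. With k = 1 the outlier wins; with k = 7 its neighbours outvote it:

```python
import math

def knn_predict(dataset, query, k):
    """Sort by distance, keep the k closest, and majority-vote their labels."""
    neighbours = sorted(dataset, key=lambda item: math.dist(item[0], query))[:k]
    labels = [label for _, label in neighbours]
    return max(set(labels), key=labels.count)

# Cluster A, one stray B point inside it, and cluster B far away (illustrative).
dataset = ([((1.0 + 0.1 * i, 1.0), "A") for i in range(6)]
           + [((1.3, 1.3), "B")]  # outlier sitting inside cluster A
           + [((6.0 + 0.1 * i, 6.0), "B") for i in range(6)])

query = (1.3, 1.28)  # right next to the outlier
print(knn_predict(dataset, query, k=1))  # B — the outlier alone decides
print(knn_predict(dataset, query, k=7))  # A — six surrounding A points outvote it
```

The same query point gets two different answers: k = 1 trusts the outlier completely, while k = 7 averages over enough neighbours that one stray point cannot flip the result.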
Step 6 of 8
Decision boundary with k = 1
The decision boundary shows which regions would be classified as A or B. With k = 1, the boundary is very jagged because it reacts to every single nearby point.
Check understanding
Why is the k = 1 boundary jagged?
- It uses all points
- It reacts to individual points
- It ignores close points
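One way to see the boundary is to classify every cell of a coarse grid. In this sketch (illustrative points, not the tutorial's), a single Class B point placed near cluster A drags the k = 1 boundary toward it:

```python
import math

def knn_predict(dataset, query, k):
    """Classify a query point by majority vote among its k nearest neighbours."""
    neighbours = sorted(dataset, key=lambda item: math.dist(item[0], query))[:k]
    labels = [label for _, label in neighbours]
    return max(set(labels), key=labels.count)

# Cluster A, cluster B, plus one lone B point near A's territory (illustrative).
dataset = [((1.0, 1.0), "A"), ((2.0, 2.0), "A"),
           ((3.0, 1.0), "B"),  # lone B point intruding on cluster A's side
           ((6.0, 6.0), "B"), ((7.0, 5.0), "B"), ((5.0, 7.0), "B")]

# Print the predicted class for each cell of a coarse grid: the A/B pattern
# traces the decision boundary, and the lone point bends it locally.
for y in range(8, -1, -2):
    print("".join(knn_predict(dataset, (x, y), k=1) for x in range(0, 9)))
```

Every grid cell nearer to the lone B point than to any A point is classified B, so the boundary follows that one point instead of the overall cluster shape — that is the jaggedness.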
Step 7 of 8
Decision boundary with k = 7
With k = 7, the boundary becomes much smoother. Instead of following every local variation, it captures the general separation between the two clusters.
Check understanding
What does a smooth boundary indicate?
- Over-fitting to noise
- General pattern recognition
- Random classification
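Smoothness can be checked directly: walk a straight line from one cluster to the other and record the prediction at each step. In this sketch (illustrative points, with a stray B planted between the clusters), k = 7 changes its answer only once along the whole walk:

```python
import math

def knn_predict(dataset, query, k):
    """Classify a query point by majority vote among its k nearest neighbours."""
    neighbours = sorted(dataset, key=lambda item: math.dist(item[0], query))[:k]
    labels = [label for _, label in neighbours]
    return max(set(labels), key=labels.count)

# Clusters around (1, 1) and (6, 6), plus one stray B between them (illustrative).
dataset = ([((1.0 + 0.2 * i, 1.0), "A") for i in range(5)]
           + [((6.0 + 0.2 * i, 6.0), "B") for i in range(5)]
           + [((2.5, 2.5), "B")])  # stray point in the gap between the clusters

# Walk the diagonal from cluster A towards cluster B, predicting at each step.
line = [knn_predict(dataset, (0.5 + 0.25 * t, 0.5 + 0.25 * t), k=7) for t in range(25)]
print("".join(line))  # the label switches from A to B exactly once
```

Even when the walk passes right next to the stray B point, k = 7 still sees five Class A neighbours around it, so the prediction does not flicker — the boundary cuts cleanly between the clusters.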
Step 8 of 8
Comparing k = 1 vs k = 7
A side-by-side comparison shows the key trade-off: a small k is sensitive to local detail (and can overfit), while a large k focuses on broader patterns (and generalizes better). The transition zone between the clusters shows this most clearly.
Check understanding
In the ambiguous middle region, which k is more stable?
- k = 1 (more detail)
- k = 7 (averages neighbours)
- Both are equal
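The trade-off can be made concrete by counting how often the prediction flips along a line through the transition zone. In this sketch (illustrative points, with one stray of each class planted between the clusters), k = 1 carves a separate region around each stray, while k = 7 outvotes them:

```python
import math

def knn_predict(dataset, query, k):
    """Classify a query point by majority vote among its k nearest neighbours."""
    neighbours = sorted(dataset, key=lambda item: math.dist(item[0], query))[:k]
    labels = [label for _, label in neighbours]
    return max(set(labels), key=labels.count)

# Two clusters plus two stray points in the transition zone (illustrative).
dataset = [((1.0, 1.0), "A"), ((1.5, 1.2), "A"), ((1.2, 1.6), "A"), ((2.0, 1.4), "A"),
           ((6.0, 6.0), "B"), ((6.4, 5.8), "B"), ((5.8, 6.3), "B"), ((6.6, 6.5), "B"),
           ((2.8, 2.8), "B"),   # stray B on cluster A's side of the gap
           ((4.3, 4.3), "A")]   # stray A on cluster B's side of the gap

def transitions(k):
    """Scan the diagonal through the transition zone and count label changes."""
    labels = [knn_predict(dataset, (0.5 + 0.25 * t, 0.5 + 0.25 * t), k) for t in range(25)]
    return sum(a != b for a, b in zip(labels, labels[1:]))

print(transitions(1))  # several flips: each stray point owns its own region
print(transitions(7))  # one clean A-to-B change: the strays are outvoted
```

More flips along the same scan line means a more fragmented boundary, which is exactly the instability the small-k side of the trade-off describes.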