4.2 A Spectrum of Possibilities: K-NN for Multiclass Classification
Introduction
The K-Nearest Neighbors (K-NN) algorithm is one of the simplest and most effective machine learning algorithms for classification tasks. Its logic is intuitive: "tell me who you hang out with, and I'll tell you who you are."
Activity
Interactive K-NN Classifier
What to watch for:
The K-NN algorithm classifies new data based on its proximity to known data points. It is a form of lazy learning that does not build an explicit model but instead uses the training data directly.
Interactive Demonstration
K-NN Classifier: Cat or Dog?
How to Explore It
- Select the value of K: Choose how many nearby neighbors to consider for classification. A small K is more sensitive to noise, while a large K smooths decisions.
- Observe the distances: Click any point to see how distances are computed to all training points and which are the K nearest neighbors.
- Analyze the classification: The point is classified according to the majority class among its K nearest neighbors. Try different K values to see how the classification changes.
- Use the Interaction Mode buttons: Select how to interact with the graph. With 🐱 Add Cats and 🐶 Add Dogs you can click the canvas to add new training points for each class. With Classify, clicking the canvas adds an unknown point and the algorithm automatically determines which class it belongs to based on its K nearest neighbors.
Model Configuration
Classification Space
Point Legend
- 🐱 Cats (Training)
- 🐶 Dogs (Training)
- Point to classify
- Nearest neighbors
Distance Metrics
- Euclidean: "Straight-line" distance between points
- Manhattan: "City block" distance (|ΔX| + |ΔY|)
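The two metrics can be compared with a short sketch in Python (the function names are illustrative, not part of the demo):

```python
import math

def euclidean(p, q):
    # "Straight-line" distance: square root of the sum of squared differences
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    # "City block" distance: |dX| + |dY| summed over all coordinates
    return sum(abs(a - b) for a, b in zip(p, q))

print(euclidean((0, 0), (3, 4)))  # 5.0
print(manhattan((0, 0), (3, 4)))  # 7
```

Note that for the same pair of points the Manhattan distance is never smaller than the Euclidean one, which can change which neighbors count as "nearest."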
Core Concepts
How Does K-NN Work?
K-NN is a lazy learning algorithm: it does not build an explicit model during training. Instead, when it needs to classify a new point, it:
- Computes the distance between the new point and all training points.
- Selects the K nearest neighbors based on that distance.
- Assigns the class that is most common among these K neighbors (majority voting).
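The three steps above can be sketched as a minimal Python implementation (the function name and the toy cat/dog data are illustrative):

```python
import math
from collections import Counter

def knn_classify(train_points, train_labels, new_point, k=3):
    # 1. Compute the distance from the new point to every training point
    dists = [math.dist(new_point, p) for p in train_points]
    # 2. Select the indices of the K nearest neighbors
    nearest = sorted(range(len(dists)), key=dists.__getitem__)[:k]
    # 3. Majority voting among the K neighbors' labels
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical training data: two well-separated clusters
points = [(1, 1), (1, 2), (2, 1), (6, 6), (6, 7), (7, 6)]
labels = ["cat", "cat", "cat", "dog", "dog", "dog"]
print(knn_classify(points, labels, (2, 2), k=3))  # cat
```

Because all the work happens at query time, classifying each new point costs a distance computation against the entire training set, which is the practical price of lazy learning.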
Key Considerations
- Value of K: A small K makes the model sensitive to noise, while a large K overly smooths the decision boundaries.
- Distance metric: Euclidean distance is common, but other metrics may be better suited depending on the problem.
- Normalization: It's important to normalize features when they have different scales.
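To illustrate why normalization matters, a simple min-max rescaling (an illustrative helper, not part of the demo) maps each feature to the [0, 1] range so that no single large-scale feature dominates the distances:

```python
def min_max_normalize(data):
    # Rescale each feature (column) to [0, 1]: (value - min) / (max - min)
    cols = list(zip(*data))
    mins = [min(c) for c in cols]
    maxs = [max(c) for c in cols]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(row, mins, maxs))
        for row in data
    ]

# Hypothetical features on very different scales, e.g. weight (kg) and height (mm)
raw = [(4.0, 250), (5.0, 300), (30.0, 600)]
print(min_max_normalize(raw))
```

Without this step, the millimeter-scale feature would dwarf the kilogram-scale one in every Euclidean or Manhattan distance, and the neighbors found would reflect only that one feature.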