4.2 A Spectrum of Possibilities: K-NN for Multiclass Classification

Introduction

The K-Nearest Neighbors (K-NN) algorithm is one of the simplest and most effective machine learning algorithms for classification tasks. Its logic is intuitive: “tell me who you hang out with, and I’ll tell you who you are.”


Activity

Interactive K-NN Classifier

What to watch for: The K-NN algorithm classifies new data based on its proximity to known data points. It is a form of lazy learning that does not build an explicit model but instead uses the training data directly.

Interactive Demonstration

K-NN Classifier: Cat or Dog?

How to Explore It

  1. Select the value of K: Choose how many nearby neighbors to consider for classification. A small K is more sensitive to noise, while a large K smooths decisions.
  2. Observe the distances: Click any point to see how distances are computed to all training points and which are the K nearest neighbors.
  3. Analyze the classification: The point is classified according to the majority class among its K nearest neighbors. Try different K values to see how the classification changes.
  4. Use the Interaction Mode buttons: Select how to interact with the graph. With 🐱 Add Cats and 🐶 Add Dogs you can click the canvas to add new training points for each class. With 🔍 Classify, clicking the canvas adds an unknown point and the algorithm automatically determines which class it belongs to based on its K nearest neighbors.
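The demo's behavior can be sketched in a few lines of Python. The coordinates below are hypothetical (agility on one axis, sociability on the other), chosen so that the classification of the same query point flips as K grows:

```python
from collections import Counter
import math

# Hypothetical training points: (agility, sociability) -> class
training = [
    ((8.0, 2.0), "cat"), ((7.5, 3.0), "cat"), ((9.0, 1.5), "cat"),
    ((3.0, 8.0), "dog"), ((2.5, 9.0), "dog"), ((4.0, 7.5), "dog"),
    ((6.0, 5.5), "dog"),  # a sociable, fairly agile dog near the mixed zone
]

def classify(point, k):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    by_distance = sorted(training, key=lambda t: math.dist(point, t[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

query = (6.5, 5.0)  # unknown point near the mixed zone
for k in (1, 3, 7):
    print(k, classify(query, k))  # flips: dog (k=1), cat (k=3), dog (k=7)
```

With K=1 the single nearest point (a dog in the mixed zone) decides; with K=3 two cats outvote it; with K=7 all points vote and dogs regain the majority, illustrating how sensitive the result is to K.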

Model Configuration

🎯 Classification Space

Typical Cats: high agility, low sociability
Mixed Zone: high agility, high sociability
Neutral Zone: low agility, low sociability
Typical Dogs: low agility, high sociability

🏷️ Point Legend

🐱 Cats (training)
🐶 Dogs (training)
🔍 Point to classify
📍 Nearest neighbors

πŸ“ Distance Metrics

Euclidean: "Straight-line" distance between points
Manhattan: "City block" distance (|Ξ”X| + |Ξ”Y|)
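Both metrics are easy to compute directly; a minimal sketch in Python:

```python
import math

def euclidean(p, q):
    # Straight-line distance: square root of the sum of squared differences.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    # City-block distance: sum of absolute coordinate differences.
    return sum(abs(a - b) for a, b in zip(p, q))

p, q = (1.0, 2.0), (4.0, 6.0)
print(euclidean(p, q))  # 5.0 (a 3-4-5 right triangle)
print(manhattan(p, q))  # 7.0
```

Manhattan distance never shortcuts diagonally, so it is always greater than or equal to the Euclidean distance between the same two points.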

Dataset Statistics

Total points: 0
Cats: 0 (training feline cases)
Dogs: 0 (training canine cases)
Current K: 3 (neighbors considered)

Core Concepts

How Does K-NN Work?

K-NN is a lazy learning algorithm: it does not build an explicit model during training. Instead, when it needs to classify a new point, it:

  1. Computes the distance between the new point and all training points.
  2. Selects the K nearest neighbors based on that distance.
  3. Assigns the class that is most common among these K neighbors (majority voting).
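The three steps above can be written as a short from-scratch implementation (the function name `knn_classify` and the toy data are illustrative, not from the demo):

```python
from collections import Counter
import math

def knn_classify(query, training, k, distance=math.dist):
    """Classify `query` using a list of (point, label) training pairs."""
    # 1. Compute the distance from the query to every training point.
    distances = [(distance(query, point), label) for point, label in training]
    # 2. Select the K nearest neighbors.
    neighbors = sorted(distances)[:k]
    # 3. Assign the most common class among them (majority voting).
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

training = [((1, 1), "cat"), ((2, 1), "cat"), ((8, 8), "dog"), ((9, 7), "dog")]
print(knn_classify((2, 2), training, k=3))  # cat
```

Note that all the work happens at prediction time: "training" is just storing the labeled points, which is exactly what makes K-NN a lazy learner.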

Key Considerations

  • Value of K: A small K makes the model sensitive to noise, while a large K overly smooths the decision boundaries.
  • Distance metric: Euclidean distance is common, but other metrics may be better suited depending on the problem.
  • Normalization: It's important to normalize features when they have different scales, so that no single feature dominates the distance computation.