# K-NN and Decision Trees
These references reinforce the chapter's storyline on interpretable classifiers by expanding the math behind proximity-based methods, surfacing tuning tips for tree-based models, and curating case studies where clarity is critical for adoption.
## Table of Contents
- K-Nearest Neighbors Foundations
- Decision Trees
- Distance Metrics & Feature Scaling
- Case Studies and Adoption
- Additional Resources
## 1. K-Nearest Neighbors Foundations
| Resource | Type | Notes | Access |
|---|---|---|---|
| Nearest Neighbors – scikit-learn | Documentation | Covers estimator choices (KNeighborsClassifier, RadiusNeighborsClassifier), distance metrics, and tree-based acceleration. | https://scikit-learn.org/stable/modules/neighbors.html |
| K-nearest neighbors algorithm – Wikipedia | Reference article | Historical origins, mathematical formulation, and common variants (weighted, radius-based). | https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm |
| K-Nearest Neighbors – IBM Think | Overview | Business-friendly explanation of strengths, limitations, and data preparation requirements. | https://www.ibm.com/topics/knn |
| Machine Learning Basics with K-NN | Tutorial | Step-by-step example highlighting decision boundaries, hyperparameter selection, and scaling. | https://towardsdatascience.com/machine-learning-basics-with-the-k-nearest-neighbors-algorithm-6a6e71d01761 |
| Bishop, C. M. (2006). Pattern Recognition and Machine Learning | Textbook | Chapter 2 presents nearest-neighbor methods and links them to Bayesian decision theory. | https://www.microsoft.com/en-us/research/publication/pattern-recognition-machine-learning/ |
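To make the distance-and-vote mechanics these resources describe concrete, here is a minimal pure-Python sketch of K-NN classification. The toy dataset and the `knn_predict` helper are illustrative assumptions, not taken from any of the references above:

```python
import math
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    # Euclidean distance from x to every training point
    dists = [math.dist(x, xi) for xi in X_train]
    # Indices of the k smallest distances
    nearest = sorted(range(len(dists)), key=dists.__getitem__)[:k]
    # Majority vote over the neighbors' labels
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy 2-D data: two well-separated clusters
X = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]
y = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(X, y, (1.2, 1.4), k=3))  # query near the "a" cluster
print(knn_predict(X, y, (8.7, 8.6), k=3))  # query near the "b" cluster
```

The scikit-learn documentation listed above covers the production-grade version of the same idea, including KD-tree and ball-tree acceleration for large datasets.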
## 2. Decision Trees
| Resource | Type | Notes | Access |
|---|---|---|---|
| Decision Trees – scikit-learn | Documentation | Explains CART splits, impurity measures, cost-complexity pruning, and feature importance. | https://scikit-learn.org/stable/modules/tree.html |
| Decision tree learning – Wikipedia | Reference article | Surveys ID3, C4.5, CART, and includes algorithmic pseudocode and applications. | https://en.wikipedia.org/wiki/Decision_tree_learning |
| Decision Tree Algorithm Explained | Tutorial | Visual walkthrough of split criteria, pruning strategies, and implementation tips. | https://www.analyticsvidhya.com/blog/2021/08/decision-tree-algorithm/ |
| Decision Region Visualizer – MLxtend | Tool | Ready-to-use plotting utility for showcasing K-NN vs. tree boundaries in 2D. | http://rasbt.github.io/mlxtend/user_guide/plotting/plot_decision_regions/ |
| Breiman, L., Friedman, J., Olshen, R., & Stone, C. (1984). Classification and Regression Trees | Monograph | Definitive treatment of tree induction, pruning, and interpretability. | https://doi.org/10.1201/9781315139470 |
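The CART-style split criterion covered by the scikit-learn docs and Breiman et al. can be sketched in a few lines: compute Gini impurity per node and pick the threshold that minimizes the weighted impurity of the children. The `gini` and `best_split` helpers and the toy data are illustrative, and the single-feature search is a simplification of full recursive tree induction:

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(xs, ys):
    """Return the threshold on one feature minimizing the weighted
    Gini impurity of the two child nodes (CART-style greedy search)."""
    best_t, best_score = None, float("inf")
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue  # degenerate split: all points on one side
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score

xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
ys = ["a", "a", "a", "b", "b", "b"]
print(best_split(xs, ys))  # threshold 3.0 separates the classes perfectly
```

Real implementations repeat this search over every feature at every node and then prune; the scikit-learn guide's cost-complexity pruning section (`ccp_alpha`) picks up where this sketch stops.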
## 3. Distance Metrics & Feature Scaling
| Resource | Focus | Access |
|---|---|---|
| Pairwise distances – scikit-learn | Catalog of supported metrics (Euclidean, cosine, Mahalanobis) with usage notes. | https://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise_distances.html |
| Feature scaling and normalization – scikit-learn | Highlights StandardScaler, MinMaxScaler, and pipelines for consistent preprocessing. | https://scikit-learn.org/stable/modules/preprocessing.html#preprocessing-scaler |
| The Curse of Dimensionality – StatQuest | Animated explainer detailing why high dimensionality degrades K-NN performance. | https://www.youtube.com/watch?v=mDaxrKL2iWw |
| Dimensionality reduction β Wikipedia | Overview of PCA, ICA, and manifold learning to mitigate distance concentration. | https://en.wikipedia.org/wiki/Dimensionality_reduction |
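Why scaling matters for K-NN can be shown in a few lines: on raw features, the larger-scale column dominates the Euclidean distance and can even change which neighbor is nearest. This standard-library sketch uses illustrative toy rows and a hand-rolled `zscore` helper standing in for StandardScaler:

```python
import math
import statistics

# Two features on very different scales: a 0-1 ratio and a dollar amount
rows = [(0.1, 1000.0), (0.9, 1010.0), (0.2, 5000.0)]

# On raw features the dollar column dominates, so row 1 looks
# far closer to row 0 (~10) than row 2 does (~4000).
print(math.dist(rows[0], rows[1]))
print(math.dist(rows[0], rows[2]))

def zscore(col):
    """Standardize one column to zero mean and unit (population) stdev."""
    mu, sigma = statistics.mean(col), statistics.pstdev(col)
    return [(v - mu) / sigma for v in col]

# Standardize column-wise, then rebuild the rows
scaled = list(zip(*(zscore(col) for col in zip(*rows))))

# After scaling, both features contribute comparably and the
# nearest neighbor of row 0 flips from row 1 to row 2.
print(math.dist(scaled[0], scaled[1]))
print(math.dist(scaled[0], scaled[2]))
```

In practice the scikit-learn resources above recommend wrapping the scaler and the classifier in a single Pipeline so the same transform is fitted on training data and reused at prediction time.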
## 4. Case Studies and Adoption
| Resource | Highlight | Access |
|---|---|---|
| K-NN for cancer classification | Demonstrates benign vs. malignant tumor detection with K-NN on the WBCD dataset. | https://link.springer.com/article/10.1007/s10916-012-9833-7 |
| Decision trees in medical diagnosis | Evaluates tree-based decision support for clinical protocols and transparency. | https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/1472-6947-9-11 |
| Clinical decision support systems – WHO | WHO guidance on rule-based systems echoing tree logic in reproductive health. | https://www.who.int/publications/i/item/clinical-decision-support-systems-for-family-planning |
| Machine Learning for Healthcare – MIT OCW | Graduate-level lectures featuring K-NN and tree applications with medical datasets. | https://ocw.mit.edu/courses/hst-953j-machine-learning-for-healthcare-spring-2019/ |
## 5. Additional Resources
| Resource | Focus | Access |
|---|---|---|
| scikit-learn example gallery – classification | Curated notebooks comparing K-NN, trees, ensembles, and evaluation visuals. | https://scikit-learn.org/stable/auto_examples/index.html#classification |
| Google ML Crash Course | Interactive modules covering supervised learning best practices. | https://developers.google.com/machine-learning/crash-course |
| Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning | Chapters 13–14 deepen K-NN and tree theory with medical examples. | https://hastie.su.domains/ElemStatLearn/ |
Note: All links were re-checked in October 2025. For licensed material, use institutional access or open repositories.