πŸ“š K-NN and Decision Trees

These references reinforce the chapter’s storyline on interpretable classifiers by expanding the math behind proximity-based methods, surfacing tuning tips for tree-based models, and curating case studies where clarity is critical for adoption.


Table of Contents

  1. K-Nearest Neighbors Foundations
  2. Decision Trees
  3. Distance Metrics & Feature Scaling
  4. Case Studies and Adoption
  5. Additional Resources

1. K-Nearest Neighbors Foundations

| Resource | Type | Notes | Access |
| --- | --- | --- | --- |
| Nearest Neighbors β€” scikit-learn | Documentation | Covers algorithm choices (KNeighborsClassifier, RadiusNeighborsClassifier), distance metrics, and tree-based acceleration. | https://scikit-learn.org/stable/modules/neighbors.html |
| K-nearest neighbors algorithm β€” Wikipedia | Reference article | Historical origins, mathematical formulation, and common variants (weighted, radius-based). | https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm |
| K-Nearest Neighbors β€” IBM Think | Overview | Business-friendly explanation of strengths, limitations, and data preparation requirements. | https://www.ibm.com/topics/knn |
| Machine Learning Basics with K-NN | Tutorial | Step-by-step example highlighting decision boundaries, hyperparameter selection, and scaling. | https://towardsdatascience.com/machine-learning-basics-with-the-k-nearest-neighbors-algorithm-6a6e71d01761 |
| Bishop, C. M. (2006). Pattern Recognition and Machine Learning | Textbook | Chapter 2 presents nearest-neighbor methods and links them to Bayesian decision theory. | https://www.microsoft.com/en-us/research/publication/pattern-recognition-machine-learning/ |
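The scikit-learn interface referenced above can be exercised in a few lines. This is a minimal sketch, not drawn from any of the listed resources: the iris dataset and the k=5, distance-weighted configuration are illustrative choices, not recommendations.

```python
# Minimal K-NN sketch with scikit-learn; dataset and hyperparameters
# are illustrative placeholders, not tuned choices.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# k=5 with distance weighting; the scikit-learn docs above describe
# alternative algorithms (ball_tree, kd_tree) and distance metrics.
knn = KNeighborsClassifier(n_neighbors=5, weights="distance")
knn.fit(X_train, y_train)
print(f"Test accuracy: {knn.score(X_test, y_test):.3f}")
```

In practice, k and the weighting scheme would be selected by cross-validation rather than fixed as here.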

2. Decision Trees

| Resource | Type | Notes | Access |
| --- | --- | --- | --- |
| Decision Trees β€” scikit-learn | Documentation | Explains CART splits, impurity measures, cost-complexity pruning, and feature importance. | https://scikit-learn.org/stable/modules/tree.html |
| Decision tree β€” Wikipedia | Reference article | Surveys ID3, C4.5, CART, and includes algorithmic pseudocode and applications. | https://en.wikipedia.org/wiki/Decision_tree |
| Decision Tree Algorithm Explained | Tutorial | Visual walkthrough of split criteria, pruning strategies, and implementation tips. | https://www.analyticsvidhya.com/blog/2021/08/decision-tree-algorithm/ |
| Decision Region Visualizer β€” MLxtend | Tool | Ready-to-use plotting utility for showcasing K-NN vs. tree boundaries in 2D. | http://rasbt.github.io/mlxtend/user_guide/plotting/plot_decision_regions/ |
| Breiman, L., Friedman, J., Olshen, R., & Stone, C. (1984). Classification and Regression Trees | Monograph | Definitive treatment of tree induction, pruning, and interpretability. | https://doi.org/10.1201/9781315139470 |
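The cost-complexity pruning covered in the scikit-learn documentation (and introduced in Breiman et al.) can be tried directly. A minimal sketch, assuming the breast-cancer dataset bundled with scikit-learn as a stand-in; the particular alpha chosen from the pruning path is purely illustrative.

```python
# Sketch of CART cost-complexity pruning; dataset and alpha choice
# are illustrative assumptions, not recommendations.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# An unpruned CART tree typically grows deep and overfits.
full = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Compute the pruning path, pick an aggressive alpha, and refit:
# larger ccp_alpha trades accuracy for a smaller, more readable tree.
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X_train, y_train)
alpha = path.ccp_alphas[-3]  # near the end of the path => heavy pruning
pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=alpha).fit(X_train, y_train)

print("full leaves:", full.get_n_leaves(), "pruned leaves:", pruned.get_n_leaves())
```

In a real workflow, alpha would be chosen by cross-validating over `path.ccp_alphas`, as the scikit-learn user guide demonstrates.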

3. Distance Metrics & Feature Scaling

| Resource | Focus | Access |
| --- | --- | --- |
| Pairwise distances β€” scikit-learn | Catalog of supported metrics (Euclidean, cosine, Mahalanobis) with usage notes. | https://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise_distances.html |
| Feature scaling and normalization β€” scikit-learn | Highlights StandardScaler, MinMaxScaler, and pipelines for consistent preprocessing. | https://scikit-learn.org/stable/modules/preprocessing.html#preprocessing-scaler |
| The Curse of Dimensionality β€” StatQuest | Animated explainer detailing why high dimensionality degrades K-NN performance. | https://www.youtube.com/watch?v=mDaxrKL2iWw |
| Dimensionality reduction β€” Wikipedia | Overview of PCA, ICA, and manifold learning to mitigate distance concentration. | https://en.wikipedia.org/wiki/Dimensionality_reduction |
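The interaction between feature scaling and distance-based methods noted above can be shown concretely. A small sketch with made-up toy data (the age-vs-income scales are invented for illustration): without standardization, the large-scale feature dominates Euclidean distance, and a pipeline keeps the scaler and classifier fitted together.

```python
# Illustrative toy data: two features on very different scales.
import numpy as np
from sklearn.metrics import pairwise_distances
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X = np.array([[25.0, 40_000.0],
              [27.0, 41_000.0],
              [26.0, 90_000.0]])
y = np.array([0, 0, 1])

# On raw features, Euclidean distance is dominated by the income column:
# points 0 and 1 look close only because their incomes nearly match.
d_raw = pairwise_distances(X, metric="euclidean")

# After standardization, both features contribute comparably.
d_std = pairwise_distances(StandardScaler().fit_transform(X), metric="euclidean")

# A pipeline applies the identical scaling at fit and predict time.
pipe = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=1))
pipe.fit(X, y)
print(pipe.predict([[26.0, 88_000.0]]))
```

The same pipeline pattern extends to the metric choices cataloged in the pairwise-distances documentation (e.g. passing metric="cosine" to the classifier).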

4. Case Studies and Adoption

| Resource | Highlight | Access |
| --- | --- | --- |
| K-NN for cancer classification | Demonstrates benign vs. malignant tumor detection with K-NN on the WBCD dataset. | https://link.springer.com/article/10.1007/s10916-012-9833-7 |
| Decision trees in medical diagnosis | Evaluates tree-based decision support for clinical protocols and transparency. | https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/1472-6947-9-11 |
| Clinical decision support systems β€” WHO | WHO guidance on rule-based systems echoing tree logic in reproductive health. | https://www.who.int/publications/i/item/clinical-decision-support-systems-for-family-planning |
| Machine Learning for Healthcare β€” MIT OCW | Graduate-level lectures featuring K-NN and tree applications with medical datasets. | https://ocw.mit.edu/courses/hst-953j-machine-learning-for-healthcare-spring-2019/ |

5. Additional Resources

| Resource | Focus | Access |
| --- | --- | --- |
| scikit-learn example gallery β€” classification | Curated notebooks comparing K-NN, trees, ensembles, and evaluation visuals. | https://scikit-learn.org/stable/auto_examples/index.html#classification |
| Google ML Crash Course | Interactive modules covering supervised learning best practices. | https://developers.google.com/machine-learning/crash-course |
| Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning | Chapters 13–14 deepen K-NN and tree theory with medical examples. | https://hastie.su.domains/ElemStatLearn/ |

Note: All links were re-checked in October 2025. For licensed material, use institutional access or open repositories.