
2.2 - Comparing Absolute Error (L1) vs. Mean Squared Error (L2)
Introduction
The choice of error function is crucial in machine learning. This demo will help you visually understand the differences between Mean Absolute Error (L1) and Mean Squared Error (L2), and how each responds differently to outliers.
Activity
Error Comparison: L1 vs. L2
Scenario:
A logistics company needs to predict delivery time based on distance, traffic, and weather. Some routes have incidents or delays that can skew model evaluation.
How to Explore It
- Toggle between L1 and L2: Compare how each metric penalizes errors differently.
- Adjust the data: See how the metrics respond when you modify data points.
- Interpret the differences: L2 grows quadratically with the size of an error, while L1 grows linearly, so L1 changes much less when outliers appear.
What to watch for:
L1 is more robust to outliers (exceptional cases), while L2 penalizes large errors much more. Watch how the error metrics change when you introduce extreme values.
Generate data to compare how L1 and L2 behave. Try "normal" data and data with outliers, run multiple times, and watch how the fitted line and the average error react.
| Method | Average error |
|---|---|
| L1 (Absolute error) | - |
| L2 (Squared error) | - |
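The table above can be filled in with a small script. The sketch below uses hypothetical delivery data (distance as the input, delivery time as the target, since the demo's actual dataset isn't shown): it generates a linear relationship with noise, injects a few outlier routes, fits a least-squares line, and reports both average errors.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical stand-in for the demo's data: delivery time grows
# roughly linearly with distance, plus some noise.
distance = rng.uniform(1, 50, size=100)
time = 10 + 1.5 * distance + rng.normal(0, 5, size=100)

# Inject a few outliers to mimic routes with incidents or delays.
outlier_idx = rng.choice(100, size=5, replace=False)
time_outliers = time.copy()
time_outliers[outlier_idx] += rng.uniform(60, 120, size=5)

def fit_and_score(x, y):
    """Fit a least-squares line and return (MAE, MSE) of its residuals."""
    slope, intercept = np.polyfit(x, y, deg=1)
    residuals = y - (slope * x + intercept)
    mae = np.mean(np.abs(residuals))  # L1: average absolute error
    mse = np.mean(residuals ** 2)     # L2: average squared error
    return mae, mse

for label, y in [("normal", time), ("with outliers", time_outliers)]:
    mae, mse = fit_and_score(distance, y)
    print(f"{label:>13}: MAE = {mae:7.2f}   MSE = {mse:9.2f}")
```

Running it several times with different seeds mirrors the "run multiple times" suggestion: MAE shifts modestly when outliers are added, while MSE jumps sharply.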
Fundamental Concepts
Key Differences
- Mean Absolute Error (L1)
- Averages the absolute differences: MAE = (1/n) Σ |yᵢ − ŷᵢ|. Each error contributes linearly, so it is more robust to outliers.
- Mean Squared Error (L2)
- Averages the squared differences: MSE = (1/n) Σ (yᵢ − ŷᵢ)². Each error contributes quadratically, so it heavily penalizes large errors.
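The linear-versus-quadratic contribution is easy to see numerically. This sketch (with made-up residual values) compares how both metrics react when a single extreme error is added:

```python
import numpy as np

errors = np.array([1.0, 2.0, 1.5, 2.5])   # typical residuals (made up)
errors_out = np.append(errors, 20.0)      # one extreme residual

mae = np.mean(np.abs(errors))             # 1.75
mae_out = np.mean(np.abs(errors_out))     # 5.40
mse = np.mean(errors ** 2)                # 3.375
mse_out = np.mean(errors_out ** 2)        # 82.70

print(f"MAE: {mae:.2f} -> {mae_out:.2f}  ({mae_out / mae:.1f}x)")
print(f"MSE: {mse:.2f} -> {mse_out:.2f}  ({mse_out / mse:.1f}x)")
```

One outlier roughly triples the MAE but inflates the MSE by about 25x, which is exactly the behavior the demo asks you to watch for.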