3.3 The Critical Decision: Risk Manager

Introduction

Tuning the decision threshold is essential to how a predictive model performs in practice. Even if a model can flag high-risk cases automatically, its impact ultimately depends on where you set the cut-off: the point at which the cost of acting balances the cost of missed cases. The Risk Manager simulation lets you experiment with that trade-off and see how the cost curve responds in real time.

Activity

Risk Manager: Cost-Aware Threshold Tuning

Scenario: You manage a product and want to reduce churn. A model estimates the probability that each user will drop off, but you must decide the threshold that triggers a reminder or retention message. Setting the threshold too low floods the team with unnecessary actions, whereas setting it too high lets avoidable churn through. The goal is to minimise the total operational cost by balancing reminder spending against losses.

How to Explore It

  1. Inspect the probabilities: Review how the simulated cases distribute across risk levels.
  2. Tune the decision threshold: Move the slider and watch which cases receive reminders versus which are left alone. Observe how true positives and false negatives change.
  3. Compare total cost: Track reminder spending (€5 each) versus loss penalties (€25 each) to locate the minimum operational cost.
What to watch for: Choosing the right threshold converts predicted probabilities into binary actions. Lower thresholds favour sensitivity (fewer missed high-risk cases) while higher thresholds favour specificity (fewer unnecessary reminders). The optimal operating point minimises the combined expense rather than maximising accuracy alone.

Simulator: The Risk Manager

Your mission is to find the cost-minimising decision threshold for an AI model that predicts whether a user will churn. The goal is to minimise the service’s total cost.

How does this work?

The AI Model (simulated)

Imagine a machine learning model that has studied thousands of historical records. For each of the 100 cases below, it estimates the probability of churn. The model is good, but not perfect: users who actually churn tend to receive higher probabilities, though not always.
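To make the idea of a "good, but not perfect" model concrete, here is a minimal sketch of how such scores could be simulated. It is not the simulator's actual generator: the 30% share of positive cases, the Beta distributions, and the name simulate_cases are all assumptions chosen for illustration, with "churn" standing for the event being predicted.

```python
import random

def simulate_cases(n_cases=100, n_churners=30, seed=0):
    """Return a shuffled list of (predicted_probability, actually_churned).

    Assumptions (not from the lesson): 30 of 100 cases churn, churners
    draw their score from Beta(4, 2) (mean 2/3) and everyone else from
    Beta(2, 4) (mean 1/3). The two distributions overlap, so the
    simulated model is informative but imperfect.
    """
    rng = random.Random(seed)
    cases = []
    for i in range(n_cases):
        did_churn = i < n_churners
        if did_churn:
            prob = rng.betavariate(4, 2)
        else:
            prob = rng.betavariate(2, 4)
        cases.append((prob, did_churn))
    rng.shuffle(cases)  # hide the ordering used during generation
    return cases

cases = simulate_cases()
churn_scores = [p for p, c in cases if c]
stay_scores = [p for p, c in cases if not c]
print(f"mean score, churned: {sum(churn_scores) / len(churn_scores):.2f}")
print(f"mean score, stayed:  {sum(stay_scores) / len(stay_scores):.2f}")
```

Because the two score distributions overlap, no threshold separates the groups cleanly, which is what makes the choice of threshold a genuine trade-off.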

Your Role as Manager

The model does not make the final decision: you do! Use the slider to define the risk threshold. If a case’s probability exceeds the threshold, the system marks it as "likely to churn" and sends a reminder (cost: €5).

Game Objective

Find the sweet spot. A very low threshold means many unnecessary reminders. A very high threshold means losing users who were never nudged (cost: €25 per churned user). Watch the "Total Cost" and look for the minimum value!

This is the model's action threshold. If the estimated probability that a user will churn exceeds this threshold, the system sends them a reminder (prediction: Will Churn). Otherwise, it assumes they will stay (prediction: Will Stay). The displayed number is the model's estimated probability, as a percentage, that the user will churn.
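The decision rule described above can be sketched in a few lines. The function name confusion_matrix and the five-user data set are hypothetical; the only thing taken from the text is the rule itself, namely that a reminder is sent when the probability strictly exceeds the threshold.

```python
def confusion_matrix(probs, churned, threshold):
    """Count TP, FN, FP and TN for the rule 'remind if prob > threshold'."""
    counts = {"TP": 0, "FN": 0, "FP": 0, "TN": 0}
    for prob, did_churn in zip(probs, churned):
        send_reminder = prob > threshold  # strict: the probability must exceed it
        if did_churn:
            counts["TP" if send_reminder else "FN"] += 1
        else:
            counts["FP" if send_reminder else "TN"] += 1
    return counts

# Hypothetical toy data: five users, three of whom actually churned.
probs = [0.80, 0.55, 0.50, 0.35, 0.20]
churned = [True, False, True, True, False]

counts = confusion_matrix(probs, churned, threshold=0.50)
print(counts)  # {'TP': 1, 'FN': 2, 'FP': 1, 'TN': 1}
```

Note that the user scored at exactly 0.50 is not reminded, because 0.50 does not exceed the threshold.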

Adjust Decision Threshold: 50%

                    Prediction: Will Churn    Prediction: Will Stay
Reality: Churned    0                         0
Reality: Stayed     0                         0

Economic Outcomes:

Churned Users (FN): 0

Reminder Cost (TP+FP): 0€

Total Cost: 0€

Formula: Total Cost = (No. of Reminders × €5) + (No. of Churned Users × €25)
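As a quick worked example, the formula can be written as a small function. The names total_cost, REMINDER_COST_EUR and CHURN_COST_EUR are illustrative, and the input counts are made up. One reading of the formula worth stating explicitly: only false negatives are charged €25, so the sketch assumes, as the simulator appears to, that a reminded user who would have churned is retained.

```python
REMINDER_COST_EUR = 5   # cost of each reminder sent (TP + FP)
CHURN_COST_EUR = 25     # cost of each churned user who was not reminded (FN)

def total_cost(tp, fp, fn):
    """Total cost = reminders x 5 EUR + missed churners x 25 EUR."""
    reminders = tp + fp
    return reminders * REMINDER_COST_EUR + fn * CHURN_COST_EUR

# Hypothetical counts: 35 reminders sent, 10 churners missed.
example = total_cost(tp=20, fp=15, fn=10)
print(example)  # 35 * 5 + 10 * 25 = 425
```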

[Chart: Total Cost by Threshold]
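A cost-by-threshold curve of this kind can be reproduced by sweeping the threshold and recomputing the total cost at each step. The sketch below uses a hypothetical ten-user data set and a 5-point grid; the helper names cost_at and cost_curve are assumptions, while the €5 and €25 costs come from the simulator.

```python
def cost_at(probs, churned, threshold, reminder_cost=5, churn_cost=25):
    """Total cost when reminders go to every case with prob > threshold."""
    reminders = sum(1 for p in probs if p > threshold)
    missed = sum(1 for p, c in zip(probs, churned) if c and p <= threshold)
    return reminders * reminder_cost + missed * churn_cost

def cost_curve(probs, churned, step_percent=5):
    """Return [(threshold, total_cost), ...] for thresholds 0% to 100%."""
    thresholds = [t / 100 for t in range(0, 101, step_percent)]
    return [(t, cost_at(probs, churned, t)) for t in thresholds]

# Hypothetical toy data: ten users, four of whom actually churned.
probs = [0.95, 0.85, 0.70, 0.60, 0.45, 0.40, 0.30, 0.25, 0.15, 0.05]
churned = [True, True, True, False, True, False, False, False, False, False]

curve = cost_curve(probs, churned)
best_threshold, best_cost = min(curve, key=lambda point: point[1])
print(f"minimum cost {best_cost} EUR at threshold {best_threshold:.0%}")
# minimum cost 25 EUR at threshold 40%
```

At 0% every user is reminded (€50) and at 100% nobody is (€100); the minimum sits in between, just below the lowest-scoring churner.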

Core Concepts

Threshold Tuning

Choosing the right threshold converts predicted probabilities into binary actions. Lower thresholds favour sensitivity (fewer missed high-risk cases) while higher thresholds favour specificity (fewer unnecessary reminders).
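The trade-off can be checked numerically. The following sketch computes sensitivity and specificity at a low and a high threshold on a hypothetical six-user data set; the function name and the data are illustrative only.

```python
def sensitivity_specificity(probs, churned, threshold):
    """Return (sensitivity, specificity) for the rule 'flag if prob > threshold'."""
    tp = fp = fn = tn = 0
    for prob, did_churn in zip(probs, churned):
        flagged = prob > threshold
        if flagged and did_churn:
            tp += 1
        elif flagged:
            fp += 1
        elif did_churn:
            fn += 1
        else:
            tn += 1
    # Sensitivity: share of actual churners flagged. Specificity: share of stayers left alone.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity

# Hypothetical toy data: six users, three of whom actually churned.
probs = [0.9, 0.6, 0.3, 0.7, 0.4, 0.1]
churned = [True, True, True, False, False, False]

low = sensitivity_specificity(probs, churned, 0.20)
high = sensitivity_specificity(probs, churned, 0.65)
print(f"threshold 20%: sensitivity {low[0]:.2f}, specificity {low[1]:.2f}")
print(f"threshold 65%: sensitivity {high[0]:.2f}, specificity {high[1]:.2f}")
# threshold 20%: sensitivity 1.00, specificity 0.33
# threshold 65%: sensitivity 0.33, specificity 0.67
```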

Cost-Aware Evaluation

This simulator balances two concrete costs: €5 per reminder that is sent and €25 for each loss that was not prevented in time. The optimal operating point minimises the combined expense rather than maximising accuracy alone.
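Under two simplifying assumptions, these costs imply a break-even threshold. The assumptions are that the predicted probability p is well calibrated and that a reminder always prevents the loss (which is how the simulator's cost formula treats it). The symbols c_r and c_ℓ, for the reminder and loss costs, are introduced here for illustration.

```latex
\begin{aligned}
\mathbb{E}[\text{cost} \mid \text{remind}] &= c_r = 5 \\
\mathbb{E}[\text{cost} \mid \text{no reminder}] &= p \cdot c_\ell = 25\,p \\
\text{remind} \iff 25\,p > 5 &\iff p > \frac{c_r}{c_\ell} = \frac{5}{25} = 0.2
\end{aligned}
```

In other words, with a 1:5 cost ratio it pays to remind anyone whose churn probability is above roughly 20%. On a finite sample of 100 simulated cases, or with an imperfectly calibrated model, the empirical minimum of the cost curve may sit somewhat away from this value.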

Common Pitfalls
  • Ignoring base rates: the same threshold can be overly aggressive or too lax depending on how common the predicted event actually is.
  • Optimising for accuracy alone can be misleading when the costs of false positives and false negatives differ.
  • Thresholds need periodic recalibration as behaviour or business constraints evolve.
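The second pitfall can be illustrated with a small sketch that compares the accuracy-maximising threshold with the cost-minimising one. The ten-user data set and the function name evaluate are hypothetical; the €5 and €25 costs are the simulator's.

```python
def evaluate(probs, churned, threshold, reminder_cost=5, churn_cost=25):
    """Return (accuracy, total_cost) for the rule 'remind if prob > threshold'."""
    tp = sum(1 for p, c in zip(probs, churned) if c and p > threshold)
    fp = sum(1 for p, c in zip(probs, churned) if not c and p > threshold)
    fn = sum(1 for p, c in zip(probs, churned) if c and p <= threshold)
    tn = len(probs) - tp - fp - fn
    accuracy = (tp + tn) / len(probs)
    cost = (tp + fp) * reminder_cost + fn * churn_cost
    return accuracy, cost

# Hypothetical toy data: ten users, four of whom actually churned.
probs = [0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.05]
churned = [True, True, True, False, False, True, False, False, False, False]

thresholds = [t / 100 for t in range(0, 101, 5)]
results = {t: evaluate(probs, churned, t) for t in thresholds}

best_by_accuracy = max(thresholds, key=lambda t: results[t][0])
best_by_cost = min(thresholds, key=lambda t: results[t][1])

print(best_by_accuracy, results[best_by_accuracy])  # 0.6 (0.9, 40)
print(best_by_cost, results[best_by_cost])          # 0.3 (0.8, 30)
```

On this toy data, the most accurate threshold (60%) costs €40, while a less accurate threshold (30%) costs only €30, because it accepts two extra €5 reminders to avoid one €25 loss.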