Overconfidence and Underconfidence: How to Diagnose and Fix

Published on January 1, 2026

Two common calibration failure modes

If you forecast probabilities long enough, you will see one of these patterns:

• overconfidence: your probabilities are more extreme than the outcomes warrant, so your high forecasts resolve YES less often than stated

• underconfidence: your probabilities stay too close to 0.50, so your high forecasts resolve YES more often than stated

Both are forms of miscalibration and both show up clearly in a calibration table.
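To make the table concrete, here is a minimal sketch in Python of how you might bucket forecasts and compare predicted probability to realized frequency. The function name and bucket count are illustrative choices, not a fixed recipe.

from collections import defaultdict

def calibration_table(forecasts, outcomes, n_buckets=10):
    # forecasts: probabilities in [0, 1]; outcomes: matching 0/1 resolutions
    buckets = defaultdict(list)
    for p, y in zip(forecasts, outcomes):
        # assign each forecast to a bucket by its probability
        b = min(int(p * n_buckets), n_buckets - 1)
        buckets[b].append((p, y))
    rows = []
    for b in sorted(buckets):
        pairs = buckets[b]
        avg_p = sum(p for p, _ in pairs) / len(pairs)
        freq = sum(y for _, y in pairs) / len(pairs)
        rows.append((avg_p, freq, len(pairs)))
    return rows  # (avg predicted, realized frequency, count) per bucket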

How to diagnose overconfidence

You are overconfident when the realized frequency falls consistently below your predicted probability in the higher buckets.

Example pattern:

• your 0.80 bucket resolves YES only 0.62 of the time

• your 0.70 bucket resolves YES only 0.55 of the time

This means your “80%” behaves more like “62%”.

How to diagnose underconfidence

You are underconfident when the realized frequency sits consistently above your predicted probability in the higher buckets.

Example pattern:

• your 0.60 bucket resolves YES 0.75 of the time

• your 0.70 bucket resolves YES 0.82 of the time

This means you are too timid: you should push probabilities further from 0.50 when the evidence supports it.
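Taken together, both diagnoses reduce to one comparison per bucket. A minimal sketch, assuming the calibration_table rows from the earlier snippet and a hypothetical tolerance of 0.05:

def diagnose(rows, tolerance=0.05):
    # rows: (avg predicted, realized frequency, count) per bucket
    for avg_p, freq, n in rows:
        if avg_p <= 0.5:
            continue  # the same logic applies, mirrored, below 0.50
        gap = freq - avg_p
        if gap < -tolerance:
            print(f"{avg_p:.2f} bucket resolves {freq:.2f} (n={n}): overconfident")
        elif gap > tolerance:
            print(f"{avg_p:.2f} bucket resolves {freq:.2f} (n={n}): underconfident")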

First check: is it real or just noise?

Before you “fix” anything, check:

• bucket counts (sample size)

• whether deviations are consistent across multiple buckets

• whether you are mixing unlike categories or horizons

If the buckets are tiny, the pattern may be random noise. Use fewer, wider buckets or evaluate over a longer window.
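One rough noise check, sketched under a normal approximation to the binomial: a bucket's realized frequency has a standard error of roughly sqrt(p * (1 - p) / n), so small deviations in small buckets are expected.

import math

def deviation_is_noteworthy(avg_p, freq, n, z=2.0):
    # standard error of a frequency over n outcomes, normal approximation
    se = math.sqrt(avg_p * (1 - avg_p) / n)
    # flag only gaps beyond about z standard errors
    return abs(freq - avg_p) > z * se

For example, a 0.80 bucket with 20 forecasts has a standard error of about 0.09, so resolving YES 0.70 of the time is still within ordinary noise.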

The simplest fix: probability mapping

The fastest practical calibration fix is to apply a consistent mapping from your raw probabilities to a calibrated probability.

Two common approaches:

1) Shrink toward 0.50

This helps overconfidence. You compress extremes toward the middle.

Example idea:

• map 0.90 to 0.80

• map 0.80 to 0.70

• map 0.70 to 0.62

Then re-score your forecasts and check whether calibration improves.
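A minimal sketch of a linear shrink, where the factor k is a hypothetical knob you tune against your own history. With k = 0.75 it reproduces the 0.90 to 0.80 row above; matching all three rows exactly would need a nonlinear map.

def shrink(p, k=0.75):
    # k < 1 compresses probabilities linearly toward 0.50
    return 0.5 + k * (p - 0.5)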

2) Stretch away from 0.50

This helps underconfidence. You increase sharpness by moving probabilities farther from the middle.

Example idea:

• map 0.55 to 0.60

• map 0.60 to 0.70

• map 0.70 to 0.80
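The same linear form with k > 1 gives the stretch; clipping keeps the output a valid probability. With k = 2, a hypothetical choice, 0.55 maps to 0.60 and 0.60 maps to 0.70, as in the first two rows above.

def stretch(p, k=2.0, eps=0.01):
    # k > 1 pushes probabilities linearly away from 0.50
    q = 0.5 + k * (p - 0.5)
    # keep the result strictly inside (0, 1)
    return min(max(q, eps), 1 - eps)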

Bucket-based mapping: a practical method

You can build a mapping directly from your calibration table:

• take each bucket

• map its average predicted probability to its realized frequency

Example:

• your 0.78 average bucket resolves at 0.63

• so your map sends 0.78 to 0.63

This is a simple, data-driven correction, as long as each bucket holds enough resolved forecasts to trust its frequency.
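A sketch of that mapping as piecewise-linear interpolation between bucket anchors. The anchor values below are illustrative; only the 0.78 to 0.63 pair comes from the example.

import bisect

def bucket_map(p, anchors):
    # anchors: (avg predicted, realized frequency) pairs, sorted by the first value
    xs = [a for a, _ in anchors]
    ys = [b for _, b in anchors]
    if p <= xs[0]:
        return ys[0]
    if p >= xs[-1]:
        return ys[-1]
    i = bisect.bisect_left(xs, p)
    # linear interpolation between the two surrounding anchors
    t = (p - xs[i - 1]) / (xs[i] - xs[i - 1])
    return ys[i - 1] + t * (ys[i] - ys[i - 1])

anchors = [(0.35, 0.40), (0.55, 0.52), (0.78, 0.63)]  # illustrative buckets
print(bucket_map(0.78, anchors))  # 0.63, matching the example above

If your bucket frequencies are not monotonic, isotonic regression is the standard way to enforce a monotone map.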

Common causes and what to do

Cause: base rate neglect

If you regularly ignore the base rate, you will tend to become overconfident. Fix by anchoring on the base rate first and moving away from it only as evidence accumulates.

Cause: mixing horizons

Forecasts made close to resolution are easier to get right. If you mix early and late forecasts, you can manufacture spurious calibration patterns. Fix with evaluation checkpoints or horizon splits, as sketched below.
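A hedged sketch of a horizon split: score early and late forecasts separately so easy late forecasts cannot mask early miscalibration. The 30-day cutoff is an arbitrary illustration, not a recommendation.

def brier(forecasts, outcomes):
    # mean squared error between probabilities and 0/1 outcomes
    return sum((p - y) ** 2 for p, y in zip(forecasts, outcomes)) / len(forecasts)

def score_by_horizon(records, cutoff_days=30):
    # records: (days_to_resolution, probability, outcome) triples
    for name, keep in (("early", lambda d: d >= cutoff_days),
                       ("late", lambda d: d < cutoff_days)):
        group = [(p, y) for d, p, y in records if keep(d)]
        if group:
            ps, ys = zip(*group)
            print(name, round(brier(ps, ys), 4))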

Cause: herding then reversing

If you follow the market consensus and then swing away from it emotionally, you will produce unstable calibration. Fix by setting explicit update rules and keeping an audit trail.

How to know your fix worked

Look for:

• a calibration curve that sits closer to the diagonal

• better bucket stability across time windows

• improved headline Brier score and Brier skill score (sketched below)
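A sketch of the skill score, reusing the brier helper from the horizon-split snippet and assuming the base rate as the reference forecast; other references, such as the market price, are possible.

def brier_skill_score(forecasts, outcomes):
    base_rate = sum(outcomes) / len(outcomes)
    bs = brier(forecasts, outcomes)
    # reference score: always forecasting the base rate
    bs_ref = brier([base_rate] * len(outcomes), outcomes)
    # positive values mean you beat the base-rate forecast;
    # undefined if every outcome is the same
    return 1 - bs / bs_ref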

Takeaway

Overconfidence and underconfidence are calibration problems you can diagnose directly from calibration buckets. Start by verifying sample size. Then apply a simple probability mapping: shrink toward 0.50 for overconfidence, stretch away from it for underconfidence, and re-score to validate the improvement.
