The definition
By definition, there are two types of uncertainty, namely
- Aleatoric Uncertainty (or Irreducible Uncertainty / Data Uncertainty): This type of uncertainty arises from the inherent randomness or variability in the data-generating process itself. It’s the “noise” that you can’t eliminate even with a perfect model and infinite data. Think of it as the unpredictable factors that cause outcomes to vary, even under seemingly identical conditions.
Examples: The outcome of a fair die roll, the inherent sensor noise in a measurement device, or the natural variability in weather patterns.
Characteristic: It cannot be reduced by collecting more data or improving the model.
- Epistemic Uncertainty (or Reducible Uncertainty / Model Uncertainty): This type of uncertainty arises from a lack of knowledge or information about the system, model, or parameters. It’s the uncertainty due to what we don’t know. This uncertainty can, in principle, be reduced by acquiring more data, improving the model, or gaining more knowledge about the underlying process.
Examples: Uncertainty in a model’s parameters because you only have a limited dataset to train it, uncertainty about which model structure is best to describe a phenomenon, or uncertainty due to unmeasured variables.
Characteristic: It can be reduced by collecting more data, refining the model, or improving our understanding.
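To make the distinction concrete in a deep learning setting, a common recipe is the entropy decomposition used with MC dropout or deep ensembles: the entropy of the averaged prediction (total uncertainty) splits into the average entropy of the individual predictions (aleatoric) plus the leftover mutual information between prediction and model parameters (epistemic). Below is a minimal NumPy sketch of that decomposition; the function name and the toy probability arrays are just illustrative assumptions, not part of any specific library.

```python
# Minimal sketch (illustrative): estimate aleatoric vs. epistemic uncertainty
# from T stochastic forward passes (e.g. MC dropout or an ensemble) for a
# single input with softmax outputs.
import numpy as np

def decompose_uncertainty(probs: np.ndarray, eps: float = 1e-12):
    """probs: shape (T, num_classes), softmax outputs of T stochastic passes."""
    mean_p = probs.mean(axis=0)
    total = -(mean_p * np.log(mean_p + eps)).sum()                 # H[ E_theta p(y|x, theta) ]
    aleatoric = -(probs * np.log(probs + eps)).sum(axis=1).mean()  # E_theta H[ p(y|x, theta) ]
    epistemic = total - aleatoric                                  # mutual information I(y; theta | x)
    return total, aleatoric, epistemic

# Passes that agree -> epistemic close to 0; passes that disagree -> a clearly
# positive epistemic term, even if each individual prediction is confident.
agree = np.array([[0.9, 0.1]] * 5)
disagree = np.array([[0.9, 0.1], [0.1, 0.9], [0.8, 0.2], [0.2, 0.8], [0.5, 0.5]])
print(decompose_uncertainty(agree))
print(decompose_uncertainty(disagree))
```

The aleatoric term stays high no matter how many passes you average if the data itself is ambiguous, while the epistemic term shrinks as the passes (or ensemble members) start to agree, which mirrors the reducible vs. irreducible split above.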
Relation to medical imaging
As I mentioned in The challenge of medical image analysis, medical imaging suffers from the “high dimensionality, low sample size” problem, so we are certainly facing epistemic uncertainty. There’s always a craving for more data, better-quality data, and samples with greater diversity and more rare diseases, because data is THE MOST EFFECTIVE way to improve the model; more so than smart tricks (augmentation, loss functions, etc.), which certainly help but eventually hit diminishing, if not negative, returns once all that complexity is stacked. Model size and “architectural improvements” also do not seem to scale well [1] given the sample size; the added capacity tends to memorize (overfit) rather than generalize.
But I do believe Aleatoric Uncertainty is creeping into the picture, more than we expect. Once the data covers a good amount of variation, you reach a point where the bad cases actually do look like annotation errors, and even start to confuse algorithm developers and entry-level annotators. I used to think this must be the result of annotation quality issues (which is still reducible, epistemic uncertainty), since the complex and obscure nature of medical images will always carry irreducible variance [2].

But consider this scenario: you are building a lung nodule detector, which is essentially searching for isolated white blobs in a dark space (the lung) full of white branches (the bronchi). You are happy with the current version and deploy it to production. But suddenly your PM gets pinged by unhappy doctors who complain that your system makes stupid false positives, reporting food residue in patients’ stomachs, which sits not far below the lungs and also happens to look like a white blob in a dark space. Adding a little data does not help here, because within the context of the detector’s bounding box the two are INDISTINGUISHABLE. Adding LOTS of samples may help, in that you essentially force the model to learn to recognize lung vs. stomach from a nodule’s point of view.

On second thought, you are smart enough to solve the problem by masking out the region outside the lung and training only on the region inside the lung, for better data utilization. Not happily ever after: you find out your detector has the tendency to