Cross-entropy is a measure of the dissimilarity between two probability distributions. In the context of supervised learning, one of these distributions represents the "true" label for a training example, where the correct response is assigned a probability of 100%.
If p(x) represents the probability distribution of "true" labels for a training example and q(x) represents the "guess" of the machine learning algorithm, the cross-entropy is calculated as follows:
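The formula being referred to is the standard definition of cross-entropy between the two distributions, with the sum taken over all possible outcomes x:

```latex
H(p, q) = -\sum_{x} p(x) \log q(x)
```

The logarithm is typically the natural logarithm, though base 2 is also used; the choice changes only the units (nats vs bits), not the location of the minimum.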
Here, x ranges over the outcomes which the machine learning algorithm attempts to predict. In radiology, this may be the presence or absence of a pathology (e.g. "fracture" vs "no fracture") or a list of possible pathologies (e.g. "malignancy", "pneumonia", etc.).
As the prediction more closely approximates the "correct" answer, the cross-entropy approaches its minimum. Supervised machine learning algorithms therefore adjust network parameters so as to minimize the cross-entropy across training examples, i.e. so that the predictions q(x) approximate p(x) as closely as possible.
Applications in radiology
Suppose an algorithm seeks to classify chest x-rays as either "pneumonia" or "no pneumonia", and is given a chest x-ray known to be from a patient with pneumonia. Assume that the algorithm predicts a 51% chance of the x-ray representing pneumonia. In this case:
p(Pneumonia) = 1.00
p(no Pneumonia) = 0.00
q(Pneumonia) = 0.51
q(no Pneumonia) = 0.49
This yields a cross-entropy of 0.67 (using the natural logarithm). Conversely, a more accurate algorithm which predicts a probability of pneumonia of 98% gives a lower cross-entropy of 0.02.
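The example above can be reproduced with a short sketch; the function name `cross_entropy` is illustrative rather than from any particular library, and the natural logarithm is assumed:

```python
import math

def cross_entropy(p, q):
    """Cross-entropy H(p, q) = -sum over x of p(x) * log(q(x)).

    Terms where p(x) == 0 contribute nothing to the sum, so they are
    skipped; this also avoids evaluating log(0) for those outcomes.
    """
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

# "True" distribution: the x-ray is known to show pneumonia.
p = [1.00, 0.00]          # [pneumonia, no pneumonia]

q_hesitant = [0.51, 0.49]   # a barely confident prediction
q_accurate = [0.98, 0.02]   # a confident, accurate prediction

print(round(cross_entropy(p, q_hesitant), 2))  # 0.67
print(round(cross_entropy(p, q_accurate), 2))  # 0.02
```

Note that the terms where p(x) = 0 vanish, so for a single "hard" label the cross-entropy reduces to the negative log of the probability assigned to the correct class.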