Cross entropy

Last revised by Daniel J Bell on 2 Aug 2021

Cross entropy is a measure of the difference between two probability distributions. In the context of supervised learning, one of these distributions represents the “true” labels for a training example, where the correct responses are assigned a probability of 1 (100%).

Machine learning

If p(x) represents the probability distribution of “true” labels for a training example and q(x) represents the “guess” of the machine learning algorithm, the cross-entropy H(p, q) is calculated as follows:

H(p, q) = −Σₓ p(x) log q(x)
Here, x represents the outcomes that the machine learning algorithm attempts to predict. In radiology, this may be the presence or absence of a pathology (e.g. “fracture” vs “no fracture”) or may be a list of possible pathologies (e.g. “malignancy”, “pneumonia”, etc.).

As the prediction more closely approximates the “correct” answer, the cross-entropy approaches its minimum. Supervised machine learning algorithms seek to adjust network parameters such that the cross-entropy is minimized across training examples; in other words, they seek parameters for which the predictions q(x) most closely approximate p(x).
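As a sketch, the formula above can be computed for discrete distributions in a few lines of Python (the function name and example probabilities are illustrative, not from a real algorithm):

```python
import math

def cross_entropy(p, q):
    """H(p, q) = -sum over x of p(x) * log q(x), using the natural logarithm.
    Terms where p(x) = 0 contribute nothing to the sum."""
    return -sum(px * math.log(qx) for px, qx in zip(p, q) if px > 0)

# A one-hot "true" distribution and a tentative prediction
p = [1.0, 0.0]
q = [0.51, 0.49]
print(cross_entropy(p, q))  # ≈ 0.67
```

Note that as q is adjusted toward p, the value returned by `cross_entropy` decreases, which is exactly the quantity a training procedure would minimize.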

Applications in radiology 

Suppose an algorithm seeks to classify chest x-rays as either “pneumonia” or “no pneumonia”, and is given a chest x-ray from a patient known to have pneumonia. Assume that the algorithm predicts a 51% chance of the chest x-ray representing pneumonia. In this case:

                                        p(Pneumonia) = 1.00
                                        p(no Pneumonia) = 0.00
                                        q(Pneumonia) = 0.51
                                        q(no Pneumonia) = 0.49

This yields a cross-entropy of −ln(0.51) ≈ 0.67 (using the natural logarithm). Conversely, a more accurate algorithm that predicts a 98% probability of pneumonia gives a lower cross-entropy of 0.02.
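The arithmetic above can be checked directly: since p(no Pneumonia) = 0, only the q(Pneumonia) term contributes to the sum.

```python
import math

# Only the p(Pneumonia) = 1.00 term survives in the sum
print(round(-1.00 * math.log(0.51), 2))  # 0.67
print(round(-1.00 * math.log(0.98), 2))  # 0.02
```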
