Convolutional neural network
A convolutional neural network (CNN) is a particular implementation of a neural network used in machine learning that is especially suited to processing array data such as images, and is thus frequently used in machine learning applications targeted at medical images.
Architecture
A convolutional neural network typically consists of the following three components, although the architectural implementation varies considerably:
 input (image, volume or video)
 feature extraction
 classification and output
Input
The most common input is an image, although considerable work has also been performed on so-called 3D convolutional neural networks that can process either volumetric data (3 spatial dimensions) or video (2 spatial dimensions + 1 temporal dimension).
In most implementations, the input needs to be processed to match the particulars of the CNN being used. This may include cropping, reducing the size of the image, identification of a particular region of interest, as well as normalizing pixel values to a particular range.
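These preprocessing steps can be sketched as follows, assuming a small 8-bit grayscale image represented as a NumPy array (the image values and region of interest are illustrative, not from any particular dataset):

```python
import numpy as np

# Hypothetical 8-bit grayscale image (pixel values 0-255).
image = np.array([[  0,  64, 128, 255],
                  [ 32,  96, 160, 224],
                  [ 16,  80, 144, 208],
                  [ 48, 112, 176, 240]], dtype=np.float32)

# Crop to a region of interest (here: the top-left 2x2 patch).
roi = image[:2, :2]

# Normalize pixel values to the range [0, 1].
normalised = image / 255.0
```

A real pipeline would also resize the image to the fixed input dimensions the CNN expects, typically with an interpolating resampler rather than a plain crop.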
Feature extraction
The feature extraction component of a convolutional neural network is what distinguishes CNNs from other multilayered neural networks. It typically comprises repeating sets of three sequential steps:
 convolution layer
 input (image) is convolved by application of numerous kernels
 each kernel results in a distinct feature map
 pooling layer
 each feature map is downsized to a smaller matrix by pooling the values in adjacent pixels
 nonlinear activation unit
 the activation of each neuron is then computed by the application of this nonlinear function to the weighted sum of its inputs and an additional bias term. This is what gives the neural network the ability to approximate almost any function.
 a popular activation unit is the rectified linear unit (ReLU)
 the convolution and pooling processes result in some values in the matrix being negative
 the rectified linear unit sets all negative values to zero
These three steps are then repeated many times, each convolution layer acting upon the pooled and rectified feature maps from the preceding layer. The result is an ever smaller matrix size with activation dependent on more and more complex features due to the cumulative interaction of numerous prior convolutions.
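The three steps above can be sketched in NumPy. The kernel values, image size, and the ordering of pooling before activation simply follow the list above for illustration; practical implementations vary in ordering and use many kernels per layer:

```python
import numpy as np

def convolve2d(img, kernel):
    """Valid-mode 2D convolution (strictly, cross-correlation, as in most CNN libraries)."""
    kh, kw = kernel.shape
    out = np.zeros((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(x, size=2):
    """Downsize a feature map by taking the maximum of each size x size block."""
    h, w = x.shape[0] // size, x.shape[1] // size
    return x[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

def relu(x):
    """Rectified linear unit: set all negative values to zero."""
    return np.maximum(x, 0)

# Toy 6x6 input "image" and a single 2x2 gradient-detecting kernel.
image = np.arange(36, dtype=float).reshape(6, 6)
kernel = np.array([[0., -1.],
                   [1.,  0.]])

# One convolution -> pooling -> activation pass produces one feature map.
fmap = relu(max_pool(convolve2d(image, kernel)))
```

Each additional kernel would produce its own feature map, and the stack of pooled, rectified maps becomes the input to the next convolution layer.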
Classification and output
The final pooled and rectified feature maps are then used as the input of fully connected layers, just as in a fully connected neural network, and are therefore discussed separately.
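The handoff from feature maps to the fully connected classifier can be sketched as a flatten step followed by one dense layer and a softmax; the map sizes, class count, and random weights below are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical final feature maps: 4 maps of size 3x3.
feature_maps = rng.standard_normal((4, 3, 3))

# Flatten the maps into a single vector for the fully connected layers.
x = feature_maps.reshape(-1)          # shape (36,)

# One fully connected layer mapping 36 features to 3 output classes.
W = rng.standard_normal((3, 36)) * 0.1
b = np.zeros(3)
logits = W @ x + b

# Softmax converts the logits into class probabilities that sum to 1.
probs = np.exp(logits - logits.max())
probs /= probs.sum()
```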
Training
Convolutional neural networks in radiology most frequently undergo supervised learning. During training, both the weighting factors of the fully connected classification layers and the convolutional kernels are modified via backpropagation.
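The weight-update mechanism can be illustrated with a deliberately tiny supervised example: gradient descent on a single weight with a mean squared error loss. A real CNN updates its kernels and dense-layer weights in exactly the same spirit, with the gradients computed layer by layer via backpropagation (the data, learning rate, and epoch count here are arbitrary):

```python
import numpy as np

# Toy labelled data: the target relationship is y = 2 * x.
xs = np.array([1.0, 2.0, 3.0, 4.0])
ys = 2.0 * xs

w = 0.0      # single trainable weight, initialized at zero
lr = 0.01    # learning rate

for epoch in range(200):
    pred = w * xs
    # Gradient of the mean squared error with respect to w.
    grad = np.mean(2.0 * (pred - ys) * xs)
    w -= lr * grad  # gradient descent step

# After training, w has converged close to the true value 2.0.
```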
Related Radiopaedia articles
Artificial intelligence
 artificial intelligence (AI)
 imaging data sets
 computer-aided diagnosis (CAD)
 natural language processing
 machine learning (overview)
 visualizing and understanding neural networks
 common data preparation/preprocessing steps
 DICOM to bitmap conversion
 dimensionality reduction
 scaling
 centering
 normalization
 principal component analysis
 training, testing and validation datasets
 augmentation
 loss function
 optimization algorithms
 ADAM
 momentum (Nesterov)
 stochastic gradient descent
 minibatch gradient descent
 regularisation
 linear and quadratic
 batch normalization
 ensembling
 rule-based expert systems
 glossary
 activation function
 anomaly detection
 automation bias
 backpropagation
 batch size
 computer vision
 concept drift
 cost function
 confusion matrix
 convolution
 cross validation
 curse of dimensionality
 Dice similarity coefficient
 dimensionality reduction
 epoch
 explainable artificial intelligence/XAI
 feature extraction
 federated learning
 gradient descent
 ground truth
 hyperparameters
 image registration
 imputation
 iteration
 Jaccard index
 linear algebra
 noise reduction
 normalization
 R (Programming language)
 Python (Programming language)
 segmentation
 semi-supervised learning
 synthetic and augmented data
 overfitting
 transfer learning