Augmentation
Augmentation is the process of artificially generating additional training data from existing data. The larger dataset increases the likelihood that a predictive model trained on it will achieve higher accuracy.
In general, training on more data tends to yield more accurate models, because the algorithm sees a greater variety of examples. However, it is not always possible to collect a large amount of data, in which case augmentation can be used to generate enough examples to train an accurate model. This is particularly relevant for image datasets. There are many methods of generating new training examples from images. These include:
- mirroring the image
- adding noise to the image
- distorting the image
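The three methods listed above can be sketched with NumPy as follows. This is a minimal illustration, not a reference implementation: the function names, the noise level `sigma`, the row-shift distortion and the random 8×8 stand-in image are all assumptions made for the example, and pixel values are assumed to be scaled to the range 0 to 1.

```python
import numpy as np

def mirror(image):
    """Flip a 2D image left to right (horizontal mirroring)."""
    return np.fliplr(image)

def add_gaussian_noise(image, sigma=0.05, rng=None):
    """Add zero-mean Gaussian noise; assumes pixel values lie in [0, 1]."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = image + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)  # keep values in the valid range

def shear_rows(image, max_shift=2):
    """Crude distortion: shift each row sideways by an amount that grows
    with the row index. Pixels wrap around at the image edge."""
    shifts = np.linspace(0, max_shift, image.shape[0]).round().astype(int)
    return np.stack([np.roll(row, s) for row, s in zip(image, shifts)])

# Hypothetical stand-in for a real grayscale image.
rng = np.random.default_rng(seed=0)
image = rng.random((8, 8))

# One original image yields three additional training examples.
augmented = [mirror(image), add_gaussian_noise(image, rng=rng), shear_rows(image)]
```

In practice, dedicated image libraries offer smoother distortions (rotation, elastic deformation) than the row shift used here, and the choice of augmentation should preserve the label of the image, for example mirroring may be inappropriate where laterality matters.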
Augmentation creates augmented data. Augmented data is produced by systematically modifying existing data (for images, often through simple linear algebra operations applied to the whole image). This distinguishes it from synthetic data.
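As an illustration of what a simple linear algebra operation on the whole image can look like, the sketch below expresses horizontal mirroring as a single matrix multiplication. The small 3×4 array is a hypothetical stand-in for an image, and the variable names are chosen for the example only.

```python
import numpy as np

# Hypothetical 3x4 "image" with distinct values so the effect is visible.
image = np.arange(12, dtype=float).reshape(3, 4)

# Exchange matrix: the identity matrix with its columns reversed.
exchange = np.fliplr(np.eye(image.shape[1]))

# Right-multiplying by the exchange matrix reverses the column order,
# which mirrors the image left to right.
mirrored = image @ exchange
```

Left-multiplying by an exchange matrix of matching size would instead reverse the rows, giving a vertical flip.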