Transfer learning

Last revised by Dimitrios Toumpanakis on 2 Aug 2021

Transfer learning in artificial neural networks is the technique of taking knowledge acquired from training on one particular domain and applying it to learning a separate task.

In recent years, a well-established paradigm has been to pre-train models on large-scale data (e.g. ImageNet) and then to fine-tune them on target tasks that often have far less training data 3. For example, a neural network that has previously been trained to recognise pictures of animals may more effectively learn how to categorise pathology on a chest x-ray. In this example, the initial training of the network on animal image recognition is known as “pre-training”, while training on the subsequent dataset of chest x-rays is known as “fine-tuning”. This technique is most useful when the pre-training dataset is relatively large (e.g. 100,000 animal images) while the fine-tuning dataset is relatively small (e.g. 200 chest x-rays).

The most popular dataset used for pre-training is ImageNet 5, which contains more than 14 million annotated images 4.
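As a concrete illustration, the sketch below loads an ImageNet-pre-trained network and fine-tunes it on a small target dataset using PyTorch/torchvision; the ResNet-18 architecture, two-class chest x-ray task and hyperparameters are illustrative assumptions rather than a recommended recipe.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a network whose weights were pre-trained on ImageNet.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Replace the 1000-class ImageNet classifier with a new head sized for
# the target task (two classes here, e.g. "normal" vs "pathology" - an
# illustrative assumption, not part of the original article).
model.fc = nn.Linear(model.fc.in_features, 2)

# Fine-tuning: update all weights, but with a small learning rate so the
# pre-trained features are adjusted gently rather than overwritten.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

def fine_tune(model, loader, epochs=5):
    """Train on the (small) target dataset; `loader` yields (images, labels)."""
    model.train()
    for _ in range(epochs):
        for images, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
```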

Intuition

The initial layers of a neural network for most image recognition tasks are involved in recognising simple features such as edges and curves. As such, a network that has been pre-trained on an unrelated image recognition task has already learned to detect these lower-level features. A network already pre-trained on images of animals does not need to re-learn such features and is, therefore, able to train for the task of recognising chest x-ray pathology with fewer training examples.
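When the fine-tuning dataset is very small, a common variant of this idea is to freeze the pre-trained layers entirely and train only a newly added classification head, so the learned edge and curve detectors are reused unchanged. A minimal sketch of this variant follows, again assuming torchvision's ResNet-18 and an illustrative two-class task.

```python
import torch
import torch.nn as nn
from torchvision import models

# "Frozen backbone" variant: keep the pre-trained low-level filters
# (edges, curves) fixed and train only a new classification head.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

for param in model.parameters():
    param.requires_grad = False                  # freeze all pre-trained layers

model.fc = nn.Linear(model.fc.in_features, 2)    # new head; trainable by default

# Only the new head's parameters are optimised, so the pre-trained
# feature detectors in the earlier layers remain unchanged.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```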
