Natural language processing
Natural language processing (NLP) is an area of active research in artificial intelligence concerned with human languages. Natural language processing programs take human-written text or human speech as input data for analysis. Their goals range from extracting insights from text or recorded speech to generating new text or speech.
The first area of natural language processing to gain wide use in radiology was speech recognition. In earlier literature, speech recognition was often referred to as voice recognition 1-3, but the trend in nomenclature is to differentiate the two terms, with only speech recognition implying the transcription of dictated recordings to create reports. In many radiology practices, radiologists routinely use speech recognition programs to create reports.
Increasing research in artificial neural networks has sparked interest in topic modelling algorithms for natural language processing, which can be used to automate the labelling of images; an example is the NIH chest x-ray data set ChestX-ray8 3.
Because radiology reports are brief, use a limited vocabulary, and are often structured, many different types of algorithm have proven successful at annotating them.
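As a minimal illustration of such annotation (a hypothetical sketch, not any specific published system), a rule-based labeller might match finding keywords in a report and check a short window of preceding text for negation cues. The keyword and negation lists below are invented for the example; real systems use far richer lexicons and more robust negation detection:

```python
# Hypothetical keyword and negation lists, for illustration only
FINDINGS = {
    "cardiomegaly": ["cardiomegaly", "enlarged heart"],
    "effusion": ["pleural effusion", "effusion"],
}
NEGATIONS = ["no ", "without ", "negative for "]

def annotate(report: str) -> dict:
    """Return {finding: True/False} for each finding mentioned in the report."""
    text = report.lower()
    labels = {}
    for finding, keywords in FINDINGS.items():
        for kw in keywords:
            idx = text.find(kw)
            if idx == -1:
                continue
            # The finding is treated as negated if a negation cue
            # appears in the 25 characters before the keyword
            window = text[max(0, idx - 25):idx]
            labels[finding] = not any(neg in window for neg in NEGATIONS)
            break
    return labels

print(annotate("Heart size normal. No pleural effusion."))
# → {'effusion': False}
```

The same structured-report properties noted above (short sentences, constrained vocabulary) are what make even this simple windowed-negation heuristic reasonably effective on radiology text.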
Areas of active research for the application of natural language processing in radiology fall largely under natural language understanding (NLU), including topic modelling, other forms of information extraction, and keyword searching. Natural language processing also encompasses natural language generation (NLG).
Practical Points
Several organizations have undertaken efforts to standardize radiology reports 5. One byproduct of standardization is that reports become more amenable to rule-based and/or decision-tree algorithms for NLP. At present, however, much progress has also been made in interpreting free text using algorithms that perform statistical operations on matrices derived from the texts, such as latent Dirichlet allocation.
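Statistical approaches such as latent Dirichlet allocation do not operate on raw text but on a document-term matrix of word counts. A minimal pure-Python sketch of building such a matrix is shown below (the sample reports are invented; a real pipeline would tokenize more carefully and then fit a topic model, e.g. scikit-learn's LatentDirichletAllocation, to the matrix):

```python
from collections import Counter

# Invented example reports, for illustration only
reports = [
    "no acute cardiopulmonary abnormality",
    "small right pleural effusion",
    "no pleural effusion or pneumothorax",
]

# Shared vocabulary across all reports, in a fixed (sorted) column order
vocab = sorted({word for r in reports for word in r.split()})

def doc_term_matrix(docs, vocab):
    """One row per document, one column per vocabulary word, values are counts."""
    rows = []
    for doc in docs:
        counts = Counter(doc.split())
        rows.append([counts.get(word, 0) for word in vocab])
    return rows

matrix = doc_term_matrix(reports, vocab)
print(vocab)
print(matrix)
```

A topic model fitted to this matrix would learn groups of words that co-occur across reports (e.g. "pleural" and "effusion"), which is what makes the approach usable on unstandardized free-text reports.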
Related Radiopaedia articles
Artificial intelligence
- artificial intelligence (AI)
- imaging data sets
- computer-aided diagnosis (CAD)
- natural language processing
- machine learning (overview)
- visualizing and understanding neural networks
- common data preparation/preprocessing steps
- DICOM to bitmap conversion
- dimensionality reduction
- scaling
- centering
- normalization
- principal component analysis
- training, testing and validation datasets
- augmentation
- loss function
- optimization algorithms
- ADAM
- momentum (Nesterov)
- stochastic gradient descent
- mini-batch gradient descent
- regularisation
- linear and quadratic
- batch normalization
- ensembling
- rule-based expert systems
- glossary
- activation function
- anomaly detection
- automation bias
- backpropagation
- batch size
- computer vision
- concept drift
- cost function
- confusion matrix
- convolution
- cross validation
- curse of dimensionality
- dice similarity coefficient
- dimensionality reduction
- epoch
- explainable artificial intelligence/XAI
- feature extraction
- federated learning
- gradient descent
- ground truth
- hyperparameters
- image registration
- imputation
- iteration
- jaccard index
- linear algebra
- noise reduction
- normalization
- R (programming language)
- Python (programming language)
- segmentation
- semi-supervised learning
- synthetic and augmented data
- overfitting
- transfer learning