Kappa is a nonparametric statistic that can be used to measure interobserver agreement on imaging studies. Cohen's kappa compares two observers or, in machine learning, a specific algorithm's output against reference labels. Fleiss' kappa assesses interobserver agreement between more than two observers.
When comparing two observers, the concept behind the test is similar to that of the chi-squared test. Two 2 x 2 tables are set up: one with the values expected if agreement were due to chance alone, and one with the actual data. Kappa indicates how far the observed interobserver agreement exceeds the agreement expected by chance.
To find the expected values, take the product of the marginals and divide by the total number of observations. Here O1 is the observed +/+ cell, O2 the +/- cell, O3 the -/+ cell and O4 the -/- cell:
Expected value for the +/+ cell: [(O1 + O2) x (O1 + O3)] / total observations
Expected value for the -/- cell: [(O3 + O4) x (O2 + O4)] / total observations
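The expected-value step above can be sketched in Python. This is a minimal illustration, not part of the original text: the function name `cohen_kappa_2x2` and the example counts are hypothetical, the cell layout (O1 = +/+, O2 = +/-, O3 = -/+, O4 = -/-) is inferred from the formulas above, and the final line uses the standard definition of Cohen's kappa, (observed agreement - expected agreement) / (1 - expected agreement).

```python
def cohen_kappa_2x2(o1, o2, o3, o4):
    """Cohen's kappa for a 2 x 2 agreement table.

    Assumed layout: o1 = +/+, o2 = +/-, o3 = -/+, o4 = -/-
    (first sign is observer A, second sign is observer B).
    """
    total = o1 + o2 + o3 + o4
    if total == 0:
        raise ValueError("the table contains no observations")

    # Expected counts in the two agreement cells if agreement were chance alone
    expected_pos = (o1 + o2) * (o1 + o3) / total
    expected_neg = (o3 + o4) * (o2 + o4) / total

    # Convert agreement counts to proportions
    p_observed = (o1 + o4) / total
    p_expected = (expected_pos + expected_neg) / total

    if p_expected == 1:
        raise ValueError("kappa is undefined when expected agreement is 1")

    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical counts: 40 +/+, 10 +/-, 5 -/+, 45 -/-
kappa = cohen_kappa_2x2(40, 10, 5, 45)
print(round(kappa, 2))
```

With these made-up counts the observed agreement is 0.85 and the chance-expected agreement is 0.50, so kappa comes out at about 0.70.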
Rating systems for kappa are controversial, as their cut-offs cannot be proven, but one system classifies kappa values as follows [2]:
- >0.75: excellent
- 0.40-0.75: fair to good
- <0.40: poor
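The rating bands listed above can be expressed as a small lookup. This is only a sketch: `rate_kappa` is a hypothetical helper name, and treating the boundary values 0.40 and 0.75 as "fair to good" is an assumption based on how the ranges are written.

```python
def rate_kappa(kappa):
    """Map a kappa value to the descriptive bands listed above."""
    if kappa > 0.75:
        return "excellent"
    if kappa >= 0.40:
        # Assumption: 0.40 and 0.75 themselves fall in the middle band
        return "fair to good"
    return "poor"


for value in (0.82, 0.55, 0.10):
    print(value, rate_kappa(value))
```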
Kappa can be extended to three or more readers using more elaborate equations (Fleiss' kappa). In that setting, kappa summarizes agreement across all of the radiologists involved rather than a single pair, which is a more stringent test.
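As a rough illustration of the multi-reader case, the sketch below computes Fleiss' kappa using its standard textbook formula, which is not spelled out in the original text. The function name `fleiss_kappa`, the input layout (one row per case, one column per category, each entry the number of readers who chose that category) and the example data are assumptions made for the illustration.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table of rating counts.

    counts[i][j] = number of readers who assigned case i to category j.
    Every case must be rated by the same number of readers (at least 2).
    """
    n_cases = len(counts)
    n_readers = sum(counts[0])
    if n_readers < 2:
        raise ValueError("at least two readers are required")
    if any(sum(row) != n_readers for row in counts):
        raise ValueError("every case needs the same number of ratings")
    n_categories = len(counts[0])

    # Share of all ratings that fell into each category
    p_category = [
        sum(row[j] for row in counts) / (n_cases * n_readers)
        for j in range(n_categories)
    ]

    # Per-case agreement: proportion of reader pairs that agree
    p_case = [
        (sum(c * c for c in row) - n_readers) / (n_readers * (n_readers - 1))
        for row in counts
    ]

    p_observed = sum(p_case) / n_cases
    p_expected = sum(p * p for p in p_category)
    if p_expected == 1:
        raise ValueError("kappa is undefined when expected agreement is 1")

    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical data: 4 cases, 3 readers, categories = (finding present, absent)
ratings = [[3, 0], [0, 3], [2, 1], [1, 2]]
print(fleiss_kappa(ratings))
```

Note that the per-case term counts agreeing pairs of readers, so a case where two of three readers agree still contributes partial agreement.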
Kappa is used for categorical values (e.g. larger vs. smaller, has condition vs. does not have the condition). The Bland-Altman analysis is used for continuous variables.
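For the continuous case, a Bland-Altman analysis is commonly summarized by the mean difference between two sets of measurements (the bias) and the 95% limits of agreement, taken as the bias plus or minus 1.96 standard deviations of the differences. The sketch below is illustrative only: the function name `bland_altman` and the measurements are hypothetical, and the 1.96 multiplier assumes the differences are approximately normally distributed.

```python
from statistics import mean, stdev


def bland_altman(reader_a, reader_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(reader_a) != len(reader_b) or len(reader_a) < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")

    # Paired differences between the two readers
    diffs = [a - b for a, b in zip(reader_a, reader_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample standard deviation

    return bias, bias - spread, bias + spread


# Hypothetical lesion diameters (mm) measured by two readers
bias, lower, upper = bland_altman([10, 12, 14, 16], [9, 12, 13, 16])
print(bias, lower, upper)
```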
- 1. Psoter KJ, Roudsari BS, Dighe MK et al. Biostatistics primer for the radiologist. AJR Am J Roentgenol. 2014;202(4):W365-75. doi:10.2214/AJR.13.11657
- 2. Fleiss JL. Statistical Methods for Rates and Proportions (Probability & Mathematical Statistics). John Wiley & Sons Inc. ISBN:0471263702