Cohort studies vs case control studies

Last revised by Stefan Tigges on 9 Jan 2024

Cohort and case control studies are two different types of observational studies used to determine if there is an association between an exposure and a health outcome [1]. The studies are observational and not experimental because, unlike a randomized clinical trial, the investigators do not assign subjects to the exposed and unexposed groups. The lack of randomization makes observational studies more vulnerable to a type of bias called confounding. Exposures may be harmful (e.g. smoking) or protective (e.g. vaccines) and outcomes include the development of disease (e.g. lung cancer) or the avoidance of disease (e.g. infection). Both studies are also considered longitudinal because study subjects are followed for a specified length of time.

Cohort studies are usually carried out prospectively and begin by defining a population at risk for a disease, then establishing which members of the cohort are exposed and which are unexposed. The cohort is then followed and the number of cases of disease is determined for the exposed and unexposed groups. If the study period is short (e.g. flu season) and the losses to follow-up are negligible, the probability or risk of developing the disease in question is calculated for both groups:

  • risk in exposed = (number of exposed with the outcome of interest) / (total number of exposed)

  • risk in unexposed = (number of unexposed with the outcome of interest) / (total number of unexposed)

The measure of association in a cohort study is the risk ratio:

  • risk ratio = risk in exposed / risk in unexposed

If the probability of disease is higher among the exposed, then the exposure is harmful and the risk ratio is >1. If the probability of disease is lower among the exposed, then the exposure is protective and the risk ratio is <1.
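
The risk and risk ratio calculations above can be sketched in a few lines of Python. This is only an illustration: the function names and the flu-season counts are hypothetical and do not come from any study.

```python
def risk(cases: int, total: int) -> float:
    """Proportion of a group that develops the outcome during follow-up."""
    if total <= 0:
        raise ValueError("group size must be positive")
    return cases / total


def risk_ratio(exposed_cases: int, exposed_total: int,
               unexposed_cases: int, unexposed_total: int) -> float:
    """Risk in the exposed divided by risk in the unexposed."""
    risk_unexposed = risk(unexposed_cases, unexposed_total)
    if risk_unexposed == 0:
        raise ValueError("risk ratio is undefined when the unexposed risk is zero")
    return risk(exposed_cases, exposed_total) / risk_unexposed


# Hypothetical flu-season cohort (numbers invented for illustration):
# 20 of 1,000 vaccinated and 50 of 1,000 unvaccinated subjects develop influenza.
rr = risk_ratio(20, 1000, 50, 1000)
print(f"risk ratio = {rr:.2f}")  # below 1, so the exposure looks protective
```

With these made-up counts the risk ratio is 0.02/0.05 = 0.4, which falls below 1 and is read as a protective exposure.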

If a cohort study is lengthy, patients may be lost to follow-up, and it may be impossible to accurately calculate risks in the exposed and unexposed groups, and thus a risk ratio. For cohort studies that go on for many years, an incidence rate is calculated instead. Its denominator is not the total number of exposed or unexposed subjects but the number of person-years contributed by all of the subjects in that group:

  • incidence rate in exposed = (number of exposed with the outcome of interest) / (person-years among exposed)

  • incidence rate in unexposed = (number of unexposed with the outcome of interest) / (person-years among unexposed)

The measure of association then calculated is the rate ratio:

  • rate ratio = incidence rate in exposed / incidence rate in unexposed

Doll and Hill’s 1956 cohort study comparing the rate at which smokers and non-smokers died of lung cancer found an incidence rate of 0.84 lung cancer deaths/1,000 person-years among smokers and 0.07 lung cancer deaths/1,000 person-years among non-smokers [2]. This yields a rate ratio of 0.84/0.07 = 12, meaning that smokers died of lung cancer at 12 times the rate of non-smokers.

If the rate of disease is higher among the exposed, then the exposure is harmful and the rate ratio is >1. If the rate of disease is lower among the exposed, then the exposure is protective and the rate ratio is <1.
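
As a rough sketch of the person-years approach, the Python below sums each subject's follow-up time to form the denominator, so subjects lost to follow-up still contribute the time they were observed. The function names and the small follow-up lists are hypothetical. The last lines simply recompute the rate ratio from the two rates quoted above.

```python
def person_years(follow_up_years: list[float]) -> float:
    """Total follow-up time contributed by a group, in person-years."""
    return sum(follow_up_years)


def incidence_rate(events: int, total_person_years: float) -> float:
    """Number of outcome events per person-year of follow-up."""
    if total_person_years <= 0:
        raise ValueError("person-years must be positive")
    return events / total_person_years


def rate_ratio(rate_exposed: float, rate_unexposed: float) -> float:
    """Incidence rate in the exposed divided by that in the unexposed."""
    return rate_exposed / rate_unexposed


# Hypothetical 10-year study: values below 10 are subjects who developed
# the outcome or were lost to follow-up before the study ended.
exposed_follow_up = [10.0, 4.5, 10.0, 2.0, 8.5]    # 35 person-years
unexposed_follow_up = [10.0, 10.0, 7.0, 10.0, 3.0]  # 40 person-years

hypothetical_rr = rate_ratio(
    incidence_rate(2, person_years(exposed_follow_up)),    # 2 events
    incidence_rate(1, person_years(unexposed_follow_up)),  # 1 event
)
print(f"hypothetical rate ratio = {hypothetical_rr:.2f}")

# Rates quoted in the text (lung cancer deaths per 1,000 person-years).
doll_hill_rr = rate_ratio(0.84, 0.07)
print(f"Doll and Hill rate ratio = {doll_hill_rr:.0f}")
```

Because both quoted rates use the same unit (per 1,000 person-years), the unit cancels and the ratio can be taken directly.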

Case control studies are usually carried out retrospectively and begin by identifying subjects who have a particular disease (cases), then establishing which of the cases have the exposure of interest. Next, a group of subjects without the disease (controls) is chosen from the underlying population and the number of controls who have the exposure of interest is determined. Because we sample subjects with and without the disease, the proportion of subjects with the disease depends on how the subjects were sampled and not on the probability (risk) of developing the disease. Calculating risks and risk ratios is therefore incorrect. Instead, an odds ratio is calculated and is the measure of association for a case control study [3].

The odds of being exposed among the cases and controls are calculated:

  • odds of exposure among cases = (number of cases exposed) / (number of cases unexposed)

  • odds of exposure among controls = (number of controls exposed) / (number of controls unexposed)

Next, the odds ratio is calculated:

  • odds ratio = odds of exposure among cases / odds of exposure among controls

Doll and Hill’s 1950 case control study of smoking and lung cancer found 688 smokers and 21 non-smokers among the cases, for an odds of exposure of 688/21 ≈ 32.8. Among the controls, there were 650 smokers and 59 non-smokers, for an odds of exposure of 650/59 ≈ 11.0. This yields an odds ratio of approximately 2.97, meaning that the odds of smoking among patients with lung cancer are nearly 3 times greater than the odds of smoking among patients without lung cancer.

If the odds ratio is >1, the odds of exposure are higher among the cases than the controls and the exposure is harmful. If the odds ratio is <1, the odds of exposure are lower among the cases than the controls and the exposure is protective.
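
The odds ratio arithmetic can be checked with a short Python sketch using the four counts quoted above. The function names are illustrative only. The ratio is computed directly from the counts, without rounding the intermediate odds.

```python
def odds(exposed: int, unexposed: int) -> float:
    """Odds of exposure within one group (cases or controls)."""
    if unexposed <= 0:
        raise ValueError("odds are undefined when no subject is unexposed")
    return exposed / unexposed


def odds_ratio(cases_exposed: int, cases_unexposed: int,
               controls_exposed: int, controls_unexposed: int) -> float:
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return odds(cases_exposed, cases_unexposed) / odds(controls_exposed, controls_unexposed)


# Counts quoted in the text: smokers and non-smokers among cases and controls.
odds_cases = odds(688, 21)
odds_controls = odds(650, 59)
or_smoking = odds_ratio(688, 21, 650, 59)

print(f"odds of exposure among cases    = {odds_cases:.2f}")
print(f"odds of exposure among controls = {odds_controls:.2f}")
print(f"odds ratio                      = {or_smoking:.2f}")
```

The same value follows from the cross-product (688 × 59) / (21 × 650), which is a convenient way to compute the odds ratio from a 2 × 2 table.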

  • because most cohort studies are prospective and require meticulous follow-up, cohort studies tend to be more expensive and take longer to complete than case control studies

  • because case control studies start with cases, they are efficient for studying rare diseases; a cohort study would need to enroll many subjects to ensure that some of them develop a rare disease

  • if a disease is rare, the odds ratio approximates the risk ratio
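
The rare-disease approximation in the last point can be illustrated numerically. The sketch below uses two hypothetical populations with fully known counts, both with a true risk ratio of 2, and compares the odds ratio in each. All counts are invented, and the odds ratio is computed as the cross-product of the 2 × 2 table, which gives the same value as the ratio of exposure odds among cases and non-cases.

```python
def risk_ratio(a: int, b: int, c: int, d: int) -> float:
    """a, b = exposed with/without disease; c, d = unexposed with/without disease."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Cross-product odds ratio of the same 2 x 2 table."""
    return (a * d) / (b * c)


# Hypothetical rare disease: 20 of 10,000 exposed and 10 of 10,000 unexposed affected.
rare = (20, 9_980, 10, 9_990)
# Hypothetical common disease: 4,000 of 10,000 exposed and 2,000 of 10,000 unexposed affected.
common = (4_000, 6_000, 2_000, 8_000)

rr_rare, or_rare = risk_ratio(*rare), odds_ratio(*rare)
rr_common, or_common = risk_ratio(*common), odds_ratio(*common)

print(f"rare disease:   risk ratio = {rr_rare:.3f}, odds ratio = {or_rare:.3f}")
print(f"common disease: risk ratio = {rr_common:.3f}, odds ratio = {or_common:.3f}")
```

In the rare-disease table the odds ratio is almost identical to the risk ratio, while in the common-disease table it overstates it (about 2.67 against 2), which is why the approximation is restricted to rare outcomes.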
