to digital radiography clearly has many practical advantages in terms of data
storage and mobility, it would not have been implemented clinically had the
diagnostic quality of the scans decreased. Quantitative assessment of diagnostic
quality is usually reported in terms of specificity and sensitivity, as described in the
example below.
Consider an imaging study to determine whether a group of middle-aged
patients has an early indication of multiple sclerosis. It is known that this
disease is characterized by the presence of white matter lesions in the brain.
However, it is also known that healthy people develop similar types of lesions as
they age, although in smaller numbers than in multiple sclerosis
cases. When analyzing the images from a particular patient there are four
possible outcomes for the radiologist: a true positive (where the first term ‘true’
refers to a correct diagnosis and the second term ‘positive’ to the patient having
multiple sclerosis), a true negative, a false positive or a false negative. The four
possibilities can be recorded in either tabular or graphical format, as shown in
Figure 1.2. The receiver operating characteristic (ROC) curve plots the number of
true positives on the vertical axis vs. the number of false positives on the horizontal
axis, each usually expressed as a fraction of the diseased and healthy patients, respectively,
as shown on the right of Figure 1.2. What criterion does the radiologist use to
make his/her diagnosis? In this simple example assume that the radiologist simply
counts the number of lesions detectable in the image. The relative numbers of true
positives, true negatives, false positives and false negatives depend upon the
particular number of lesions that the radiologist chooses as the threshold
for diagnosing a patient with multiple sclerosis. If this threshold number is very
high, for example 1000, then there will be no false positives, but no true positives
either. As the threshold is reduced, the number of true positives
increases at a greater rate than the number of false positives, provided that the images
give an accurate count of the number of lesions actually present. As the threshold
is reduced further, the numbers of false positives
and true positives increase at a more equal rate. Finally, if the threshold is
dropped to a very small number, the number of false positives increases
much faster than the number of true positives. The net effect is to produce the curve shown
in Figure 1.2.
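The threshold sweep described above can be sketched in a few lines of Python. This is only an illustration: the lesion counts, the function name roc_points and the chosen thresholds are all hypothetical and are not taken from the study in the text. A patient is assumed to be diagnosed positive when the lesion count equals or exceeds the threshold.

```python
# Hypothetical lesion counts (illustrative only, not data from the text).
ms_counts = [12, 18, 25, 9, 30, 22]      # patients who have multiple sclerosis
healthy_counts = [3, 8, 14, 5, 10, 1]    # healthy patients with age-related lesions


def roc_points(diseased, healthy, thresholds):
    """Return one (false positives, true positives) pair per threshold.

    Assumption: a patient is called positive when count >= threshold.
    """
    points = []
    for t in thresholds:
        tp = sum(1 for c in diseased if c >= t)  # diseased, called positive
        fp = sum(1 for c in healthy if c >= t)   # healthy, called positive
        points.append((fp, tp))
    return points


# Sweep from a very high threshold down to a very low one.
thresholds = [1000, 20, 10, 1]
points = roc_points(ms_counts, healthy_counts, thresholds)

for t, (fp, tp) in zip(thresholds, points):
    print(f"threshold {t:>4}: false positives = {fp}, true positives = {tp}")
```

With these made-up counts the points run from (0, 0) at the threshold of 1000 to (6, 6) at the threshold of 1, with true positives rising first and false positives catching up as the threshold falls. Dividing each count by the size of its group would give the fractions normally plotted on the ROC axes.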
Three measures are commonly reported in ROC analysis:
(i) accuracy is the number of correct diagnoses divided by the total number of
diagnoses;
(ii) sensitivity is the number of true positives divided by the sum of the true
positives and false negatives; and
(iii) specificity is the number of true negatives divided by the sum of the number of
true negatives and false positives.
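As a minimal sketch, the three measures can be computed directly from the four outcome counts. The function name roc_measures and the example counts are assumptions made for illustration; they do not come from a real study.

```python
def roc_measures(tp, tn, fp, fn):
    """Compute accuracy, sensitivity and specificity from outcome counts.

    tp, tn, fp, fn are the numbers of true positives, true negatives,
    false positives and false negatives, respectively.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total      # correct diagnoses / all diagnoses
    sensitivity = tp / (tp + fn)      # fraction of diseased patients detected
    specificity = tn / (tn + fp)      # fraction of healthy patients cleared
    return accuracy, sensitivity, specificity


# Hypothetical counts for 12 patients (6 diseased, 6 healthy).
acc, sens, spec = roc_measures(tp=5, tn=4, fp=2, fn=1)
print(f"accuracy = {acc:.2f}, sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Note that the function does not guard against empty groups: if there are no diseased patients (tp + fn = 0) or no healthy patients (tn + fp = 0), the corresponding ratio is undefined and the division raises an error.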
1.2 Specificity, sensitivity and the ROC curve