Ron Keren, MD, MPH
Children's Hospital of Philadelphia
University of Pennsylvania School of Medicine
Philadelphia, PA 19104
To the Editor.
In their article comparing a clinical examination-based approach for diagnosing appendicitis with strategies emphasizing radiologic evaluation (with ultrasound and computed tomography), Kosloske et al1 state that prescreening of study patients by another physician before referral did not "affect the epidemiologic measures (sensitivity, specificity, positive predictive value, negative predictive value, and accuracy), because these measures are based on correct diagnoses, not on the proportion of subjects with appendicitis." We would like to correct this assertion and draw attention to the potential influence that disease prevalence and spectrum have on these measures of diagnostic accuracy.
The positive predictive value (probability of disease given a positive test) and negative predictive value (probability of no disease given a negative test) of a test depend strongly on the prevalence (or prior probability, in the case of an individual patient) of the disease in the population (or patient) being studied. Given a fixed test sensitivity and specificity, the positive predictive value of a test will increase and its negative predictive value will decrease as the prevalence (or prior probability) of the disease increases. Therefore, in comparing diagnostic approaches, the positive and negative predictive values cannot be meaningfully compared unless the study populations have the same prevalence of disease. Thus, the superior positive predictive value of the Kosloske et al pediatric surgical protocol compared with the study by Garcia Pena et al2 of ultrasound plus computed tomography may in large part be due to the higher prevalence of appendicitis in the Kosloske et al study sample (62%), compared with the sample that entered the Garcia Pena et al imaging protocol (50 of 139 [36%]).
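The dependence of predictive values on prevalence follows directly from Bayes' theorem and can be illustrated with a minimal sketch. The sensitivity and specificity used here (90% each) are hypothetical values chosen only for illustration and are not taken from either study; only the two prevalences (62% and 50 of 139) come from the text above. The function name is likewise our own.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from Bayes' theorem for a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical test characteristics, held fixed across both populations.
SENSITIVITY = 0.90
SPECIFICITY = 0.90

# Prevalences reported in the text: 62% and 50 of 139 (about 36%).
ppv_high, npv_high = predictive_values(SENSITIVITY, SPECIFICITY, 0.62)
ppv_low, npv_low = predictive_values(SENSITIVITY, SPECIFICITY, 50 / 139)

print(f"Prevalence 62%: PPV = {ppv_high:.3f}, NPV = {npv_high:.3f}")
print(f"Prevalence 36%: PPV = {ppv_low:.3f}, NPV = {npv_low:.3f}")
```

Under these assumed values, the same test yields a higher positive predictive value and a lower negative predictive value in the higher-prevalence sample, even though nothing about the test itself has changed.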
It is commonly assumed that test sensitivity (probability of a positive test given disease) and specificity (probability of a negative test given no disease) are independent of disease prevalence and therefore are the more appropriate metrics for comparing the accuracy of diagnostic approaches. However, several studies have demonstrated variation in a diagnostic test's sensitivity and specificity because of the spectrum of clinical and pathologic characteristics of the patients to whom it is applied.3,4 This subgroup variation in test performance has been termed "spectrum bias" or "spectrum effect." For example, it has been shown that studies in populations that have either increased disease severity or disease prevalence uniformly result in increased test sensitivity and occasionally increased test specificity.5,6 We know that the Kosloske et al study sample had a high rate of appendicitis, which may have increased the sensitivity of their diagnostic approach. But we also question whether their patients presented with more severe symptoms, further contributing to inflation of their sensitivity measure. Their patients were referred from a large number of surrounding rural counties, whereas other studies were conducted in urban tertiary care centers, with a large proportion of the appendicitis patient population residing in close proximity to the hospital. The time from onset of symptoms to the institution of a diagnostic protocol is likely to be greater in rural settings than in urban ones, and thus the severity of appendicitis at presentation may be greater in rural settings, resulting in an inflated estimate of the sensitivity of diagnostic protocols.
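One way to picture the spectrum effect is to treat the overall sensitivity as a case-mix-weighted average of the sensitivities within severity strata. The sketch below assumes two strata ("early" and "advanced" appendicitis) with invented stratum sensitivities and invented case mixes for an urban and a rural referral population; none of these numbers come from the studies discussed, and the function name is hypothetical.

```python
def pooled_sensitivity(case_mix, stratum_sensitivity):
    """Overall sensitivity as the case-mix-weighted mean of stratum sensitivities."""
    total = sum(case_mix.values())
    return sum(
        (case_mix[stratum] / total) * stratum_sensitivity[stratum]
        for stratum in case_mix
    )


# Hypothetical: the same diagnostic approach detects advanced disease more reliably.
stratum_sensitivity = {"early": 0.80, "advanced": 0.98}

# Hypothetical case mixes among patients who truly have appendicitis.
urban_mix = {"early": 0.70, "advanced": 0.30}
rural_mix = {"early": 0.40, "advanced": 0.60}

urban = pooled_sensitivity(urban_mix, stratum_sensitivity)
rural = pooled_sensitivity(rural_mix, stratum_sensitivity)

print(f"Urban case mix: sensitivity = {urban:.3f}")
print(f"Rural case mix: sensitivity = {rural:.3f}")
```

With these assumed inputs, the identical approach appears more sensitive in the population that presents later, purely because of the shift in case mix.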
Given these observations, care must be taken not only in comparing the test properties of various proposed diagnostic strategies for acute appendicitis but also in cost analyses of such approaches. For example, patients with suspected appendicitis who are subject to identical diagnostic protocols but present from a location near the hospital (urban setting) or from far away (as might occur with an outlying rural population) will differ in the stage of presentation and thus also in the amount of observation time and the number of ancillary tests and imaging studies necessary to make a definitive diagnosis. Simply put, a diagnostic protocol implemented in a rural setting may seem more cost-effective than an otherwise identical protocol implemented in an urban setting.
The argument by Kosloske et al for early pediatric surgical evaluation and clinical examination over imaging-based approaches is a very sound one, but additional research must be conducted to determine whether the test characteristics and cost-effectiveness of such a strategy are generalizable to other clinical settings.