Objective. Computed tomography (CT) has gained widespread acceptance in the evaluation of children with suspected appendicitis. Concern has been raised regarding the long-term effects of ionizing radiation. Other means of diagnosing appendicitis, such as clinical scores, are lacking in children. We sought to develop a clinical decision rule to predict which children with acute abdominal pain do not have appendicitis.
Methods. A prospective cohort study was conducted of children and adolescents who were aged 3 to 18 years, had signs and symptoms suspicious for appendicitis, and presented to the emergency department between April 2003 and July 2004. Standardized data-collection forms were completed on eligible patients. Two low-risk clinical decision rules were created and validated using logistic regression and recursive partitioning. The sensitivity, negative predictive value (NPV), and negative likelihood ratio of each clinical rule were compared.
Results. A total of 601 patients were enrolled. Using logistic regression, we created a 6-part score that consisted of nausea (2 points), history of focal right lower quadrant pain (2 points), migration of pain (1 point), difficulty walking (1 point), rebound tenderness/pain with percussion (2 points), and absolute neutrophil count of >6.75 × 10³/μL (6 points). A score ≤5 had a sensitivity of 96.3% (95% confidence interval [CI]: 87.5–99.0), NPV of 95.6% (95% CI: 90.8–99.0), and negative likelihood ratio of 0.102 (95% CI: 0.026–0.405) in the validation set. Using recursive partitioning, a second low-risk decision rule was developed consisting of absolute neutrophil count of <6.75 × 10³/μL, absence of nausea, and absence of maximal tenderness in the right lower quadrant. This rule had a sensitivity of 98.1% (95% CI: 90.1–99.9), NPV of 97.5% (95% CI: 86.8–99.9), and negative likelihood ratio of 0.058 (95% CI: 0.008–0.411) in the validation set. Theoretical application of the low-risk rules would have resulted in a 20% reduction in CT.
Conclusions. Our low-risk decision rules can predict accurately which children are at low risk for appendicitis and could be treated safely with careful observation rather than CT examination.
Appendicitis is the most common surgical emergency in children, with >70000 pediatric appendectomies performed annually in the United States.1,2 The diagnosis of appendicitis in children remains challenging because of the overlapping symptoms of many childhood illnesses and the preverbal state of young children.3,4
Delayed diagnosis of appendicitis is associated with increased morbidity, mortality, and health care costs.5–9 Recent advances in computed tomography (CT) have led to modest improvements in the negative appendectomy rate and in the diagnosis of appendicitis.10–14 However, the indiscriminate use of CT carries significant risks as a result of increased exposure to ionizing radiation and may result in increased health care costs.15–17 These concerns have led to renewed interest in clinical scores to better diagnose appendicitis.
The “MANTRALS” score was proposed by Alvarado in 1986 as a method to predict appendicitis in adults.18 Subsequent studies to validate the score in adult and pediatric cohorts have had mixed results.19–22 Other clinical scores have been proposed to predict appendicitis in children; however, they have suffered from small sample size or have not been validated.23–25 In this study, we prospectively enrolled children who were under evaluation for appendicitis to develop a clinical rule that is specific to a pediatric population. We sought to develop and validate low-risk criteria to define patients who could be observed or discharged safely without reliance on CT.
Study Setting and Population
This study was conducted at an urban, tertiary-care, pediatric emergency department (ED) that has ∼52000 visits per year. From April 2003 to July 2004, we prospectively enrolled children who were between 3 and 18 years of age and underwent surgical consultation for possible appendicitis. The hospital's clinical practice guideline mandates that a surgeon (a fourth-year surgical resident or attending) evaluate all patients who present to the ED with signs and symptoms suspicious for appendicitis before any diagnostic imaging, such as CT or ultrasound (US). The decision to obtain a CT, US, or both was made jointly by the surgeon and the pediatric emergency medicine (PEM) attending. Commonly, female adolescents would undergo US before any CT evaluation. Patients were excluded when they were pregnant, had undergone previous abdominal surgery, had a chronic medical condition (eg, cystic fibrosis, inflammatory bowel disease, sickle cell anemia), or had radiologic studies (CT or US) of the abdomen within the previous 2 weeks. Patients who had laboratory studies or plain radiographs before their ED evaluation were not excluded.
Standardized Patient Assessment and Data Collection
Data-collection forms were completed by the PEM attending who was responsible for the patient's care. Forms were completed before diagnostic imaging and independent of the surgeon's evaluation. Physicians were introduced to the data-collection forms during a 1-hour session before the beginning of the study. The standardized data-collection forms consisted of 24 demographic, historical, and physical examination variables. Historical elements were history of fever, nausea, anorexia, emesis, and diarrhea; number of hours of pain; migration of pain; history of focal right lower quadrant (RLQ) pain; pain onset; pain quality; and ability to walk. Physical examination variables were location of abdominal tenderness and point of maximal tenderness; presence of tenderness with percussion/cough/hopping; rebound tenderness; guarding; rectal tenderness; bowel sounds; costovertebral angle tenderness; and psoas, obturator, or Rovsing's sign. For patients who underwent a pelvic examination, presence of adnexal pain and/or cervical motion tenderness was recorded.
The patient's medical record was abstracted by one author (A.B.K.) for laboratory (white blood cell [WBC] count, percentage of neutrophils, and absolute neutrophil count [ANC]), radiology, pathology, and operative reports. Data were entered into a secure data management program by a single, supervised research assistant. All entered data were double-checked for accuracy. To determine our capture rate, we reviewed the ED electronic charting system, the ED admission log, and the pathology database.
Our main outcome was the presence or absence of appendicitis. Final diagnosis was determined by pathology for patients who had an appendectomy. A perforated appendix was determined by the attending surgeon's written postoperative diagnosis. For patients who did not have surgery, the outcome was confirmed by a follow-up telephone call 2 to 4 weeks after the ED visit. When the family could not be reached, the patient's pediatrician was contacted to determine the final diagnosis.
This study was approved by the hospital's committee on clinical investigations. Informed consent was obtained from all participating PEM physicians. Informed consent was also obtained from all parents, and assent was obtained from children who were older than 7 years.
Data Analysis: Derivation of Clinical Decision Rule
We performed χ² testing for categorical variables to identify potential predictors for appendicitis (SPSS 11.5; SPSS Inc, Chicago, IL). Threshold values (cutoffs) for continuous variables (WBC count and ANC) were obtained from univariate recursive partitioning analyses. Variables that were highly associated with appendicitis (P < .001) were analyzed by 2 separate statistical methods, logistic regression and recursive partitioning, to create a clinical decision rule (see below). Patients who were enrolled from April 2003 through February 2004 were included in the derivation set.
Variables that were highly associated with appendicitis (P < .001 on χ² testing) and had <10% missing data were selected for model creation. Selected predictors were entered in a logistic-regression model using backward stepwise elimination. To develop the clinical score, the β coefficients for the retained predictors were converted into integer values by dividing by the lowest β value and rounding any decimals. The score that optimized the negative likelihood ratio and negative predictive value (NPV) was chosen as the low-risk score cutoff.
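The conversion of β coefficients into integer points described above can be sketched as follows. This is a minimal illustration, not the authors' SPSS procedure: the function name `betas_to_points`, the predictor names, and the coefficient values are all hypothetical, and rounding halves upward is an assumption (the text says only that decimals were rounded).

```python
import math


def betas_to_points(betas):
    """Convert logistic-regression coefficients to integer score points.

    Each coefficient is divided by the smallest coefficient, and the
    ratio is rounded to the nearest integer. Halves are rounded up,
    which is an assumption; the source does not state a tie rule.
    """
    if not betas:
        raise ValueError("need at least one coefficient")
    smallest = min(betas.values())
    if smallest <= 0:
        raise ValueError("this sketch expects positive coefficients")
    return {
        name: math.floor(beta / smallest + 0.5)
        for name, beta in betas.items()
    }


# Hypothetical coefficients for illustration only; NOT values from the study.
example_betas = {"predictor_a": 0.5, "predictor_b": 1.1, "predictor_c": 2.9}
example_points = betas_to_points(example_betas)
print(example_points)
```

With these made-up inputs, the smallest coefficient maps to 1 point and the others scale relative to it, which is how a predictor with a much larger coefficient can end up carrying most of the score.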
We used recursive partitioning (CART 5.0; Salford Systems, San Diego, CA) software to create decision trees. Variables that were highly associated with appendicitis (P < .001 on χ² testing) were entered into the model. The Gini method for classification trees and 10-fold cross-validation were used. Misclassifying a patient with appendicitis as low risk was weighted to be 20 times worse than misclassifying a patient without appendicitis as non–low risk.
Validation of Clinical Decision Rules and Score Performance
The validation set was composed of patients who were enrolled consecutively from March 2004 to July 2004. The clinical decision rules that were derived from logistic regression and recursive partitioning were applied to the validation set. Sensitivity, specificity, and negative likelihood ratio of the clinical scores were determined.
During the 15-month study period, 4140 patients who were between 3 and 18 years of age presented to the ED with a chief complaint of abdominal pain. Of these, 767 (19%) underwent surgical consultation to evaluate for possible appendicitis. A total of 113 patients met criteria for exclusion, leaving 654 patients eligible for enrollment. Fifty-three patients were missed or not approached to participate, and 601 patients were enrolled (425 in the derivation set and 176 in the validation set), for a capture rate of 92%.
The median age was 11.6 years (interquartile range: 8.2–14.6). A total of 307 (51%) patients were male. A total of 211 (35%) patients received a diagnosis of appendicitis. Thirty-eight (22%) of the 211 patients with appendicitis had a perforated appendix. Follow-up was completed on 593 (99%) patients. No patient who was discharged presented to another hospital for an appendectomy. The derivation and validation groups did not differ significantly by age, gender, or proportion with appendicitis or perforated appendicitis (Table 1). The clinical characteristics of the patients enrolled are detailed in Table 1. The eligible patients who were not enrolled were of a similar age and gender as the enrolled cohort but had a significantly lower rate of appendicitis (13%).
Clinical Course and Disposition
After examination by the PEM attending, 507 (84%) patients underwent diagnostic imaging. A total of 416 (69%) patients were imaged with CT, 219 (36.4%) with US, and 128 (21%) with both. After evaluation, 303 (50%) patients were discharged from the hospital, 221 (37%) went directly to the operating room, and 77 (13%) were admitted for observation. Seventeen patients who were taken to the operating room had a normal appendix, for a negative appendectomy rate of 8.0%. Six patients who were discharged from the hospital ultimately underwent an appendectomy, 4 of whom had appendicitis.
Derivation of Clinical Decision Rule: Univariate Analysis
Recursive partitioning analysis determined the ideal cut points for the continuous variables ANC and WBC count to be 6.75 × 10³/μL and 8.85 × 10³/μL, respectively. These variables were converted to categorical variables on the basis of these cut points. The relationship between each categorical variable and appendicitis was evaluated with χ² analysis (Table 2).
Twelve of 24 clinical predictors met criteria for entry into backward stepwise logistic regression. Six predictors were retained from this analysis: (1) nausea, (2) history of focal RLQ pain, (3) migration of pain, (4) difficulty walking, (5) rebound tenderness, and (6) ANC >6.75 × 10³/μL (Table 3). Rebound pain and pain with percussion had a high degree of collinearity and clinical overlap; therefore, the variables were combined. The diagnostic weights that were applied to each of the 6 predictors are listed in Table 3. The clinical score that was developed using logistic regression was applied to the derivation group of patients. A score of ≤5 identified 108 (25%) patients, 106 who did not have appendicitis and 2 who did (Fig 1). The sensitivity and the NPV for this score cutoff are 98.7% (95% confidence interval [CI]: 95.5–99.9) and 98.1% (95% CI: 93.5–99.7), respectively. The sensitivity and the specificity are described graphically in a receiver-operator curve (Fig 2). The negative likelihood ratio (LR) for a score ≤5 is 0.032 (95% CI: 0.008–0.128).
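As an illustration of how the 6-part score could be tallied, the sketch below uses the point weights and the ≤5 cutoff reported in the abstract. The class and function names are hypothetical, the example patients are invented, and treating an ANC of exactly 6.75 × 10³/μL as not elevated is an assumption, because the text does not address that boundary. It is a reading aid, not a clinical tool.

```python
from dataclasses import dataclass

ANC_CUTOFF = 6.75        # absolute neutrophil count cutoff, x 10^3/uL
LOW_RISK_MAX_SCORE = 5   # a score of 5 or less is classified as low risk


@dataclass
class Findings:
    nausea: bool
    focal_rlq_pain_history: bool
    migration_of_pain: bool
    difficulty_walking: bool
    rebound_or_percussion_pain: bool
    anc: float  # x 10^3/uL


def appendicitis_score(f: Findings) -> int:
    """Sum the point weights reported in the abstract (maximum 14)."""
    return (
        2 * int(f.nausea)
        + 2 * int(f.focal_rlq_pain_history)
        + 1 * int(f.migration_of_pain)
        + 1 * int(f.difficulty_walking)
        + 2 * int(f.rebound_or_percussion_pain)
        + 6 * int(f.anc > ANC_CUTOFF)  # boundary handling is an assumption
    )


def is_low_risk_by_score(f: Findings) -> bool:
    return appendicitis_score(f) <= LOW_RISK_MAX_SCORE


# Invented patients for illustration only.
patient_a = Findings(True, False, True, False, False, anc=5.0)
patient_b = Findings(False, False, False, False, False, anc=9.0)
print(appendicitis_score(patient_a), is_low_risk_by_score(patient_a))
print(appendicitis_score(patient_b), is_low_risk_by_score(patient_b))
```

Because an elevated ANC alone contributes 6 points, any patient above the ANC cutoff exceeds the low-risk threshold regardless of the other findings.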
Validation of Logistic-Regression Model
The clinical score that was created with logistic regression was applied to the 176 patients in the validation set. Table 4 outlines the performance of the score on this validation set. A score of ≤5 would identify 46 patients, 44 who did not have appendicitis. The sensitivity and the NPV for this score cutoff are 96.3% (95% CI: 87.5–99) and 95.6% (95% CI: 90.8–99), respectively. The negative LR is 0.102 (95% CI: 0.026–0.405).
Sixteen variables met criteria for entry into the recursive partitioning model. Recursive partitioning analysis created a tree that used ANC >6.75 × 10³/μL, nausea, and maximal tenderness in the RLQ. The low-risk decision tree identified 91 (21%) patients, none of whom had appendicitis (Fig 3). Surrogate variables for nausea included the presence of emesis and anorexia. A history of focal RLQ pain was a surrogate for maximal tenderness in the RLQ. In the derivation set, the sensitivity for this low-risk decision rule is 100% (95% CI: 97.7–100) with an NPV of 100% (95% CI: 96.0–100). The negative LR for the decision tree is 0 (upper 95% CI estimated to be <0.001; Table 4).
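The low-risk branch of this tree, as worded in the abstract (ANC below the cutoff, no nausea, and no maximal tenderness in the RLQ), can be written as a single predicate. The function and parameter names below are hypothetical, the surrogate variables are not modeled, and the example inputs are invented; this is a sketch of the rule's logic rather than the CART output itself.

```python
ANC_CUTOFF = 6.75  # absolute neutrophil count cutoff, x 10^3/uL


def is_low_risk_by_tree(anc: float, nausea: bool, max_tenderness_rlq: bool) -> bool:
    """Return True only when all three low-risk conditions hold.

    An ANC exactly at the cutoff is treated as not low risk here,
    which is an assumption; the text does not address that boundary.
    """
    return anc < ANC_CUTOFF and not nausea and not max_tenderness_rlq


# Invented inputs for illustration only.
print(is_low_risk_by_tree(anc=4.2, nausea=False, max_tenderness_rlq=False))
print(is_low_risk_by_tree(anc=4.2, nausea=True, max_tenderness_rlq=False))
```

A patient who fails any one of the three conditions falls outside the low-risk group.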
Validation of Recursive Partitioning Model
The decision tree that was created by recursive partitioning was applied to the 176 patients in the validation set. The low-risk decision rule identified 40 patients, 1 of whom had appendicitis. This provides a sensitivity of 98.1% (95% CI: 90.1–99.9) and an NPV of 97.5% (95% CI: 86.8–99.9; Table 4). The negative LR is 0.058 (95% CI: 0.008–0.411).
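The validation figures for both rules can be checked from 2 × 2 counts. In the sketch below, a "negative" test means the patient was classified as low risk. The low-risk counts (46 patients with 2 cases of appendicitis for the score; 40 patients with 1 case for the tree) come from the text. The total of 54 appendicitis cases among the 176 validation patients is an assumption inferred from the reported sensitivities, not a number stated in the article. The function name is hypothetical.

```python
def low_risk_rule_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Compute sensitivity, specificity, NPV, and negative likelihood ratio.

    tp: appendicitis, classified non-low risk
    fn: appendicitis, classified low risk (missed)
    fp: no appendicitis, classified non-low risk
    tn: no appendicitis, classified low risk
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    npv = tn / (tn + fn)
    negative_lr = (1 - sensitivity) / specificity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "npv": npv,
        "negative_lr": negative_lr,
    }


# Validation set, n = 176; 54 appendicitis cases is an inferred assumption.
score_metrics = low_risk_rule_metrics(tp=52, fn=2, fp=78, tn=44)
tree_metrics = low_risk_rule_metrics(tp=53, fn=1, fp=83, tn=39)

for label, m in (("score", score_metrics), ("tree", tree_metrics)):
    print(label, {k: round(v, 3) for k, v in m.items()})
```

Under that assumption, the computed values agree closely with the reported sensitivity, NPV, and negative LR for both rules, with small differences in the last digit that are consistent with rounding.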
Reduction in CT Utilization and Operative Care
We applied to our entire study cohort the clinical decision rules that were created from logistic regression and recursive partitioning to determine which patients theoretically could have avoided CT or operative care. Application of the clinical rules such that low-risk patients were observed rather than imaged would have reduced the use of CT by 23% for the logistic regression model and 20% for the recursive partitioning model. Appendectomies were performed in 6 low-risk patients, as classified by both models; all of these low-risk patients had a false-positive CT examination, a normal appendix on pathology, and no other indications for surgery.
Patients Who Were Missed by Clinical Decision Rules
A total of 4 patients with appendicitis were misclassified as low risk in our models. Logistic regression missed 2 patients in the derivation set and 2 in the validation set. Recursive partitioning missed 0 patients in the derivation set and 1 patient in the validation set. None of these 4 patients was found to have a perforated appendix in the operating room. Table 5 summarizes the clinical characteristics of these 4 patients.
We report the largest prospectively designed and validated study to create a decision rule for appendicitis in children. We evaluated 601 children with signs and symptoms suggestive of appendicitis. Using logistic regression and recursive partitioning, we derived and validated 2 clinical decision rules to identify children who were at low risk for appendicitis. Application of either low-risk rule to patients with signs and symptoms suspicious for appendicitis would lead to decreased reliance on CT. In addition, use of our clinical rules could potentially prevent a small group of patients who are identified as low risk for appendicitis from undergoing an appendectomy.
Previous authors have developed clinical scores to predict appendicitis in adults and children. The first score to gain widespread attention was described by Alvarado. The score components were migration (1 point), anorexia (1 point), nausea (1 point), tenderness in the RLQ (2 points), rebound pain (1 point), elevated temperature (1 point), leukocytosis (2 points), and neutrophilia (1 point). This score was derived retrospectively in adults and had a reported sensitivity of 75% and specificity of 84%.18 Prospective evaluation of the Alvarado score in children has shown varied results, with sensitivities of 76% to 90% and specificities of 72% to 79%.20,21,26 The Alvarado score's performance is unacceptably low for current clinical practice.
To address this issue further, investigators have attempted to derive clinical scores for appendicitis in pediatric cohorts. Samuel derived the “Pediatric Appendicitis Score” prospectively over 5 years in 1170 children using logistic regression. Variables that were included in the score were anorexia, nausea/emesis, migration of pain, tenderness in the RLQ, cough/hopping/percussion tenderness, pyrexia, leukocytosis, and neutrophilia. He reported a sensitivity of 100%, specificity of 92%, positive predictive value of 96%, and NPV of 99%.24 Unfortunately, the score was not validated. We applied the Samuel score to our cohort and found a significantly lower sensitivity (83%) and NPV (88%). More recently, van den Broek et al25 prospectively derived a clinical score to predict appendicitis in 99 children, also using logistic regression. The score was validated at 2 other clinical centers in the Netherlands. They found a temperature of >38°C, WBC count of >10.10 × 10³/μL, and rebound tenderness to be correlated with appendicitis. The sensitivity and specificity were 89% and 85%, respectively. The authors concluded that pediatric patients who have a leukocyte count of <10.10 × 10³/μL and lack rebound tenderness could be observed safely.25 Although this study contained small numbers of patients and was conducted in an era before significant use of CT (1994–1995), it revealed that a low-risk group of patients with suspected appendicitis could be identified.
Our clinical decision rules expand on this previous research by defining further a group of children who have significant abdominal pain and do not have appendicitis. Our goal was to identify children who could be observed safely without CT examination. Low-level exposure to ionizing radiation is not without risks.27,28 For example, Brenner et al17 estimated that of the 600000 children who undergo CT scans each year, 500 will develop cancer as a direct result of their radiation exposure. Given this concern, physicians should consider strategies that minimize radiation exposure. Use of our clinical decision rules theoretically would have led to a 20% reduction in CT utilization in our cohort. An unanticipated additional benefit of the models was the identification of several patients who were classified as low risk by the models and had false-positive CT scans that led to a negative appendectomy.
Physicians may be reluctant to adopt our low-risk decision rules given previous studies that demonstrated a reduction in negative appendectomy rates and perforation rates associated with the use of CT.12–14,29–32 However, diagnostic imaging may not be as useful as previously believed. Recently, Partrick et al33 and Martin et al34 reported in large cohorts that despite an increase in CT scan utilization, the negative appendectomy rates and perforation rates remained unchanged. In addition, in 1999, Karakas et al35 reported on the utility of CT and US in 633 children with suspected appendicitis. Ironically, in this study, the perforation rate increased substantially among pediatric patients who underwent a CT scan and US before appendectomy. The authors postulated that the delay associated with obtaining these studies before surgery could be responsible for the increased perforation rate.35 Finally, in 2001, Flum et al36 published a population-based analysis of appendicitis in Washington State. They reviewed 85790 patients who underwent an appendectomy and found no change in the negative appendectomy rate despite an increase in CT utilization.36 It is interesting that the authors of the first major study on the utility of CT in children foresaw potential for problems, writing “the indiscriminate use of CT rectal contrast could potentially result in delay in diagnosis as well as unnecessary radiation exposure.”10
Our study has several strengths. First, our decision rules were derived prospectively in a large pediatric cohort. We were able to follow up on essentially all patients who were enrolled in our study, thus confirming the patients' final diagnosis. By using 2 statistical techniques, we have provided the clinician the option of a clinical score (logistic regression) or a decision tree (recursive partitioning), recognizing that some may prefer one type to the other. By validating our decision rules in a separate group of patients, we have accounted for the possibility of overfitting of data in our derivation set. Most important, our models' high NPVs and low likelihood ratios in the validation groups indicate significant clinical utility.
One surprising finding of our study is the strong association between low ANC and the absence of appendicitis among patients with signs and symptoms suspicious for appendicitis. In support of our finding, a recently published meta-analysis of clinical and laboratory predictors of appendicitis found that low levels of serum inflammatory markers were strongly associated with the absence of appendicitis. In this study, the presence of a WBC count of <9.0 × 10³/μL and <75% neutrophils had a negative likelihood ratio of 0.17 (95% CI: 0.07–0.42) for appendicitis.37 In addition, our results are similar to those of Garcia Pena et al,11 who also used recursive partitioning to stratify children into risk groups for appendicitis. This retrospective study evaluated 958 children who were admitted to the hospital with suspected appendicitis and found that neutrophils <67% and bands <5% could identify patients who did not have appendicitis. This relationship between low ANC and absence of appendicitis may exist only among a highly selective group of patients in whom the clinician has a strong clinical suspicion of appendicitis.
It is important to note the following limitations of our study. We used data-collection forms, an inherently subjective method for gathering data. In addition, we did not calculate κ values for the clinical variables that we collected on individual patients. Ethically, it would have been concerning to have multiple physicians conduct physical examinations on children who were in pain if these additional examinations would be unlikely to affect the patients' care. Despite these limitations, our findings agreed with previous research that nausea, emesis, migration of pain to RLQ, and rebound pain were clinical variables strongly associated with appendicitis.23,38–43 Furthermore, our recursive partitioning analysis revealed that anorexia and emesis were surrogates for nausea in the low-risk rule, supporting the clinical utility of this variable. Next, the decision to obtain a surgical consultation, part of the entry criteria into our study, was based on a clinical practice guideline specific to this institution. The threshold for obtaining a surgical consult is likely to vary at other institutions. Last, our decision rules were not perfect. In the validation sets, 2 patients were missed by the logistic-regression model and 1 patient was missed by the recursive partitioning model. The patient who was misclassified by recursive partitioning was an 11-year-old girl who underwent US and CT during her first visit in the ED, both of which were negative for appendicitis. She was discharged from the hospital but returned 48 hours later for an appendectomy (no perforation). Before our low-risk rules are applied clinically, they must be validated in other clinical settings. Most important, the decision rules can be used only for this highly selective population of pediatric patients with suspected appendicitis; application of the rules to all pediatric patients with abdominal pain would be inappropriate and thereby mislead the clinician.
We have derived 2 clinical rules to identify patients who are at low risk for appendicitis. Although both decision rules performed similarly, some clinicians may prefer a clinical score rather than a decision tree. Logistic regression was used to develop a 6-part score that consists of nausea (2 points), history of focal RLQ pain (2 points), migration of pain (1 point), difficulty walking (1 point), rebound tenderness/pain with percussion (2 points), and ANC >6.75 × 10³/μL (6 points). A score ≤5 achieved high sensitivity and NPV in both our derivation and validation groups. Using recursive partitioning, we determined that the combination of ANC <6.75 × 10³/μL, absence of nausea (or emesis or anorexia), and absence of maximal tenderness in the RLQ essentially excluded appendicitis in our derivation and validation groups. Our opinion is that the recursive partitioning model has the advantage of simplicity and therefore should be the model of choice. Pediatric patients who have suspected appendicitis and are low risk by either model should be considered for observation rather than undergo CT scan or operative care.
We thank Robert Wright, MD, MPH, and Elyse Olshen, MD, MPH, for statistical assistance and thoughtful reviews of this manuscript. We also thank the PEM attending physicians, fellows, surgeons, and radiologists at Children's Hospital Boston for assistance with this study.
- Accepted May 2, 2005.
- Reprint requests to (A.B.K.) Division of Emergency Medicine, Children's Hospital Boston, 300 Longwood Ave, Boston, MA 02115. E-mail:
No conflict of interest declared.
This study was presented in part at the annual meeting of the Pediatric Academic Societies; May 2, 2004; San Francisco, CA; and the annual meeting of the American Academy of Pediatrics; October 8, 2004; San Francisco, CA.
- Addiss DG, Shaffer N, Fowler BS, Tauxe RV. The epidemiology of appendicitis and appendectomy in the United States. Am J Epidemiol. 1990;132:910–925
- HCUPnet, Healthcare Cost and Utilization Project. Available at: http://hcup.ahrq.gov/HCUPnet.asp. Accessed August 1, 2004
- Scholer SJ, Pituch K, Orr DP, Dittus RS. Clinical outcomes of children with acute abdominal pain. Pediatrics. 1996;98:680–685
- Brender JD, Marcuse EK, Koepsell TD, Hatch EI. Childhood appendicitis: factors associated with perforation. Pediatrics. 1985;76:301–306
- Garcia Pena BM, Cook EF, Mandl KD. Selective imaging strategies for the diagnosis of appendicitis in children. Pediatrics. 2004;113:24–28
- Pena BM, Taylor GA, Lund DP, Mandl KD. Effect of computed tomography on patient management and costs in children with suspected appendicitis. Pediatrics. 1999;104:440–446
- Brenner DJ, Elliston CD, Hall EJ, Berdon WE. Estimates of the cancer risks from pediatric CT radiation are not merely theoretical: comment on “point/counterpoint: in x-ray computed tomography, technique factors should be selected appropriate to patient size. against the proposition.” Med Phys. 2001;28:2387–2388
- Radiation and Pediatric Computed Tomography. Available at: www.nci.nih.gov/cancertopics/causes/radiation-risks-pediatric-CT. Accessed October 22, 2004
- Pena BM, Taylor GA, Fishman SJ, Mandl KD. Costs and effectiveness of ultrasonography and limited computed tomography for diagnosing appendicitis in children. Pediatrics. 2000;106:672–676
- Pena BM, Taylor GA, Fishman SJ, Mandl KD. Effect of an imaging protocol on clinical outcomes among pediatric patients with appendicitis. Pediatrics. 2002;110:1088–1093
- Williams N, Kapila L. Acute appendicitis in the preschool child. Arch Dis Child. 1991;66:1270–1272
- Copyright © 2005 by the American Academy of Pediatrics