BACKGROUND AND OBJECTIVE: There is a lack of broadly applicable measures for risk adjustment in pediatric surgical patients necessary for improving outcomes and patient safety. Our objective was to develop a risk stratification model that predicts mortality after surgical operations in children.
METHODS: The model was created by using inpatient databases from 1988 to 2006. Patients younger than 18 years who underwent an inpatient surgical procedure as identified by using the International Classification of Diseases, Ninth Revision, Clinical Modification, coding were included. A 7-point scale was developed with 70 variables selected for their predictive value for mortality using multivariate analysis. This model was evaluated with receiver operating characteristic (ROC) analysis and compared with the Charlson Comorbidity Index (CCI) in two separate validation data sets.
RESULTS: A total of 2 087 915 patients were identified in the training data set. Generated risk scores positively correlated with inpatient mortality. In the training data set, the ROC was 0.949 (95% confidence interval [CI]: 0.947, 0.950). In the first validation data set, the ROC was 0.959 (95% CI: 0.952, 0.967) compared with the CCI ROC of 0.596 (95% CI: 0.575, 0.616). In the second validation data set, the ROC was 0.901 (95% CI: 0.885, 0.917) and the CCI ROC was 0.587 (95% CI: 0.562, 0.611).
CONCLUSIONS: This study depicts creation of a broadly applicable model for risk adjustment that predicts inpatient mortality with more reliability than current risk indexes in pediatric surgical patients. This risk index will allow comorbidity-adjusted outcomes broadly in pediatric surgery.
- AHRQ: Agency for Healthcare Research and Quality
- AUC: area under the curve
- CCI: Charlson Comorbidity Index
- CCS: Clinical Classifications Software
- CI: confidence interval
- ICD-9-CM: International Classification of Diseases, Ninth Revision, Clinical Modification
- KID: Kids’ Inpatient Database
- NIS: Nationwide Inpatient Sample
- OSHPD: California Office of Statewide Health Planning and Development
- ROC: receiver operating characteristic
What’s Known on This Subject:
Current measures of risk stratification in the pediatric surgical literature are specialty specific. Although these risk scores have been validated as useful predictors of adverse outcomes, no measures currently exist to assess the full spectrum of pediatric surgery.
What This Study Adds:
Our study generates a multispecialty mortality risk score for pediatric surgical patients that can be used by physicians to identify high-risk patients as well as provide a measure of risk adjustment for surgical outcomes.
An increasing recognition of the importance of surgical outcomes and their impact on patient safety and quality of care has resulted in numerous programs that capture comprehensive patient data. Identifying areas of quality improvement requires clear comparisons of performance across different institutions. Consequently, effective programs must account for a wide range of factors that may influence a patient’s clinical course to generate equitable comparisons. Outcomes studies commonly adjust for readily available patient characteristics such as age, gender, race, and insurance status. Provider- and hospital-specific factors, such as surgeon and hospital procedure volume, have been shown to correlate with surgical outcomes, but predictions vary with the complexity of the procedure performed.1 Other patient risk factors, particularly preexisting comorbidities, must be integrated into predictive models to provide reliable comparisons.
Comorbid conditions present a significant challenge in interpreting the results of outcomes studies. Comorbidities influence not only the outcome in question but also the type of treatment/intervention given, prognosis, and detection, thus potentially obscuring the relationship between the intervention under study and the index disease. Measures to account for comorbidities can adjust for confounding, identify effect modification, and improve statistical analysis.2 One such method is the use of comorbidity indexes, which are tools used to stratify patients by similar clinical risks, thus accounting for severity of illnesses and overall status of the patient when making comparisons. A comorbidity index condenses a patient’s current comorbid conditions into a single numeric score by using the number and severity of coexistent conditions.3 There are a number of validated and reliable indexes used to account for patient comorbidities in the adult literature; however, specific methodologies to adjust for comorbidities are currently limited in the pediatric surgical literature.
Current measures of risk stratification in the pediatric surgical literature are specialty specific. The Risk Adjustment for Congenital Heart Surgery (RACHS-1) and the Aristotle complexity score have been reported.4,5 The American College of Surgeons National Trauma Data Bank and the National Pediatric Trauma Registry compile information regarding pediatric trauma care, and Tepas et al have devised risk models to predict pediatric trauma outcomes.6–8 Although these specialty-specific risk scores have been validated as useful predictors of adverse outcomes, no measures currently exist to assess the full spectrum of pediatric surgery.
Our study generated a multispecialty risk score for pediatric surgical patients. To do so, we screened 93 million admissions to identify 2 169 989 surgical admissions in pediatric patients (<18 years of age). A retrospective analysis was performed to provide comprehensive data on the relationship between preoperative comorbidities and postoperative mortality.
This study was deemed exempt from full committee review by the Johns Hopkins Institutional Review Board.
This is a retrospective analysis that used a nonoverlapping combination of the Nationwide Inpatient Sample (NIS) and Kids’ Inpatient Database (KID) from 1988 to 2005 for the development set and KID 2006 for the first validation set. Both databases are publicly available and released by the Agency for Healthcare Research and Quality (AHRQ) under the Healthcare Cost and Utilization Project (HCUP). The NIS is an all-payer database that annually contains information from up to 8 million inpatient discharges from ∼1000 hospitals across the United States. Hospitals are sampled to represent a 20% stratified sample of all community hospitals.9 The KID contains a sample of pediatric (age ≤20 years) discharges from all community, nonrehabilitation hospitals in states that participate in the HCUP. Unlike the NIS, which samples at the hospital level, the KID samples patient discharges that are then weighted to obtain national estimates.10 Given its more robust annual pediatric sample size, the KID was used for years when it was available (1997, 2000, 2003, and 2006), whereas the NIS was used for all other years.
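The source-selection rule described above can be sketched in a few lines. This is only an illustration of the stated rule (KID in its release years, NIS otherwise, with 2006 held out for validation); the function names and the year bounds passed in are assumptions for the example, not part of the study's code.

```python
# Years in which the KID was available, as listed in the text.
KID_YEARS = {1997, 2000, 2003, 2006}


def source_for_year(year: int) -> str:
    """Return the database that supplies discharges for a given year."""
    return "KID" if year in KID_YEARS else "NIS"


def development_sources(first_year: int, last_year: int,
                        validation_year: int = 2006) -> dict:
    """Map each development year to its database (hypothetical helper).

    The validation year is skipped so that the development and
    validation sets do not overlap.
    """
    return {
        year: source_for_year(year)
        for year in range(first_year, last_year + 1)
        if year != validation_year
    }


if __name__ == "__main__":
    # Example call with illustrative bounds only.
    print(development_sources(2002, 2006))
```

Because each year draws from exactly one database, no discharge can be counted in both the NIS and the KID.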
The second validation data set was obtained from patient discharge data made available by the California Office of Statewide Health Planning and Development (OSHPD) from 2005 to 2007. The OSHPD provides public data sets of inpatient data containing a record for each inpatient discharged from a California-licensed hospital. These data contain designations of “present on admission” for unique diagnosis codes to distinguish conditions obtained during that hospital admission from conditions existing before that admission. This feature allows improved validation of the index because it will only include those conditions present on admission, effectively excluding complications related to care.
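As a minimal sketch of how a “present on admission” flag restricts scoring to preexisting conditions, the snippet below filters a discharge's diagnoses before any risk score is computed. The record layout, field names, and placeholder codes are assumptions for illustration and do not reflect the actual OSHPD file format.

```python
from typing import List, NamedTuple


class Diagnosis(NamedTuple):
    code: str                   # diagnosis code (placeholder strings here)
    present_on_admission: bool  # True if the condition predates admission


def preexisting_codes(diagnoses: List[Diagnosis]) -> List[str]:
    """Keep only diagnoses flagged as present on admission.

    Conditions acquired during the stay (possible complications of
    care) are dropped, so they cannot raise the patient's risk score.
    """
    return [d.code for d in diagnoses if d.present_on_admission]


if __name__ == "__main__":
    record = [
        Diagnosis("DX_A", True),    # existed before admission
        Diagnosis("DX_B", False),   # arose during the admission
    ]
    print(preexisting_codes(record))
```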
The inclusion criterion was as follows: patients younger than 18 years who had undergone an inpatient operative procedure as identified by using International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM), coding.
The outcome measured was in-hospital mortality. Independent variables used in the multivariate analysis included age in single-year categories, gender, and 69 medical comorbidities as defined by single-level Clinical Classifications Software (CCS), generated by the AHRQ.11 The CCS collapses the multitude of codes in ICD-9-CM (>13 600 diagnosis codes) into a smaller number of clinically meaningful categories. The single-level CCS scheme aggregates illnesses and conditions into 285 mutually exclusive, nonhierarchical categories. CCS categories were dropped or combined either because they were not statistically significant in the multiple regression models or because they were judged to not be clinically meaningful, leaving 69 diagnostic categories (15 of which were modified by our team) plus patient gender. Age <2 years was associated with elevated risk of mortality and was thus included again as a specific indicator in the final model.
The point values in our scale were derived by rounding the unexponentiated coefficients from the multiple logistic regression model. Positive coefficients were rounded to the nearest integer, whereas negative coefficients were replaced with zero, so that factors that had no impact on patient risk or were protective scored zero. The impact of replacing negative coefficients with zeros was minimal: sensitivity analyses revealed no qualitative differences between models that included them as negative coefficients (the lowest of which was only –1) or as zeros. A patient’s score was the sum of the point values for all diagnostic categories present, and the final scale grouped these sums into 7 possible categories.
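The conversion from regression coefficients to points can be illustrated as follows. This is a sketch under stated assumptions: the tie-breaking rule for values exactly halfway between integers is not specified in the text (round-half-up is assumed here), the top category of ≥6 is taken from the reported risk categories, and the coefficient values in the usage example are invented for illustration.

```python
import math
from typing import Iterable


def coefficient_to_points(beta: float) -> int:
    """Convert an unexponentiated (log-odds) coefficient to a point value.

    Negative coefficients become 0; positive ones are rounded to the
    nearest integer (halves rounded up, which is an assumption).
    """
    if beta <= 0:
        return 0
    return int(math.floor(beta + 0.5))


def risk_category(points_present: Iterable[int], cap: int = 6) -> int:
    """Sum the points of all categories present and cap at the top group.

    With cap=6 the result is one of 7 categories: 0, 1, ..., 5, or 6 (>=6).
    """
    return min(sum(points_present), cap)


if __name__ == "__main__":
    # Hypothetical coefficients for three comorbidity categories.
    betas = [1.6, -0.8, 2.5]
    points = [coefficient_to_points(b) for b in betas]
    print(points, risk_category(points))
```

Rounding on the log-odds scale keeps the points roughly additive in the same way the coefficients are additive in the logistic model, which is why the sum of points can stand in for the linear predictor.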
Model characteristics were evaluated with receiver operating characteristic (ROC) c-statistic analysis. The c-statistic was compared between the final logistic regression model on the development data set and the derived point scale on both the development and validation data sets. The first comparison determined whether the conversion to a point scale affected the diagnostic index properties. The second and third comparisons were performed for the purpose of validation. Two validations of the developed model were performed in this study. The first validation was based on applying the derived point values for each diagnostic category to KID data from 2006. The second validation used the OSHPD database from 2005 to 2007. A Hosmer-Lemeshow test was not performed because the use of a very large training set (>2 million admissions) precludes its interpretation.
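For readers unfamiliar with the c-statistic, the sketch below computes it for a discrete score as the probability that a randomly chosen patient who died has a higher score than a randomly chosen survivor, with ties counted as one half. It is a generic illustration, not the Stata procedure used in the study, and the function name and toy data are assumptions.

```python
from collections import Counter
from typing import Sequence


def c_statistic(scores: Sequence[int], died: Sequence[bool]) -> float:
    """Area under the ROC curve for a discrete risk score.

    Counts (death, survivor) pairs in which the death has the higher
    score, plus half of the tied pairs, divided by all such pairs.
    """
    deaths = Counter(s for s, d in zip(scores, died) if d)
    survivors = Counter(s for s, d in zip(scores, died) if not d)
    n_deaths = sum(deaths.values())
    n_survivors = sum(survivors.values())
    if n_deaths == 0 or n_survivors == 0:
        raise ValueError("need at least one death and one survivor")

    concordant = 0.0
    survivors_below = 0  # survivors with a strictly lower score so far
    for value in sorted(set(deaths) | set(survivors)):
        concordant += deaths[value] * (survivors_below + 0.5 * survivors[value])
        survivors_below += survivors[value]
    return concordant / (n_deaths * n_survivors)


if __name__ == "__main__":
    # Toy data: two survivors and two deaths.
    print(c_statistic([0, 2, 2, 5], [False, False, True, True]))
```

Grouping by score value keeps the computation linear in the number of patients, which matters when the score takes only a handful of values but the data set holds millions of admissions.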
Subset analysis was performed by applying the 7-point scale to patients from the KID 2006 data by age groups looking at neonates (≤28 days), infants ≤1 year of age, and toddlers ≤2 years of age. The Charlson Comorbidity Index (CCI) was also applied to the validation data sets, and the resulting ROCs were compared with the developed model. The CCI is a standardized measure of patient comorbidities determined by weighted scoring of comorbidities including cardiac, vascular, pulmonary, neurologic, endocrine, renal, hepatic, gastrointestinal, immune diseases, and any documented history of cancer.1 It was adapted for use on an administrative data set by Romano et al.12
Statistical analysis was performed by using Stata/MP version 10.0 (StataCorp LP, College Station, TX). A P value <.05 was considered to be statistically significant.
A total of 2 087 915 patients were identified in the development (training) set, 82 074 patients were identified in the first validation data set (KID 2006), and 49 209 patients were identified in the second validation data set (OSHPD 2005–2007). Patient demographic characteristics, including age, gender, and race, plus admission data for these 3 data sets, are presented in Table 1. The gender distribution was similar across all 3 data sets. The median age was younger in the OSHPD data compared with the training and first validation data sets (6 vs 10 years). An increase in the percentage of urban teaching hospitals (68.5% vs 60.7%) was noted in the KID 2006 data compared with the training data set (not available in the OSHPD). The percentages of admissions through the emergency department were 33.6% for the training data set, 38.3% for the KID 2006 data set, and 32.1% for the OSHPD data set. The overall death rate in the training data set (0.85%) was slightly higher than that in the KID 2006 (0.64%) or OSHPD (0.71%) data sets, which is perhaps a reflection of medical advances over time.
The list of comorbidity categories included in the final multivariate logistic regression model in the training data set is shown in Table 2. The distribution of patients across diagnostic categories and the observed mortality in each category are presented, along with the adjusted odds ratios for mortality associated with each category. The assigned point value is then listed for each comorbidity category.
Ultimately, each patient in each data set was individually scored and categorized according to the sum of the point values assigned to his or her diagnostic comorbidities (Table 3). Seven risk categories were evaluated, ranging from 0 to ≥6. An increase in score is generally accompanied by a decrease in the number of patients in that point-value category and an increase in death frequency. For the training data set, a risk score of 0 was associated with a <0.01% mortality rate and a score of ≥6 with a 33.5% mortality rate. In the first validation data set, a score of 0 was also associated with <0.01% mortality and a score ≥6 with 28.3% mortality. In the second validation data set, a score of 0 was associated with <0.01% mortality and a score ≥6 with 24.3% mortality.
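The tabulation of mortality by risk category can be sketched as below. The input format (one pair of risk category and death indicator per patient) and the function name are assumptions for illustration; the toy records are not study data.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def mortality_by_category(
    records: Iterable[Tuple[int, bool]],
) -> Dict[int, Tuple[int, int, float]]:
    """Summarize in-hospital mortality per risk category.

    records: (risk_category, died) pairs, one per patient.
    Returns {category: (n_patients, n_deaths, mortality_percent)},
    ordered by category.
    """
    counts = defaultdict(lambda: [0, 0])  # category -> [patients, deaths]
    for category, died in records:
        counts[category][0] += 1
        counts[category][1] += int(bool(died))
    return {
        category: (n, deaths, 100.0 * deaths / n)
        for category, (n, deaths) in sorted(counts.items())
    }


if __name__ == "__main__":
    toy = [(0, False), (0, False), (6, True), (6, False)]
    for category, row in mortality_by_category(toy).items():
        print(category, row)
```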
The ROC analysis results in this study are shown in Table 4. The area under the curve (AUC) of the original multivariate regression model was 0.955. After conversion to the 7-point scale, the AUC of the index was 0.949. That the 95% confidence interval (CI) of 0.947, 0.950 does not include the original AUC suggests a slight but statistically significant degradation of the diagnostic properties in the point conversion process. In the first validation data set, the 7-category point scale had an AUC of 0.959 (95% CI: 0.952, 0.967). Application of the CCI to the first validation data set yielded an AUC of 0.596 (95% CI: 0.575, 0.616). These results thus reveal a much higher AUC, and predictive power, for pediatric surgical mortality when using our 7-point model than when using the CCI. When applied to the second validation data set, which accounts for conditions present on admission, the AUC was 0.901 (95% CI: 0.885, 0.917), a slight decrease from the other values, but still with significant predictive value. Application of the CCI to the second validation data set (OSHPD 2005–2007) yielded an AUC of 0.587 (95% CI: 0.562, 0.611), thus revealing the higher predictive ability of the 7-point scale when accounting only for conditions present on admission. Table 5 displays the results of applying the 7-category scale to various subgroups that specify age categories (neonates, infants, toddlers). On subset analysis, the AUC of our index continued to have high predictive value, ranging from 0.843 in neonates to 0.927 in toddlers.
We performed multivariate analysis on data from inpatient records for 2 087 915 pediatric surgical patients during the period 1988–2005 to develop a 7-category point scale that reflects the risk of postoperative mortality. As shown by the ROC analysis depicted in Table 4, our surgical risk index has superior predictive ability for mortality compared with the CCI for pediatric surgical patients.
Although pediatric-specific risk scores have been reported in trauma and cardiothoracic surgery individually, our 7-category point scale is the first multispecialty risk model. This risk index will allow for the assessment of comorbidity-adjusted outcomes and risk category–specific mortality rates in all areas of pediatric surgery, which will enable not only clear comparisons of institutional performance, regardless of case-mix differences, but also accurate evaluations of changes in performance over time, or stratification of research subjects for clinical trials, all of which are crucial to improving quality of care and patient safety. Furthermore, the index may also be used to help develop a tool to provide patients’ individualized risk profiles, which would be tailored to each child’s specific comorbidities. This approach could be disease specific. Both such applications require further validation.
Because the model maintained its predictive ability with “present on admission” variables (OSHPD validation), we can infer that it has the potential to reliably identify high-risk pediatric surgical patients in need of special attention. This capability will aid physicians and other health providers in discussing the mortality risks with patients and family members, and in alerting surgical teams of areas requiring increased acuity of care. The surgical risk score can also provide a standardized measure of severity to direct referral of patients to hospitals equipped with the appropriate resources.
The CCI is one of the most commonly used measures of comorbidity in adults, particularly in oncological studies.1 This index has been used extensively in studies with large sample sizes with the use of administrative data to predict outcomes, including mortality, length of stay, and resource usage. However, it is poorly adapted for use in pediatric surgical patients because it is based on data for adult internal medicine patients. As shown in our results, the developed risk index has superior results in predicting mortality in the pediatric surgery population, because this new risk index is specific for pediatric patients and incorporates a significantly larger sample size in its design compared with the CCI.
In evaluating surgical outcomes, the choice of a measurement that accurately reflects quality of care or operative risk is widely debated. Mortality in pediatric surgery is a rare occurrence and its use as a risk-adjusted outcome is controversial because doing so neglects several important factors, such as quality of life, performance of activities of daily living, and the incidence of postoperative complications, which may entail significant long-term morbidity. Nonetheless, inpatient mortality remains an easily obtained measure of operative risk and is listed by the Leapfrog Group, the National Quality Forum, and the AHRQ as a valid indicator of quality of surgical care.13 Thus, risk of mortality is an appropriate measure on which to base our risk index. Nonfatal measures such as functional capacity and quality of life may also be more difficult to reliably measure in a pediatric population with age-limited communicative abilities and rapid developmental changes.14
At the same time, we must consider the limitations inherent to administrative databases that are present in our study. First, ICD-9-CM codes do not capture findings from physical examinations or laboratory tests, and consequently any index developed on the basis of administrative data is limited to diagnosis or procedure codes. In addition, entry of information into these databases is performed by individuals with a wide range of clinical experience, which can result in inconsistent coding. The effects of this inconsistency may be mitigated by the aggregation of codes into broader CCS categories because mapping algorithms typically omit diagnostic codes that are prone to present as surgical complications.15 Validation of the index using the OSHPD database, which contains “present on admission” variables, addresses the concern that surgical errors or hospital-acquired complications may be misclassified as preexisting comorbidities. Failure to account for conditions present on admission could potentially reward institutions with high complication rates with undeservedly low mortality rates after risk adjustment.15
These limitations notwithstanding, the design of administrative databases for processing claims and billing information imparts certain advantages. Low perioperative mortality is a challenge in developing a risk score for pediatric surgery. Pediatric operations also tend to be performed less frequently, relative to adult procedures. As a result, statistically significant differences in mortality rates across institutions may not be readily observed because the combination of low volume and low frequency limits the use of mortality as a comparative outcome.14,16 Low procedure volume at individual centers and, to a lesser degree, in clinical databases can, however, be offset by the large quantity of data entries present in administrative databases. Furthermore, administrative databases reflect a wide range of hospitals and patient demographic characteristics, in contrast to clinical databases that enroll patients on a voluntary basis at participating hospitals. Thus, the surgical index we have derived should be broadly applicable to all institutions and patient populations.
The index we have developed provides a novel way to measure a surgical patient’s level of severity and can be used by physicians to identify high-risk patients as well as to provide a measure of risk adjustment for surgical outcomes. Our data reveal that this risk score is a viable tool for risk stratification because it is a reliable predictor of inpatient mortality and performs better than its CCI adult counterpart in pediatric surgical patients.
The authors would like to thank the Robert Garrett Fund for Treatment of Children, which helped support the design and conduct of this study, including the collection, analysis, and interpretation of data, as well as preparation of the manuscript.
- Accepted October 18, 2012.
- Address correspondence to Fizan Abdullah, MD, PhD, Johns Hopkins University School of Medicine, 1800 Orleans St, Room 7337, Baltimore, MD 21287-0005. E-mail:
This work was presented as an oral presentation at the American Pediatric Surgical Association 2011 Annual Meeting; May 22–25, 2011; Palm Desert, CA.
Dr Rhee contributed to statistical analysis, data interpretation, and drafting and revising the manuscript; Dr Salazar analyzed and interpreted the data and contributed to the draft manuscript and revisions; Ms Zhang acquired a portion of the data and performed part of the statistical analysis and interpretation; Ms Jingyan Yang reviewed the data and conducted critical improvements to the analysis after they were suggested by Pediatrics reviewers; Ms Jessica Yang and Drs Papandria and Ortega contributed to the draft manuscript and revisions; Drs Goldin and Rangel contributed as critical reviewers of the manuscript; Dr Chrouser performed part of the analysis and interpretation of the data and made revisions to the manuscript; Dr Chang conceptualized and designed the study, acquired part of the data, performed the statistical analysis and interpretation, and contributed to drafting and revising the manuscript; and Dr Abdullah conceptualized and designed the study, analyzed and interpreted the data, and contributed to drafting and revising the manuscript.
FINANCIAL DISCLOSURE: The authors have no financial relationships relevant to this article to disclose.
FUNDING: No external funding.
- Lacour-Gayet F, Clarke D, Jacobs J, et al; Aristotle Committee
- Tepas JJ III, Leaphart CL, Celso BG, Tuten JD, Pieper P, Ramenofsky ML. Risk stratification simplified: the worst injury predicts mortality for the injured children. J Trauma. 2008;65(6):1258–1261; discussion 1261–1263
- HCUP Nationwide Inpatient Sample (NIS). Healthcare Cost and Utilization Project (HCUP)
- HCUP Kids' Inpatient Database (KID). Healthcare Cost and Utilization Project (HCUP). 1997, 2000, 2003, and 2006. Agency for Healthcare Research and Quality, Rockville, MD. Available at: www.hcup-us.ahrq.gov/kidoverview.jsp. Accessed January 4, 2013
- HCUP Clinical Classifications Software (CCS) for ICD-9-CM. Healthcare Cost and Utilization Project (HCUP)
- Romano PS, Roos LL, Jollis JG. Adapting a clinical comorbidity index for use with ICD-9-CM administrative data: differing perspectives. J Clin Epidemiol. 1993;46(10):1075–1079; discussion 1081–1090
- Welke KF, Diggs BS, Karamlou T, Ungerleider RM. Comparison of pediatric cardiac surgical mortality rates from national administrative data to contemporary clinical standards. Ann Thorac Surg. 2009;87(1):216–222; discussion 222–223
- Copyright © 2013 by the American Academy of Pediatrics