BACKGROUND AND OBJECTIVES: Variability in practice patterns and resource use in the emergency department (ED) can affect costs without affecting outcomes. ED quality measures have not included resource use in relation to ED outcomes and efficiency. Our objectives were to develop a tool for comprehensive physician feedback on practice patterns relative to peers and to study its impact on resource use, quality, and efficiency.
METHODS: We evaluated condition-specific resource use (laboratory tests; imaging; antibiotics, intravenous fluids, and ondansetron; admission) by physicians at 2 tertiary pediatric EDs for 4 common conditions (fever, head injury, respiratory illness, gastroenteritis). Resources used, ED length of stay (efficiency measure), and 72-hour return to ED (return rate [RR]) (balancing measure) were reported on scorecards with boxplots showing physicians their practice relative to peers. Quarterly scorecards were distributed for baseline (preintervention, July 2009–August 2010) and postintervention (September 2010–December 2011). Preintervention, postintervention, and trend analyses were performed.
RESULTS: In 51 450 patient visits (24 834 preintervention, 26 616 postintervention) seen by 96 physicians, we observed reduced postintervention use of abdominal and pelvic and head computed tomography scans, chest radiographs, intravenous antibiotics, and ondansetron (P < .01 for all). Hospital admissions decreased from 7.4% to 6.7% (P = .002), length of stay from 112 to 108 minutes (P < .001), and RR from 2.2% to 2.0%. Trends for use of laboratory tests and intravenous antibiotics showed significant reduction (P < .001 and P < .05, respectively); admission trends increased, and trends for use of computed tomography scans and plain abdominal radiographs showed no change.
CONCLUSIONS: Physician feedback on practice patterns relative to peers results in reduction in resource use for several common ED conditions without adversely affecting ED efficiency or quality of care.
- CT: computed tomography
- ED: emergency department
- EMR: electronic medical record
- IV: intravenous
- LOS: length of stay
There has been considerable focus on overall resource use and cost of care in our nation’s emergency departments (EDs), particularly for nonemergent conditions that often make up a large share of the case mix in the typical ED. Additionally, wide variation in practice and resource use that cannot be explained by patient-related factors has been demonstrated in both adults and children, including emergency medicine patients.1–12 Excessive use of resources in health care has not been found to improve quality or outcomes, but it does affect cost.11,13–16 The recent Choosing Wisely initiative encourages every specialty to consider reducing use of tests and procedures that are often unnecessary and sometimes can be harmful.17
Traditional ED quality measures have included measures of timeliness, such as length of stay, time to antibiotics, and boarder time; safety measures, such as errors and hand-washing; and measures of patient-centeredness, such as left-without-being-seen rates and patient satisfaction. Unexpected return to the ED soon after an initial ED visit is a measure of effectiveness of care that is monitored in most EDs. Recent studies have suggested the importance of measuring efficient use of resources for common pediatric ED conditions.18,19 Efforts to streamline resource use by standardizing practice through evidence-based guidelines have been ongoing, but significant degrees of variation in practice remain.4,5,19 Current ED quality measures do not include measures of efficient use of resources or how they correlate with ED outcomes. In recently proposed indicators of quality in pediatric EDs, none of the 62 indicators evaluated unnecessary tests.20 A lack of system incentives such as quality measures has been noted as one of the reasons for overdiagnosis that can result in a cascade of effects that do not always improve outcomes.21
Audit and feedback of physician practice has been used as a tool to improve resource use and care efficiency; when physicians are provided with data on how they practice, they may be more likely to consider practice changes.22–24 Although insurers have been profiling providers’ resource use patterns, such data are often not available to individual clinicians, usually are not acuity-adjusted, and do not include a combination of metrics that balance resource use and outcome measures. Balanced scorecards have been used in health care to provide a comprehensive picture of performance at the organizational level but have not been studied at the individual provider level.25–28 The impact of providing physicians with comprehensive and balanced feedback on their practice patterns relative to peers in the same setting is not known. In this quality improvement initiative, our objectives were to develop a tool for comprehensive feedback to ED physicians on their practice patterns relative to peers and to evaluate the impact of such a physician feedback tool on ED resource use and associated ED quality measures, ED length of stay (efficiency measure), and return to ED (balancing measure).
Setting and Scorecard Development
The study was conducted at a large pediatric health care system’s 2 tertiary-level EDs with a combined annual census of >150 000 pediatric visits. Both sites are staffed by pediatric emergency medicine physicians and urgent care pediatricians and have electronic medical records (EMRs). We have previously documented large variation in practice in the ED even after adjusting for patient (eg, acuity, age) and temporal factors.4 We used this persistent variation in practice to highlight individual physician performance relative to the range of practice of their peers. We did this by creating a comprehensive and balanced scorecard showing physicians their resource use and ED quality metrics, relative to their peers, for 4 common ED conditions. These conditions were chosen because they are some of the most common presentations to pediatric EDs. Institutional guidelines for the management of fever have been in place since 2004 and for the 2 most common respiratory illnesses (asthma and bronchiolitis) since 2006. Guidelines for the management of mild head injury were implemented in mid-2011, and there are no institutional guidelines for the management of children with gastroenteritis-like illness. Although guidelines can influence practice patterns, such change takes time, especially if not associated with measurement of compliance with guideline recommendations. During our study period, there were no new efforts to reinforce compliance to guidelines.
Scorecard Inclusion Criteria
Four common ED conditions were included in the scorecard. To avoid bias based on final diagnosis, we based inclusion criteria on patient presenting complaints. Professional coders classified patient chief complaints into “admitting” diagnoses using International Classification of Diseases, Ninth Revision, Clinical Modification codes. The 4 conditions and the corresponding chief complaint International Classification of Diseases, Ninth Revision, Clinical Modification codes were Fever unspecified (780.60) (age >2 months only because infants <2 months old receive routine screening tests and are often admitted); Head Injury unspecified (959.01) (age >3 months because institutional guidelines apply only to infants >3 months old); Gastroenteritis-like symptoms: Vomiting alone (787.03), Persistent Vomiting (536.2), Diarrhea (787.91), Dehydration (276.51); and Respiratory illness: Other Dyspnea and Respiratory Abnormality (786.09), Cough (786.2), Wheezing (786.07), Unspecified Asthma (493.90), with exacerbation (493.92), with status asthmaticus (493.91).
Acuity Adjustment and Peer Comparisons
Studies on ED provider feedback have underscored the importance of adjusting for acuity, diagnosis, and patient outcomes, and they recommend making comparisons of resource use rates and outcomes for individual physicians against peer-based norms.24 In our study, we achieved acuity adjustment by including only patients in Emergency Severity Index category 3 (midacuity).29,30 The potential for practice variation is highest for these midacuity patients, and the 4 conditions included in the scorecard represent nearly 40% of all midacuity patients seen in our ED. Peer group performance was shown on the scorecard as a standard boxplot with 25th, 50th, and 75th percentiles, with dashes for 1.5 interquartile ranges. A bold diamond indicated an individual physician’s performance against the peer group’s performance shown on the boxplot (Fig 1).
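The peer comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' scorecard code: the function names, the use of linear-interpolation quartiles, and the rule that whiskers stop at the most extreme peer value within 1.5 interquartile ranges of the box are all assumptions, and the peer rates in the usage note are invented.

```python
from statistics import quantiles


def peer_boxplot_summary(peer_rates):
    """Summarize peer rates as a standard boxplot would (needs >= 2 peers)."""
    data = sorted(peer_rates)
    # 25th, 50th, and 75th percentiles by linear interpolation.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme peer values still inside the fences.
    inside = [r for r in data if lower_fence <= r <= upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
    }


def position_relative_to_peers(physician_rate, summary):
    """Describe where one physician's rate (the diamond) falls on the boxplot."""
    if physician_rate < summary["lower_fence"]:
        return "outlier_low"
    if physician_rate > summary["upper_fence"]:
        return "outlier_high"
    if physician_rate < summary["q1"]:
        return "below_q1"
    if physician_rate > summary["q3"]:
        return "above_q3"
    return "within_iqr"
```

For example, with hypothetical peer rates of 10, 20, 30, 40, 50, and 95 (in percent), a physician at 95 would be labeled "outlier_high" and a physician at 30 would be labeled "within_iqr".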
Quality Measures Included in Scorecard
Three measures of ED quality of care were included in the scorecard: resource use, ED length of stay, and 72-hour return rate (Table 1). Resource use, a process measure, was monitored specific to each condition. Additionally, because EDs account for >50% of hospital admissions, and arguably this is one of the most expensive resource use decisions made by an ED physician and is considered discretionary, admission rate to the hospital was measured for each of the 4 conditions.31 ED length of stay (LOS; measured as time from physician assuming care of the patient until exit from the ED) was included as a timeliness measure. Rate of 72-hour return to the ED for the same condition was included as a balancing quality measure. The scorecard also included the total number of patients seen by each physician during the reporting period.
Patients who left without being seen were excluded. Performance data of physicians who saw <10 patients during a reporting period in any condition were not included in the peer group (boxplot) calculation for that condition, to prevent proportions with low denominators from unduly influencing percentiles. Furthermore, physicians who saw <25 patients during either the preintervention or postintervention phase were excluded from analysis because these physicians were not regular ED physicians, or they left the practice at some point, and therefore did not meaningfully contribute to changes in practice patterns.
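The two volume thresholds above can be expressed as simple filters. The sketch below is a hypothetical illustration: the function names and the shape of the input (a mapping from a blinded physician code to event and visit counts) are assumptions, and only the cutoffs of 10 and 25 come from the text.

```python
MIN_VISITS_PER_CONDITION = 10  # peer-group cutoff stated in the text
MIN_VISITS_PER_PHASE = 25      # analysis cutoff stated in the text


def in_peer_group(visits_in_condition):
    """A physician contributes to the boxplot for a condition only with >= 10 visits."""
    return visits_in_condition >= MIN_VISITS_PER_CONDITION


def in_analysis(pre_visits, post_visits):
    """A physician is analyzed only with >= 25 visits in both phases."""
    return pre_visits >= MIN_VISITS_PER_PHASE and post_visits >= MIN_VISITS_PER_PHASE


def peer_group_rates(counts):
    """counts maps physician code -> (events, visits) for one condition and period.

    Returns the per-physician proportions that enter the percentile calculation.
    """
    return {
        code: events / visits
        for code, (events, visits) in counts.items()
        if in_peer_group(visits)
    }
```

With hypothetical counts of {"A": (3, 30), "B": (1, 9), "C": (0, 10)}, physician B is dropped from the peer group because the denominator of 9 is below the cutoff, which is the low-denominator problem the exclusion is meant to prevent.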
Physician attribution was based on the attending physician assigned to the patient during the ED visit. When there was transfer of care during the ED visit, resource use decisions were assigned to the first physician, whereas the disposition decision and 72-hour return were assigned to the second (dispositioning) physician.
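The attribution rule for visits with a transfer of care can be written as a small function. This is a sketch with hypothetical names; treating the last physician in the list as the dispositioning physician when more than two are involved is an assumption, since the text describes only a first and a second physician.

```python
def attribute_visit(physician_codes_in_order):
    """Assign each scorecard measure of one visit to a physician.

    physician_codes_in_order lists the attending physicians' blinded codes
    in the order they assumed care; a single entry means no transfer of care.
    """
    if not physician_codes_in_order:
        raise ValueError("a visit needs at least one attending physician")
    first = physician_codes_in_order[0]
    # Assumption: the last physician is the dispositioning physician.
    dispositioning = physician_codes_in_order[-1]
    return {
        "resource_use": first,
        "disposition": dispositioning,
        "return_72h": dispositioning,
    }
```

For a visit handled by a single physician, all three measures map to that physician; for a visit started by "P1" and dispositioned by "P2", resource use maps to "P1" and the other two measures to "P2".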
The intervention (feedback via scorecards) began on September 1, 2010. Feedback was provided through 3 related interventions. The first was quarterly distribution of scorecards electronically. The first scorecard was for the reporting period July 2009 to June 2010, and subsequent scorecards were distributed quarterly, with each reporting quarter representing a previous 12-month rolling average. The second intervention was presentation of scorecards for educational purposes at practice meetings. Third, physician directors of both EDs regularly reviewed with physicians their performance on a 1-on-1 basis, particularly with physicians who had outlier practice patterns. The 14-month period from July 2009 to August 2010 (just before the first scorecard was distributed) was the preintervention phase, and the 16-month period from September 2010 to December 2011 was the postintervention phase.
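The rolling reporting periods can be enumerated with a short helper. This is a sketch under assumptions: the text gives only the first window (July 2009 to June 2010) and says later scorecards were quarterly 12-month rolling averages, so the 3-month step and the exact boundaries of later windows are inferred, and the function names are hypothetical.

```python
from datetime import date


def add_months(first_of_month, months):
    """Shift a first-of-month date by a whole number of months."""
    index = first_of_month.year * 12 + (first_of_month.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def reporting_windows(first_start, last_month, window_months=12, step_months=3):
    """List (start_month, end_month) pairs, both inclusive, as first-of-month dates."""
    windows = []
    start = first_start
    while True:
        end = add_months(start, window_months - 1)
        if end > last_month:
            break
        windows.append((start, end))
        start = add_months(start, step_months)
    return windows


# Study period in the text: July 2009 through December 2011.
WINDOWS = reporting_windows(date(2009, 7, 1), date(2011, 12, 1))
```

Under these assumptions the first window runs from July 2009 to June 2010, as stated in the text, and the last one that fits inside the study period runs from January 2011 to December 2011.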
Data were obtained from EMRs and administrative data that are stored in an institutional data warehouse. The EMRs have electronic signature capture that allows for accurate physician attribution; computerized physician order entry ensured that resource use was accurately assigned to the ordering physician. Data were aggregated at the level of individual providers, who were identified by blinded codes not accessible to investigators.
To assess the effect of this scorecard intervention, we conceived 2 complementary statistical analyses: compare the aggregate group-level difference in resource use before and after the intervention, and compare physician-level serial trends before and after the intervention to evaluate longer-term effects of the intervention. The former analysis allows comparison of aggregate group-level averages for 11 resources and outcomes of interest before and after the intervention. For this analysis, we used χ² tests (for categorical data), 2-sample t tests (for normally distributed continuous data), and Wilcoxon rank-sum tests (for skewed continuous data, eg, LOS). The latter analysis is a secondary analysis of findings to assess the trends both before and after intervention and to determine whether there are statistical differences in these 2 trends. For this analysis, we used piecewise (generalized) linear mixed models to model the scorecard outcomes as a function of each resource or outcome on the scorecard while accounting for physician-level clustering and effect of time. Each month, each physician was a repeated measure, and each physician started at his or her own baseline. A Poisson distribution was used to model responses that are counts. The effect of the intervention was measured by the change in the slope of the trend lines before and after the intervention, showing change in monthly trends over time (as opposed to change in practice at a single point in time), reflecting sustained effect. Results were noted in terms of marginal effects (eg, the change in the probability that a patient receives a test before and after intervention). We performed all statistical analyses using SAS (SAS Institute, Inc, Cary, NC) statistical software.32
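The idea of measuring the intervention effect as a change in slope can be illustrated with the fixed-effect part of a piecewise linear trend. The sketch below is not the authors' SAS model: it omits the physician-level random effects and the Poisson distribution, assumes a continuous trend with a single knot at the start of the intervention, and uses invented coefficients purely for illustration.

```python
INTERVENTION_MONTH = 14  # months from July 2009 (index 0) to September 2010


def piecewise_terms(month_index, knot=INTERVENTION_MONTH):
    """Return (intercept, time, time_since_knot) for one monthly observation.

    The third term is zero before the knot, so its coefficient is the
    change in monthly slope after the intervention.
    """
    return (1.0, float(month_index), float(max(0, month_index - knot)))


def linear_predictor(month_index, beta, knot=INTERVENTION_MONTH):
    """Fixed-effect trend value for coefficients beta = (b0, b1, b2)."""
    return sum(b * x for b, x in zip(beta, piecewise_terms(month_index, knot)))


def slopes(beta):
    """Monthly slope before and after the knot: (b1, b1 + b2)."""
    _, b1, b2 = beta
    return b1, b1 + b2
```

With hypothetical coefficients (2.0, 0.5, -0.75), the trend rises by 0.5 per month before the knot and falls by 0.25 per month afterward. This is the kind of reversal that a test of the slope-change coefficient is designed to detect.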
The institutional review board exempted this study from review because only aggregate data were used, with no physician or patient identifiers and no patient-level intervention.
During the study period from July 2009 to December 2011, a total of 336 294 patients were seen in the EDs, of whom 128 691 were in Emergency Severity Index 3 acuity. Of these, 51 450 patients met inclusion criteria (preintervention, 24 834; postintervention, 26 616). Ninety-six physicians saw these patients, with a mean of 536 patients per physician. Figure 2 shows a typical scorecard of a physician during the study period for all 4 conditions.
Table 2 shows the overall preintervention–postintervention results for the various measures reported on scorecards. Resource categories shown reflect only those that were included in the quality improvement initiative and reported to providers via the scorecard. Overall, statistically significant reduction was noted in use of abdominal and pelvic computed tomography (CT) scans, head CT scans, chest radiographs, intravenous (IV) antibiotics, and IV ondansetron (P < .01 for all). Hospital admission rates decreased from 7.4% to 6.7% (P = .002). The median ED LOS for the 4 included conditions decreased from 112 minutes to 108 minutes (P < .001). The 72-hour return rate changed from 2.2% to 2.0%; though not statistically significant, the rate did not increase as use of tests and therapies during the initial visit decreased.
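As a rough consistency check on the admission result, the preintervention and postintervention proportions can be compared with a 2 × 2 χ² test. The sketch below rests on assumptions: the admission counts are back-calculated from the rounded percentages and the visit totals, so they are approximate, and no continuity correction is applied because the text does not say whether one was used.

```python
import math


def chi_square_2x2(events_a, total_a, events_b, total_b):
    """Pearson chi-square (1 df, no continuity correction) for two proportions."""
    a, b = events_a, total_a - events_a
    c, d = events_b, total_b - events_b
    n = total_a + total_b
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


PRE_VISITS, POST_VISITS = 24834, 26616
# Approximate counts reconstructed from the rounded rates of 7.4% and 6.7%.
pre_admissions = round(0.074 * PRE_VISITS)
post_admissions = round(0.067 * POST_VISITS)

chi2, p_value = chi_square_2x2(pre_admissions, PRE_VISITS, post_admissions, POST_VISITS)
```

Under these assumptions the P value comes out at roughly .002, in line with the reported result; the exact figure depends on the true admission counts, which are not given here.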
Table 3 summarizes the trends over time for significant findings in Table 2. Statistically significant decrease in trends over time was noted for use of laboratory tests (P < .001) and for use of IV antibiotics (P = .02); admission trends showed an increase (P = .02). Other resource categories did not show statistically significant change in trends over time during the study period. Figure 3 shows trends over time as monthly estimates of rates of laboratory tests performed, IV antibiotic use, hospital admissions, IV ondansetron use, abdominal CT scans, and abdominal radiographs performed before and after the introduction of physician-level scorecards.
Our study shows reduction in resource use for several commonly seen conditions in the pediatric ED after physicians were provided with feedback on practice patterns, including resource use and quality metrics, relative to their peers. There was a small improvement in efficiency (LOS), and reduced resource use did not adversely affect quality of care (return rate). However, results were not consistent, and some trends predated the intervention.
Traditional ED quality measures include operational metrics such as turnaround time and LOS, timeliness measures such as time to antibiotics in newborns or patients with sickle cell disease and fever, and outcome measures such as patient satisfaction and left-without-being-seen rates.33–36 Many of these measures do not reflect physician-related clinical decision-making in the care provided to the patient and are often not directly controlled by ED physicians. Stang et al20 recently developed evidence-based quality indicators for high-acuity pediatric conditions that can be applied to ED settings where children are seen. Most of the measures they suggest are efficiency measures.
Our primary goal was to study whether providing feedback to ED physicians, on factors that they control (testing and treatment decisions for common midacuity presentations for which practice guidelines are available), can affect resource use. We have previously shown wide variation in physician use of common resources in our ED; higher resource use was associated with greater LOS but did not reduce return to ED.4 In a recent study, Kharbanda et al14 observed variation in quality measures for patients presenting to 21 US pediatric EDs with common conditions and noted that higher costs (reflective of overall resource use) were not associated with lower hospitalization or ED revisit rates. Coon et al21 noted that practice variation may be an important indicator for overdiagnosis, which can lead to a number of downstream effects that do not improve patient outcomes but can cause harm. In our current study, we took advantage of variation in practice at the local level to highlight outlier practice patterns in both directions of high and low resource use. Such benchmarking of providers relative to peers can highlight opportunities for improvement; when it is done at the local level, it is likely to be more meaningful to clinicians practicing in the same setting. A unique feature of our scorecard is that it is balanced and comprehensive, not only providing data on resource use but also providing return rates as a balancing measure and LOS as an efficiency measure. We also adjusted for both acuity and patient complaint, and we used common ED conditions for which institutional guidelines were available to inform practice. We used 2 separate methods of analysis: an unadjusted preintervention–postintervention analysis to evaluate aggregate group-level differences in practice patterns and a rigorous, adjusted model to evaluate changes in true trends at the individual physician level. 
Whereas the former analysis can indicate whether overall resource use differed in the 2 time periods, the latter analysis can identify ongoing sustained changes in trends of practice. The latter analysis also controls for preexisting trends (eg, if resource use was already decreasing in the preintervention period) and is more robust in adjusting for physician-level random effects and repeated measures over time.
After providing feedback for just over 15 months, we observed overall reduction in the use of several resources by using an aggregate comparison between the 2 time periods. Trend analysis showed similar reduction in resource use over time for some but not all of these categories. This reduction may have been caused by several factors. Some trends reversed in the preintervention and postintervention period; for example, trends for laboratories showed increasing use before the intervention and decreasing use after (Fig 3). Although overall use of laboratory tests did not change during the study, the change in trends reflects statistically significant change in practice (reduction in use) after the intervention (Table 3). Trends such as use of IV ondansetron and abdominal and pelvic CT scans predated the intervention (Fig 3); some of these may have resulted from overall awareness of issues related to costs of IV medication, radiation exposure of CT scans, and the like. Reduction in use of CT scans for minor head injury during 2005 to 2009 was noted in a recent study by Mannix et al.11 Our secondary analysis of trends helps elucidate long-term changes in behavior at the individual physician level and sustainability of practice changes for resources whose use is already changing. We conjecture that our intervention kept the momentum of change moving in the desired direction for most but not all resource use. For example, for admission to hospital we noted a true discrepancy between the statistically significant decrease in overall admission rates in the postintervention period and the statistically significant increase in trends in admission rates. Although it is conceivable that this increase in admission rate trends may be related to lower resource use, the decrease in overall admission rates does not support that possibility. 
Use of abdominal radiographs was the only resource category that showed an actual increase in use after the intervention, with a somewhat increasing trend in the postintervention period (Fig 3). Although neither the increase in overall use of abdominal radiographs nor the increase in trends is statistically significant, this change may be related to a reduction in use of abdominal CT scans, with physicians replacing 1 imaging modality with another.
Use of IV antibiotics showed a significant decrease after the intervention, both in overall rate and in significant change in trends (Tables 2 and 3, Fig 3). In these common, nonsevere pediatric conditions, use of antibiotics is often not indicated, and use of IV antibiotics may be a cautionary approach of some providers and therefore amenable to change in practice. Decreased use of broad-spectrum IV antibiotics would have additional downstream benefits including reducing costs, side effects, and resistance.
A limitation of our study is that it is a single-center study and may not be generalizable to other settings. Given the increasing use of EMRs nationally, and of standard systems of triage acuity assignment that can be used for severity adjustment, along with nearly universal tracking of return rates and LOS in most EDs, we think scorecards similar to ours can be easily adapted for use in many EDs. Additionally, trends for resource use in some categories predated the intervention, making interpretation difficult. We did not make any other changes in the ED during the study period that would affect resource use. The EDs did get a new EMR in mid-July 2011; the only foreseeable impact of that change pertaining to this study would be an increase in LOS as providers learned a new system. We saw an increase in overall LOS for a few months after implementation of the new EMR (data available); however, over the entire study period, the LOS specific to the patients included in the scorecards showed a decrease.
The modest and inconsistent effects of the scorecards in our study underscore the difficulty of changing physician behavior. It remains to be seen how other factors, such as monetary incentives and medicolegal issues, none of which changed during this study, can influence resource use in addition to comprehensive, balanced feedback to providers.
Our study shows reductions in resource use for several commonly seen conditions in the pediatric ED after we provided ED physicians with feedback on practice patterns, including resource use and quality metrics, relative to peers. Reduced resource use did not adversely affect ED efficiency (LOS) or quality of care (return rate).
Feedback on practice patterns relative to peers may also influence provider practice for patients outside the middle acuity range and beyond the 4 conditions we studied, thus having a broader impact. Also, feedback to physicians on clinical factors that they can control, and for which there is evidence to base practice on, may be more meaningful than quality measures that focus primarily on operational metrics. Such performance measurement can be used for activities such as the Joint Commission’s Ongoing Professional Practice Evaluation. With additional refinement, comprehensive, severity-adjusted performance measures that encompass both quality and resource use can be used to measure value at the individual provider level and to identify high-value providers.
We thank Robert Massey and Seton McRae for data acquisition and management.
- Accepted March 6, 2015.
- Address correspondence to Shabnam Jain, MD, MPH, Emory University, Children’s Healthcare of Atlanta, Pediatric Emergency Medicine, 1645 Tullie Circle NE, Atlanta, GA 30329. E-mail:
Dr Jain conceptualized the study, designed the scorecards, interpreted the data, and drafted the initial manuscript; Dr Frank assisted with scorecard implementation and data review and revised the manuscript; Ms McCormick assisted with data analysis and revised the manuscript; Mr Wu worked on data analysis, generated the summary tables and graphs, and revised the manuscript; Dr Johnson analyzed the data and revised the manuscript; and all authors approved the final manuscript as submitted.
FINANCIAL DISCLOSURE: The authors have indicated they have no financial relationships relevant to this article to disclose.
FUNDING: Partially funded by the Children’s Healthcare of Atlanta Research Fund.
POTENTIAL CONFLICT OF INTEREST: The authors have indicated they have no potential conflicts of interest to disclose.
- Goldman RD, Scolnik D, Chauvin-Kimoff L, et al.
- Hampers LC, Faries SG
- Lantos JD, Meadow W
- Wennberg JE. Variation in use of Medicare services in regions and selected academic medical centers: is more better? 2005. Available at: http://www.commonwealthfund.org/publications/fund-reports/2005/dec/variation-in-use-of-medicare-services-among-regions-and-selected-academic-medical-centers--is-more-b. Accessed October 28, 2010
- Florin TA, French B, Zorc JJ, Alpern ER, Shah SS
- Knapp JF, Simon SD, Sharma V
- Choosing Wisely: an initiative of the ABIM Foundation. 2013. Available at: www.choosingwisely.org. Accessed March 15, 2014
- Guttmann A, Razzaq A, Lindsay P, Zagorski B, Anderson GM. Development of measures of the quality of emergency department care for children using a structured panel process. Pediatrics. 2006;118(1):114–123
- Knapp JF, Simon SD, Sharma V. Quality of care for common pediatric respiratory illnesses in United States emergency departments: analysis of 2005 National Hospital Ambulatory Medical Care Survey Data. Pediatrics. 2008;122(6):1165–1170
- Stang AS, Straus SE, Crotts J, Johnson DW, Guttmann A
- Coon ER, Quinonez RA, Moyer VA, Schroeder AR
- Jamtvedt G, Young JM, Kristoffersen DT, O’Brien MA, Oxman AD
- Meliones JN, Alton M, Mericle J, et al.
- Gilboy N, Tanabe T, Travers D, Rosenau AM
- Pitts S, Niska R, Xu J, Burt C.
- SAS Institute Inc
- Welch SJ, Asplin BR, Stone-Griffith S, Davidson SJ, Augustine J, Schuur J, Emergency Department Benchmarking Alliance
- Welch SJ, Stone-Griffith S, Asplin B, Davidson SJ, Augustine J, Schuur JD, Second Performance Measures and Benchmarking Summit, Emergency Department Benchmarking Alliance
- Copyright © 2015 by the American Academy of Pediatrics