SPECIAL ARTICLE
a Section of Neonatology, Texas Children's Hospital
f Section of Health Services Research, Department of Medicine, Baylor College of Medicine, Houston, Texas
b Houston Center for Quality of Care and Utilization Studies, Houston Veterans Affairs Medical Center, Houston, Texas
c Harvard Neonatal Perinatal Medicine Program, Beth Israel Deaconess Medical Center, Boston, Massachusetts
d California Perinatal Quality Care Collaborative, Palo Alto, California
e Perinatal Epidemiology and Health Outcomes Research Unit, Division of Neonatology, Stanford University School of Medicine, Lucile Packard Children's Hospital, Palo Alto, California
| ABSTRACT |
|---|
Key Words: pay-for-performance programs, quality improvement
Abbreviations: QI, quality improvement; IOM, Institute of Medicine; OECD, Organisation for Economic Co-operation and Development; CPQCC, California Perinatal Quality Care Collaborative
Large deficits in quality of care remain more than half a decade after the Institute of Medicine (IOM) provided a blueprint for improvement.1 In neonatology, there is persistent unexplained variation in health care delivery and outcomes.2–8 To date, quality improvement (QI) efforts, either locally or as part of collaborative efforts, have had mixed results.9–17 The broad-based improvement envisioned by health care payers and the IOM has not occurred. (In this article, we use the term "payer" to mean the broad group of employers, purchasers, insurers, and health care plans that pay for health care services directly or indirectly.)
One factor that is receiving increasing attention is a reimbursement system that may actively discourage QI.1 For example, in December 2003, the New York Times described how Intermountain Health Care, a network of 21 hospitals in Utah and Idaho, was punished financially by Medicare for saving lives and cutting costs.18 Reimbursement decreased because better care resulted in lower complication rates. In health care, the financial benefits of QI often accrue primarily to payers and patients and not to providers. Pay-for-performance represents an attempt to correct this imbalance and to provide incentives for quality to providers.19
By paying providers according to the quality of care they deliver, pay-for-performance schemes attempt to align the interests of health care payers, patients, and providers, ensuring that providers act in the other parties' best interest.20,21 Pay-for-performance initiatives provide financial motivation but may also introduce competitive motivational incentives by comparing the performance of providers against each other or against a standard of care (benchmarking). Pay-for-performance programs thus hold promise for QI by generating both intrinsic (motivation) and extrinsic (reputation and financial rewards) performance incentives.22–24 Although relatively little evidence for their effectiveness has been accumulated to date, 2 comprehensive reviews of the topic found moderate benefits of pay-for-performance and drew cautiously optimistic conclusions about its potential to improve quality of care.25,26 In one review, 14 of 17 studies showed partial or positive effects on quality of care.26 However, it should be noted that, in some studies, improvement owed more to improved documentation than to actual changes in care delivery.27–29 Only 3 studies were carried out in the pediatric population, and all targeted preventive care services in the general pediatric health care delivery setting.26
Despite some ambiguity in early evaluations, the IOM has endorsed ongoing experimentation with pay-for-performance,30 and payers are enthusiastic about its potential to improve the value of health care purchasing.31 There are now >100 active pay-for-performance projects throughout the country.19,24 In addition, legislative initiatives aim to incorporate incentives for quality into Medicare's payment systems.32 Although to our knowledge pay-for-performance approaches have not been applied in the NICU, we think that the NICU is a prime target for payers because of the high cost, available databases, relative strength of research evidence, and, compared with adult settings, low incidence of comorbidities. The latter makes it easier to attribute performance to providers, rather than to patients.
Unfortunately, many pay-for-performance projects are implemented in an uncontrolled manner, making it unclear whether the benefits are truly attributable to the financial incentives.26 Rigorous research designs and methods are necessary to determine whether performance-based payment arrangements result in meaningful QI and are cost-effective. For example, 2 of us (Drs Petersen and Profit) are conducting a prospective, multicenter, cluster-randomized, controlled trial to study the effects of the pay-for-performance approach on quality of care and hypertension control in adults (L.A.P., L. D. Woodard, MD, T. Urech, MPH, et al, unpublished data, 2007). That trial should add to the body of literature on pay-for-performance and shed light on the benefits and costs of different choices in incentive design. It uses physician- and group-level financial incentives, plus audit and feedback, to improve quality of care. More such trials need to be designed to evaluate the effectiveness of pay-for-performance in a variety of care settings and for a spectrum of clinical situations. Our recommendations for implementing quality assessment and financial incentives for future pay-for-performance initiatives in neonatology are described below and summarized in Table 1.
| MEASURING QUALITY |
|---|
Framework for Measuring Quality
Generally, quality of care is defined within a multidimensional framework. For example, the IOM has suggested that quality of care is a reflection of care in the domains of patient safety, effectiveness, efficiency, patient-centeredness, timeliness, and equity.1 The dimensions of the quality of health care delivered by a NICU may also be described by its physical and organizational composition (structure of care), by the clinical care interactions between patients and providers (process of care), and by patient outcomes, in terms of morbidity, death, and caregiver satisfaction (outcomes of care).33 Measures of structure, process, and outcome have distinct advantages and disadvantages. For example, structural measures (eg, the availability of electronic health records) are easy to obtain and measure but are theoretically distant from the ultimate goal of improving health outcomes. Process measures may be more sensitive to differences in quality of care but require that there be good evidence for a direct link between the process and clinical outcomes. Outcome measures are perhaps of greatest intrinsic value, because they reflect directly what patients and providers truly care about, but they may occur too infrequently to provide statistically meaningful results (eg, death)34 or may occur so far in the future (eg, developmental delay) that data collection efforts become impractical or burdensome.
Ideally, we think that an assessment of quality should incorporate the full range of quality-of-care dimensions, with indicators that are valid, reliable, feasible to collect, and relevant to important domains of care. Quality assessment is a dynamic process and, especially within pay-for-performance schemes, should reinforce providers' control over their performance. Accordingly, indicators should be not only theoretically sound but also actionable; that is, indicators should be responsive to change within a timely period and should be unambiguous with respect to interpretation. Importantly, measures must be standardized and adjusted for clinical risk, and data collection must be adequately simplified to ensure uniformity of definitions.35
Figure 1 presents a proposed framework for neonatal quality measurement. Pay-for-performance programs attempt to measure and reward the quality of the products of the health care delivery system. The outcomes of the health care delivery system are influenced by individual and societal determinants of health,36 as well as the design of the health system.37 The combination of the structure/process/outcome framework of quality with that of the IOM results in a quality-of-care matrix that forms an inclusive framework for measuring quality. In our opinion, this could address some of the shortfalls of focusing on individual measures. Although identifying the specific indicators for each of these domains of quality might prove challenging, this framework provides a guide to practitioners and researchers in an ongoing effort to refine quality measurement. Evidence-based expert consensus38,39 could be used to fill the matrix and to generate measures for quality-monitoring or pay-for-performance initiatives.
OECD Guidelines for Constructing Composite Indicators
Briefly, the OECD suggests a 10-step building process48 (Table 2). At each step, researchers must choose from several available options, depending on the underlying data and the purpose of the composite indicator.
Step 2 is measure selection. Importance, accuracy, and feasibility guide the selection of quality-of-care indicators. The medical literature and expert opinion can provide guidance.
Step 3 is initial data analysis. The underlying nature of the data must be explored and appropriate transformations made with regard to directionality of measures, outliers, ceiling effects, and nature of distributions.
Step 4 is imputation of missing data. The impact of missing data on the performance measurement must be examined, because the data may contain significant bias if providers avoid reporting poor outcomes.
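One simple way to examine that impact, sketched here under our own assumptions rather than taken from the OECD guidelines, is to bound an outcome rate by the 2 extreme cases: none of the missing records was an event, or every missing record was an event. The function name `rate_bounds` and the counts are hypothetical.

```python
def rate_bounds(events, non_events, missing):
    """Return (observed, lower, upper) event rates for one NICU.

    observed: rate among records with a reported outcome
    lower:    rate if no missing record was an event
    upper:    rate if every missing record was an event
    """
    reported = events + non_events
    total = reported + missing
    if reported == 0:
        raise ValueError("at least one reported outcome is required")
    observed = events / reported
    lower = events / total
    upper = (events + missing) / total
    return observed, lower, upper


# Hypothetical unit: 10 events, 80 non-events, 10 records with no outcome.
observed, lower, upper = rate_bounds(10, 80, 10)
print(f"observed {observed:.3f}, bounds {lower:.3f} to {upper:.3f}")
```

A wide gap between the bounds suggests that the ranking of a unit could change depending on how the missing records are handled.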
Step 5 is normalization of data. For linkage of measures, the measures must be transformed into a common unit of measurement. There are many options for normalization, including ranking, standardization, and distance to a reference.
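The 3 normalization options named above might be sketched as follows. This is an illustration only; the function names and the sample values are hypothetical, and the population standard deviation is our choice for the standardization example.

```python
from statistics import mean, pstdev


def rank_normalize(values):
    """Rank each value (1 = lowest); tied values share the average rank."""
    ordered = sorted(values)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1   # rank of the first tied value
        count = ordered.count(v)       # number of tied values
        ranks.append(first + (count - 1) / 2)
    return ranks


def standardize(values):
    """Convert values to z scores (mean 0, standard deviation 1)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def distance_to_reference(values, reference):
    """Express each value as a ratio to a reference (eg, a benchmark rate)."""
    if reference == 0:
        raise ValueError("reference must be nonzero")
    return [v / reference for v in values]


# Hypothetical rates of a desirable care process in 4 NICUs.
rates = [0.60, 0.90, 0.75, 0.90]
print(rank_normalize(rates))
print(standardize(rates))
print(distance_to_reference(rates, 0.90))
```

Ranking discards the size of the gaps between units, whereas standardization and distance to a reference preserve it; which property matters depends on the purpose of the composite indicator.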
Step 6 is weighting and aggregation. This is a crucial step in the development of a composite indicator, because the weights attributed to different measures and the way the measures are aggregated can significantly influence measured performance. The 2 basic approaches used to arrive at subindicator weights are statistical (eg, principal-component analysis, factor analysis, multivariate techniques, and others) and participatory (variations on elicitation of expert opinion) methods. It is important to realize that equal weighting does not imply an absence of weights, because with this approach each subindicator is given a weight of 1. The benefit of the statistical approach is its relative fairness and freedom from bias, because weights are derived on purely statistical grounds. Its disadvantage is that the weights may not correspond to real-world common sense.
In the aggregation phase, the subindicators are aggregated into a composite indicator. The primary decision involved in choosing an aggregation method is whether NICUs should be allowed to compensate for poor performance in one subindicator with superior performance in others. There are 3 principal choices, namely, full compensation (linear additive aggregation), partial compensation (geometric or multiplicative aggregation), and no compensation (noncompensatory methods). Each of these choices has benefits and drawbacks.
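The difference between the 3 choices can be illustrated with a minimal sketch. The function names and scores are hypothetical, subindicators are assumed to be already normalized to a 0-to-1 scale on which higher is better, and the worst-score rule is only a simple stand-in for noncompensatory methods, which in practice are more elaborate multicriteria procedures.

```python
import math


def linear_aggregate(scores, weights):
    """Weighted arithmetic mean: full compensation between subindicators."""
    total = sum(weights)
    return sum(w * s for s, w in zip(scores, weights)) / total


def geometric_aggregate(scores, weights):
    """Weighted geometric mean: partial compensation; low scores weigh heavily."""
    if any(s <= 0 for s in scores):
        raise ValueError("geometric aggregation needs strictly positive scores")
    total = sum(weights)
    return math.exp(sum(w * math.log(s) for s, w in zip(scores, weights)) / total)


def worst_score(scores):
    """No compensation (simplified): the weakest subindicator sets the score."""
    return min(scores)


# Two hypothetical NICUs with the same average but different profiles.
balanced = [0.60, 0.60, 0.60]
uneven = [0.95, 0.75, 0.10]
equal_weights = [1, 1, 1]  # equal weighting still assigns a weight of 1 to each

for name, scores in (("balanced", balanced), ("uneven", uneven)):
    print(
        name,
        round(linear_aggregate(scores, equal_weights), 3),
        round(geometric_aggregate(scores, equal_weights), 3),
        worst_score(scores),
    )
```

Under linear aggregation the 2 units score the same; under the other 2 rules the unit with one very poor subindicator scores lower, which is the practical meaning of limiting compensation.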
Step 7 is uncertainty and sensitivity analysis. There are 2 primary sources of error in performance measurement, that is, the effect of the error contained within the underlying data (uncertainty analysis) and the impact of different choices in constructing the composite indicator (sensitivity analysis). These error sources can be combined and their effect displayed in a higher-order Monte Carlo experiment.
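A minimal sketch of such an experiment follows. It is an assumption-laden illustration, not the procedure specified by the OECD: the names, the normal error model, the weight range, and the 2 candidate aggregation rules are all hypothetical choices made for this example.

```python
import math
import random


def composite(scores, weights, method):
    """Aggregate subindicator scores with a linear or geometric rule."""
    total = sum(weights)
    if method == "linear":
        return sum(w * s for s, w in zip(scores, weights)) / total
    if method == "geometric":
        return math.exp(sum(w * math.log(s) for s, w in zip(scores, weights)) / total)
    raise ValueError(f"unknown method: {method}")


def monte_carlo_interval(scores, std_errors, n_draws=2000, seed=0):
    """Return an approximate 95% interval for a composite score.

    Each draw perturbs the data (uncertainty analysis) and also varies the
    weights and aggregation rule (sensitivity analysis).
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        # Uncertainty: add measurement error, clamped to the range 0.01 to 1.
        noisy = [
            min(1.0, max(0.01, rng.gauss(s, se)))
            for s, se in zip(scores, std_errors)
        ]
        # Sensitivity: vary the construction choices.
        weights = [rng.uniform(0.5, 1.5) for _ in scores]
        method = rng.choice(["linear", "geometric"])
        draws.append(composite(noisy, weights, method))
    draws.sort()
    lower = draws[int(0.025 * n_draws)]
    upper = draws[int(0.975 * n_draws) - 1]
    return lower, upper


# Hypothetical NICU with 3 normalized subindicators and their standard errors.
print(monte_carlo_interval([0.80, 0.60, 0.70], [0.05, 0.05, 0.05]))
```

If the intervals of 2 units overlap substantially, a payment rule that treats them as different performers may be rewarding noise or arbitrary construction choices rather than quality.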
Step 8 is linkage to other variables. Composite indicators for some fields of medicine might be combined with those in others, potentially yielding greater insights across care settings or longitudinally. Entire networks of care could be compared with respect to their performance in managing acute and chronic care (ie, combining NICU care with follow-up care).
Step 9 is deconstruction of the composite indicator. Both summary scores and performance on individual measures can be displayed to guide health policy-making and future research. This allows stakeholders to identify areas of weakness and strengths.
Step 10 is presentation and dissemination. Results can be presented in user-friendly formats such as charts that include measures of uncertainty (confidence intervals). Electronic publications can link to additional details on individual subindicators.
Measuring Quality in the NICU Setting
Data collection efforts in neonatology are better developed than in many clinical specialties. The Vermont Oxford Network collects validated data from >600 NICUs throughout the world.50 In California, 120 NICUs submit an expanded data set, with core elements identical to those collected by the Vermont Oxford Network, to the California Perinatal Quality Care Collaborative (CPQCC). These data are used to prepare confidential reports for each NICU and to prepare the California Children's Services mandated yearly activity and outcomes report, which CPQCC submits on behalf of requesting NICUs. A quality indicator based on routinely collected data could thus be used for comparative benchmarking efforts involving pay-for-performance programs. We are currently working to develop such an indicator by using the CPQCC database. A possible representation of NICU quality measures within the matrix is given in Fig 2.
The first challenge involves the diversity of populations. Pathologic conditions, care practices, and outcomes vary widely for patients in different gestational age groups, requiring in some instances both stratum-specific analyses and individualized quality-of-care measures for specific subpopulations, such as extremely premature infants, infants requiring complex surgery, and infants with congenital anomalies. Rather than attempting to measure care for all groups at once, stakeholders should focus on developing quality measures for patient groups that are commonly represented in NICUs (very low birth weight infants, moderately premature infants, and term infants).
The second challenge involves the limit of viability. There is no consensus regarding the treatment of patients born at gestational ages of <25 weeks.51 This group of patients may require a special set of quality markers that relate more to patient satisfaction with care or documentation of parental education than patient-specific outcome measures.
The third challenge involves patient transfers. It is currently difficult to track patients' hospital stays across multiple institutions of care. This may induce significant bias, because NICUs might transfer their highest-risk patients to other hospitals.52 Another source of bias stems from the differing availability of back-transports across NICUs. Lengths of stays are increased in NICUs where opportunities for back-transport are limited. Evaluations of quality therefore need to account for transfer bias. Risk adjustment should also account for the location of birth (inborn/outborn). Ultimately, improvements in patient tracking may eliminate this problem.
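One common form of risk adjustment, indirect standardization, can be sketched as an observed-to-expected ratio in which the expected count uses reference rates specific to each stratum. This is an illustration under our own assumptions: the strata (gestational age group and inborn status), the reference rates, and all names are hypothetical.

```python
def observed_expected_ratio(patients, reference_rates):
    """Ratio of observed events to events expected from stratum-specific rates.

    patients:        list of dicts with keys "ga_group", "inborn", "event"
    reference_rates: dict mapping (ga_group, inborn) to a reference event rate
    """
    observed = sum(1 for p in patients if p["event"])
    expected = sum(reference_rates[(p["ga_group"], p["inborn"])] for p in patients)
    if expected == 0:
        raise ValueError("expected number of events is zero")
    return observed / expected


# Hypothetical reference rates for an adverse outcome.
reference = {
    ("<28 wk", True): 0.40,
    ("<28 wk", False): 0.50,
    ("28-32 wk", True): 0.10,
    ("28-32 wk", False): 0.15,
}

# Hypothetical NICU: 2 outborn extremely premature and 2 inborn older infants.
unit = [
    {"ga_group": "<28 wk", "inborn": False, "event": True},
    {"ga_group": "<28 wk", "inborn": False, "event": False},
    {"ga_group": "28-32 wk", "inborn": True, "event": False},
    {"ga_group": "28-32 wk", "inborn": True, "event": False},
]
print(round(observed_expected_ratio(unit, reference), 3))
```

A ratio below 1 indicates fewer events than expected for the case mix of the unit. The sketch does not address transfer bias itself, which requires tracking patients across institutions.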
| DESIGN OF THE FINANCIAL INCENTIVE |
|---|
Incentive Structure
Incentive structure influences how rewards are allocated across providers, whether providers compete for bonuses, and whether targets are based on improvement or just good performance. Competitive bonus programs provide an incentive to improve performance as providers compete for rewards and reputation. However, most of the payouts go to the top-performing providers, with little incentive for bottom-performing providers to improve.53 In noncompetitive programs, all providers are rewarded for reaching fixed performance targets. Targets based on QI rather than absolute quality provide greater incentives for those with low baseline quality, although most of the payouts again go to the high performers. Our preferred approach would be a combination of methods in which providers are rewarded for achieving the desired result in any given measure of care but also are rewarded for overall performance and/or improvement on a composite measure of care.
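The allocation rules discussed above can be compared with a small sketch. The function names, scores, thresholds, and dollar amounts are hypothetical and are chosen only to show which units are paid under each rule.

```python
def competitive_bonus(scores, pool, top_n):
    """Split a fixed pool equally among the top_n highest-scoring units."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    winners = set(ranked[:top_n])
    return {name: (pool / top_n if name in winners else 0.0) for name in scores}


def target_bonus(scores, target, bonus):
    """Noncompetitive: pay every unit that reaches a fixed performance target."""
    return {name: (bonus if s >= target else 0.0) for name, s in scores.items()}


def improvement_bonus(baseline, current, min_gain, bonus):
    """Pay every unit whose score improved by at least min_gain."""
    return {
        name: (bonus if current[name] - baseline[name] >= min_gain else 0.0)
        for name in current
    }


# Hypothetical composite scores for 3 NICUs in 2 consecutive years.
baseline = {"A": 0.90, "B": 0.70, "C": 0.50}
current = {"A": 0.91, "B": 0.78, "C": 0.60}

print(competitive_bonus(current, pool=100.0, top_n=1))
print(target_bonus(current, target=0.75, bonus=50.0))
print(improvement_bonus(baseline, current, min_gain=0.05, bonus=50.0))
```

In this example the competitive rule pays only unit A, the fixed target pays A and B, and the improvement rule pays B and C. A combined scheme could sum the payouts of 2 or more rules.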
Incentive Recipient
The more direct the connection between the incentive and the person delivering the care, the greater is the effect of the incentive. In the NICU setting, however, care practices and results rarely can be attributed to a single provider but rather are a reflection of a team effort that includes a group of caregivers (eg, physicians, nurses, respiratory therapists, and nutritionists). In addition, some patients require multidisciplinary care from surgeons, cardiologists, and other providers. Therefore, in the NICU, a group or hospital incentive is a more-practical design choice. Any financial reward to providers would be redistributed within the group. This design would also foster a collaborative approach to patient care, because all caregivers would participate in the benefits of the reward, although a potential problem with this approach is "free-riding" by providers who contribute relatively little to care improvement within the group.
Incentive Amount
The amount of money needed to change provider effort is variable and is determined by the provider's marginal utility for the extra income. This depends not only on monetary factors (household income) but also on nonmonetary factors (personal ethics, normative professional practices, regulatory control, and clinical uncertainty). An amount too small is unlikely to induce a change in behavior; an amount too large may induce undesirable provider behavior. A survey of health maintenance organization managers indicated that a bonus of at least 5% of a physician's capitation income would be required to influence provider behavior.54
Payment Structure
The principal choice is whether to reward providers through an intermittent bonus or an increase in the fee-for-service schedule. Economic theory suggests that providers would respond most to incentives if they are rewarded every time they do the right thing or achieve a desirable outcome. However, the psychological literature suggests that larger intermittent bonuses for achieving a benchmark of care may create a more powerful motivational effect than regular small payment increases. There is insufficient literature to make a definitive judgment with regard to either method.27,28 For practical reasons related to data collection, we recommend a yearly bonus.
Payment Frequency
Practical impediments to rewarding providers with frequent timely payments to sustain momentum for improvement include the need to collect and to evaluate data. In addition, the frequency depends on the interval of measurement that allows for a meaningful interpretation of change. Specifically, if a measured variable occurs relatively infrequently, then it will take a longer time before a true performance assessment can be obtained. In the NICU setting, at a minimum, yearly feedback would be desirable.
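The dependence of the measurement interval on event frequency can be illustrated with a rough calculation. The sketch below assumes a normal approximation to the binomial distribution and asks how many years of admissions are needed before the 95% confidence interval for an event rate is within a chosen relative margin; the function name and all numbers are hypothetical.

```python
import math


def years_to_assess(event_rate, annual_patients, relative_margin, z=1.96):
    """Years of data needed so the CI half-width is relative_margin * event_rate.

    Uses the normal approximation, for which the required sample size is
    n = (z / relative_margin)**2 * (1 - p) / p.
    """
    if not 0 < event_rate < 1:
        raise ValueError("event_rate must be between 0 and 1")
    needed = (z / relative_margin) ** 2 * (1 - event_rate) / event_rate
    return math.ceil(needed / annual_patients)


# Hypothetical NICU with 100 eligible infants per year, margin of +/-50%.
print(years_to_assess(0.05, 100, 0.5))  # rare outcome
print(years_to_assess(0.30, 100, 0.5))  # common outcome
```

With these illustrative numbers, the rarer outcome needs about 3 years of data and the more common one about 1 year, which is consistent with yearly feedback being a minimum rather than a universal interval.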
| BRINGING PAY-FOR-PERFORMANCE TO THE NICU |
|---|
| FOOTNOTES |
|---|
Address correspondence to Jochen Profit, MD, MPH, Houston Center for Quality of Care and Utilization Studies, VA HSR&D (152), 2002 Holcombe Blvd, Houston, TX 77030. E-mail: profit@bcm.edu
The authors have indicated they have no financial relationships relevant to this article to disclose.