BACKGROUND: Consumers rarely use publicly reported health care quality data. Despite known barriers to use, few studies have explored the effectiveness of strategies to overcome barriers in vulnerable populations.
METHODS: This randomized controlled trial tested the impact of a patient navigator intervention to increase consumer use of publicly reported quality data. Patients attending an urban prenatal clinic serving a vulnerable population enrolled between May 2013 and August 2014. The intervention consisted of 2 in-person sessions in which women learned about quality performance and viewed scores for local practices on the Massachusetts Health Quality Partners Web site. Women in both the intervention and control arms received a pamphlet about health care quality. Primary study outcomes were mean clinical quality and patient experience scores of the practices women selected (range 1–4 stars).
RESULTS: Nearly all (726/746; 97.3%) women completed the study, 59.7% were Hispanic, and 65.1% had a high school education or less. In both unadjusted and adjusted models, women in the intervention group chose practices with modestly higher mean clinical quality (3.2 vs 3.0 stars; P = .001) and patient experience (3.0 vs 2.9 stars; P = .05) scores. When asked to rate what factors mattered the most in their decision, few cited quality scores.
CONCLUSIONS: An intervention to reduce barriers to using publicly reported health care quality data had a modest effect on patient choice. These findings suggest that factors other than performance on common publicly reported quality metrics have a stronger influence on which pediatric practices women choose.
- CI: confidence interval
- CQ: clinical quality
- HEDIS: Healthcare Effectiveness Data and Information Set
- MHQP: Massachusetts Health Quality Partners
- PAM: Patient Activation Measure
- PE: patient experience
- P-PAM: Parent Activation Measure
What’s Known on This Subject:
Despite the substantial resources devoted to public reporting of health care quality data, use of these data has been limited, particularly among members of vulnerable populations. It is not clear what strategies may overcome barriers to use.
What This Study Adds:
In this large randomized controlled trial of a patient navigator intervention to promote use of quality data when choosing a pediatric practice, we found little effect on choice of pediatric practice for pregnant women from vulnerable populations.
Quality measures such as the Healthcare Effectiveness Data and Information Set (HEDIS) have been used by health plans and health systems to assess care effectiveness for >2 decades. Institutional performance on these measures has also been made available to the public in recent years.1 Public transparency is intended to improve health outcomes by enabling patients to identify and choose higher-quality practices and by stimulating competition.2–6 However, patients use these data infrequently,1,3,6,7 possibly because of lack of awareness that the data exist, challenges to interpreting the data, lack of perceived relevance, and competing decisional factors.7–13 Lower-income patients are least likely to use performance data,14,15 which may in turn contribute to health disparities.15
Choosing where to bring a newborn for routine pediatric care is a decision potentially well suited to use of quality data; there is sufficient time to weigh options between pregnancy onset and needing a pediatrician, multiple practices to choose from for most families, and variation in care.16 Patient navigators, who have been used in clinical settings to provide education and decision support,17 could serve a role in helping patients use quality data.
We conducted a randomized controlled trial to test the efficacy of a patient navigator intervention to increase the impact of publicly reported pediatric quality data on choice of pediatric practice by low-income and racial or ethnic minority pregnant women. We hypothesized that women exposed to the intervention would choose practices with higher quality scores compared with women who received only written information about health care quality.
Trained patient navigators recruited English-speaking women ages 16 to 50 years who were at 20 to 34 weeks’ gestation between May 2013 and August 2014.18 The study took place at a prenatal clinic that served a predominantly low-income, racial or ethnic minority population and was located in an urban tertiary care center. Women were excluded if they planned to deliver their newborn at a different institution. We obtained written informed consent; the study protocol19 was approved by the Baystate Medical Center Institutional Review Board.
Randomization and Blinding
Participants were randomly assigned, 1:1, to 1 of 2 study arms via a random number generator with stratification of randomization based on parity. Assignments were placed in sequentially numbered opaque envelopes that were pulled at the time of enrollment. The navigator collecting follow-up data on weekdays was blinded as to which arm of the study the participant was in. However, weekends were covered by 1 team member on a rotating basis. Because delivery dates were unpredictable, this meant that on some weekends the navigator may not have been blinded to a participant’s allocation status.
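The allocation procedure described above can be sketched in a few lines. This is a minimal illustration, not the study's actual procedure: the article states only that a random number generator was used with stratification by parity, so the permuted blocks, the block size, the seed, and the stratum labels `nulliparous` and `parous` are all assumptions introduced here.

```python
import random


def make_envelopes(n_per_stratum, strata=("nulliparous", "parous"),
                   block_size=4, seed=2013):
    """Return {stratum: [arm, ...]} allocation sequences in a 1:1 ratio.

    Permuted blocks are an assumption; the article only says a random
    number generator was used with stratification by parity.
    """
    if block_size % 2 or n_per_stratum % block_size:
        raise ValueError("block_size must be even and divide n_per_stratum")
    rng = random.Random(seed)
    sequences = {}
    for stratum in strata:
        sequence = []
        for _ in range(n_per_stratum // block_size):
            # Each block holds equal numbers of both arms, shuffled in place.
            block = ["intervention", "control"] * (block_size // 2)
            rng.shuffle(block)
            sequence.extend(block)
        sequences[stratum] = sequence
    return sequences


envelopes = make_envelopes(n_per_stratum=8)
# Envelope number k within a stratum holds envelopes[stratum][k - 1].
next_assignment = envelopes["parous"][0]
```

Generating the whole sequence in advance mirrors the sequentially numbered envelopes: the person enrolling a participant opens the next envelope for her stratum and cannot influence the assignment.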
Baseline procedures took place at the study clinic during prenatal visits. The navigator gave participants a printed copy of the baseline instruments and read each question aloud to reduce the potential for misunderstanding. Baseline data included demographics, health literacy level (Newest Vital Sign),20 and the Patient Activation Measure (PAM) and Parent Activation Measure (P-PAM)21,22; race and ethnicity were self-reported. Health literacy and patient activation were included as potential effect modifiers because of their association with health information–seeking behaviors.23,24 We identified candidate factors that a parent might consider important when choosing a pediatric practice by using previous literature and clinical knowledge, pretested lists of candidate factors, and selected 12. Women ranked the importance of these factors on a 6-point Likert scale. We also asked what other factors participants considered important and whether they had an a priori preference for a specific practice. We offered participants $20 after completing baseline data collection and an additional $20 after completing the postdelivery interview.
The intervention consisted of 2 in-person navigator-led sessions; a pretested intervention guide script was used to ensure consistency. During session 1, the navigator used Massachusetts Health Quality Partners (MHQP) online Quality Reports25 to discuss the rationale for measuring health care quality and to explain pediatric quality measures.
MHQP is a nonprofit organization in Massachusetts that rates primary care practices and medical groups, all referred to as “practices” in the remainder of this article, based on their performance on clinical quality (CQ) and patient experience (PE) measures. A practice needed to have ≥3 clinicians and 2 HEDIS scores with ≥30 patients contributing to the measures to be included, or to be part of a larger group that met these criteria (Supplemental Information). MHQP uses data from 5 large commercial insurers in Massachusetts (CQ scores) and patient survey data (PE scores) to generate a practice’s scores, which are based on state and national benchmarks (Supplemental Information). In some health plans, and for some measures, administrative data are known to produce results that underestimate true performance. To address this limitation, MHQP has worked with the health plans and the MHQP Physician Council to define an adjustment method that helps account for differences in the rates obtained by using administrative data to measure performance and the rates that are obtained when medical records are reviewed. MHQP applied this adjustment method to all measures known to have discrepancies between these 2 data sources, to increase the measure rates and better approximate true performance. MHQP assigned stars for CQ measures as follows: 1 star is assigned for being eligible for inclusion, 1 for exceeding the national average for the measure, 1 for exceeding the national 90th percentile for the measure, and 1 for performance on the statewide rate. For PE, practices received 1 star if they scored lower than the 15th percentile, 2 stars if they scored between the 15th and 50th percentiles, 3 stars if they scored between the 50th and 85th percentiles, and 4 stars if they scored greater than the 85th percentile (Supplemental Information).
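The star rules above can be expressed as two small functions. This is a hedged sketch rather than MHQP's implementation: the function names are hypothetical, the handling of scores that fall exactly on a percentile boundary is an assumption, and the fourth CQ star is assumed to be earned by exceeding the statewide rate, which the text does not spell out.

```python
def pe_stars(percentile):
    """Map a practice's patient experience percentile (0-100) to 1-4 stars."""
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    if percentile < 15:
        return 1
    if percentile < 50:
        return 2
    if percentile <= 85:  # assumption: exactly 50 or 85 counts as 3 stars
        return 3
    return 4


def cq_stars(rate, national_average, national_90th, statewide_rate):
    """Count stars for one clinical quality measure of an eligible practice."""
    stars = 1  # eligible for inclusion
    stars += int(rate > national_average)
    stars += int(rate > national_90th)
    # Assumption: "performance on the statewide rate" means exceeding it.
    stars += int(rate > statewide_rate)
    return stars
```

For example, a hypothetical practice with a measure rate of 0.80, against a national average of 0.70, a national 90th percentile of 0.90, and a statewide rate of 0.75, would receive 3 stars under these assumptions.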
Using a laptop computer, the navigator showed participants the MHQP Web site and explained the star rating system, indicating that “ND” meant there were insufficient data to give the practice a score. She then showed the performance scores for the practices that a participant wanted to view. If a participant did not have any practices in mind, the navigator showed her practices near her home with high, moderate, and low scores. Navigators helped participants complete a practice comparison worksheet during the intervention session that participants could then take home. Participants were also given a fact sheet about the star ratings and an information sheet about how to access the MHQP Web site at home (Supplemental Information). During session 2, the navigator reviewed the quality measures, answered questions, and showed performance data for additional practices as desired.
Navigators gave participants in both the intervention and control arms an informational pamphlet about health care quality measures (Supplemental Information).
Follow-up procedures took place either in person in the hospital or by phone (n = 80, or 10.7%) after delivery. Phone follow-up took place on some weekends when the covering navigator could not get to the hospital in time to do an in-person follow-up. Research staff asked which pediatric practice the participant had selected, reassessed what factors mattered to her when she made her decision, and readministered the 2 activation measures.
We calculated separate CQ and PE summary performance scores for each practice, represented as the mean of a practice’s scores in each of these 2 categories. The primary study outcomes were the average of the CQ and PE summary performance scores for practices selected by women in each arm. Secondary outcomes included changes in PAM and P-PAM scores and changes in how much certain factors mattered to women when deciding on a practice.
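The two-step averaging can be illustrated as follows. This is a sketch under stated assumptions: the practice names and star values are invented for illustration, the function names are hypothetical, and dropping unscored ("ND") measures and unscored practices from the means is an assumption about handling that the text does not detail.

```python
from statistics import mean


def summary_score(measure_stars):
    """Mean of a practice's scored measures; None when nothing is scored."""
    scored = [s for s in measure_stars if s is not None]
    return mean(scored) if scored else None


def arm_outcome(selected_practices, practice_scores):
    """Mean summary score across the practices chosen by women in one arm."""
    scores = [practice_scores[p] for p in selected_practices
              if practice_scores.get(p) is not None]
    return mean(scores) if scores else None


# Hypothetical practices and star values, for illustration only.
cq_by_practice = {
    "Practice A": summary_score([4, 3, 3]),
    "Practice B": summary_score([2, 3, None]),
    "Practice C": summary_score([None, None]),
}
intervention_cq = arm_outcome(
    ["Practice A", "Practice A", "Practice B", "Practice C"], cq_by_practice
)
```

Because each woman contributes the score of the practice she selected, a practice chosen by many women carries proportionally more weight in the arm-level mean.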
We estimated that a sample size of 650 was necessary to have 80% power to detect a difference of 0.20 stars for CQ scores and 0.25 stars for PE scores with a significance level of 0.05 within strata defined by parity (Number Cruncher Statistical Software [NCSS] Power Analysis and Sample Size [PASS] 2008). In practical terms, a 0.20-star difference means that for every 5 women who received the intervention, 1 would choose a practice with a score that was 1.0 star higher. SDs of scores for 25 local pediatric practices were used to power the study because there were no existing data defining a clinically meaningful difference in quality scores. This approach grounded the power calculation in actual rather than theoretical variation in scores.
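The link between the 0.20-star difference and the "1 in 5" interpretation is simple arithmetic. As a sketch, assume a fraction p of women in the intervention arm choose a practice exactly 1.0 star higher than they otherwise would and the remaining women choose as controls do; the symbols Δ and p are introduced here for illustration and do not appear in the study.

```latex
\Delta = p \times 1.0 + (1 - p) \times 0 = 0.20
\quad\Longrightarrow\quad
p = 0.20 = \tfrac{1}{5}
```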
Primary Outcome Analyses
Using an intent-to-treat analysis, we first tested for differences in baseline characteristics between intervention and control arms via χ2 or Fisher’s exact tests and t tests or Kruskal–Wallis tests. We also assessed differences in failing to select a practice before leaving the hospital or selecting an unscored practice. We evaluated study outcomes with analysis of variance models, adjusting for parity, and built additional models to adjust for baseline characteristics, including those that were unbalanced.
In subanalyses, we compared the intervention’s efficacy within population subgroups by including interaction terms for the intervention with baseline characteristics: parity, maternal age, race or ethnicity, health literacy, and patient activation. Health literacy scores were dichotomized to high or possible likelihood of low literacy (0–3) versus adequate literacy (4–6).26 PAM and P-PAM scores were dichotomized to low activation (levels 1 and 2) or high activation (levels 3 and 4) for regression models.21 Finally, we tested the effect of missed navigator sessions in a per-protocol analysis.
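The two dichotomizations can be written down directly from the cut points given above. The function names and category labels below are hypothetical; only the cut points come from the text.

```python
def literacy_group(nvs_score):
    """Newest Vital Sign score (0-6) -> the two categories used in the models."""
    if nvs_score not in range(7):
        raise ValueError("NVS score must be an integer from 0 to 6")
    return "adequate" if nvs_score >= 4 else "possible_or_high_likelihood_low"


def activation_group(level):
    """PAM or P-PAM activation level (1-4) -> low (1-2) or high (3-4)."""
    if level not in (1, 2, 3, 4):
        raise ValueError("activation level must be 1, 2, 3, or 4")
    return "high" if level >= 3 else "low"
```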
Sensitivity and Secondary Outcome Analyses
We conducted a sensitivity analysis of our composite scores by repeating the primary analysis with the following outcome measures: a single measure that is most directly pertinent to newborns, “Well-child visits 0–15 months”; an overall performance score that combined the mean CQ and PE scores for practices that had scores for both measures; and an ordinal outcome defined as unscored, below the median, or above the median that included practices whether they had a CQ or PE score, fit via a cumulative logit model. During the study, we learned that some participants who were having their first child had a practice in mind and that some participants with children were considering changing practices. Therefore, we also tested whether baseline preference for a practice, regardless of parity, influenced the primary outcome.
We assessed the impact of the intervention on PAM and P-PAM scores by comparing within-subject change from baseline to follow-up while controlling for other participant characteristics via analysis of variance models.
To understand the mechanisms by which women chose a practice, we assessed the relative importance of the factors that mattered to them when choosing. To do so, we dichotomized the 6-point Likert scale responses into the following: “Did not matter” (response options: “did not consider,” “don’t know/not sure,” or “does not matter”) and “Mattered” (response options: “mattered a little,” “mattered somewhat,” or “mattered a lot”). We then used McNemar’s test to evaluate change in the importance of individual factors from baseline to follow-up within each arm of the study and Cochran Q to evaluate homogeneity in direction of change. All analyses were performed in SAS version 9.3 (SAS Institute, Inc, Cary, NC).
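The dichotomization and the paired before-and-after comparison can be sketched with the standard library alone. This is an illustration, not the SAS analysis: the article does not say whether the exact or the chi-square form of McNemar’s test was used, so the exact binomial form below is an assumption, as are the function names and the plain-apostrophe spelling of the response labels.

```python
from math import comb

MATTERED = {"mattered a little", "mattered somewhat", "mattered a lot"}
DID_NOT_MATTER = {"did not consider", "don't know/not sure", "does not matter"}


def mattered(response):
    """Collapse one 6-point Likert response to True (mattered) or False."""
    if response in MATTERED:
        return True
    if response in DID_NOT_MATTER:
        return False
    raise ValueError(f"unexpected response: {response!r}")


def mcnemar_exact(pairs):
    """Two-sided exact McNemar p-value for (baseline, follow_up) boolean pairs."""
    # Only discordant pairs (answers that changed) carry information.
    gained = sum(1 for before, after in pairs if not before and after)
    lost = sum(1 for before, after in pairs if before and not after)
    n = gained + lost
    if n == 0:
        return 1.0
    k = min(gained, lost)
    # Under the null, each discordant pair is equally likely to go either way.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

A call such as `mcnemar_exact([(mattered(b), mattered(f)) for b, f in responses])`, where `responses` is a hypothetical list of each woman's baseline and follow-up answers for one factor within one arm, gives the within-arm test for that factor.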
Among the 746 participants enrolled in the study, 366 were assigned to the intervention arm and 380 to the control arm; 20 (2.7%) either withdrew (n = 11) or were lost to follow-up (n = 9) (Fig 1). Median age was 23 years (interquartile range, 20–27 years), 59.7% self-identified as Hispanic, and 65.1% had a high school education or less. Baseline scores on the PAM ranged from 43.2 to 100 (mean = 66.8; SD = 14.3) and from 39.4 to 100 (mean = 71.1; SD = 15.3) for the P-PAM; lower scores indicate lower engagement in health management. Half (52.3%) of participants had an a priori preference for a practice. Losses to follow-up, exclusions, and baseline characteristics were balanced between study arms with the exception of race and ethnicity (Table 1).
Four participants (0.5%) did not choose a pediatric practice during the study period, and 43 (6.0%) chose practices that did not have ≥1 scored CQ measure (Table 2). More than half of participants (433, or 60.0%) chose practices that did not have PE scores (Table 3). Two participants missed intervention sessions 1 and 2, and 37 missed session 2 (Fig 1).
Study participants selected 53 unique scored practices; 5 practices accounted for nearly half (48%) of these selections (Table 2). The mean summary scores were 3.1 stars (SD = 0.5) for CQ and 2.9 stars (SD = 0.5) for PE. The number of women who selected a practice without CQ or PE scores was balanced between study arms.
The unadjusted mean CQ summary score was 3.2 stars (SD = 0.5) for practices selected by women in the intervention group and 3.0 stars (SD = 0.5) for controls (P = .001) (Table 4). Estimates did not change after we adjusted for baseline characteristics, including parity.
The unadjusted mean PE summary score for practices selected by intervention women was 3.0 stars (SD = 0.5) and 2.9 (SD = 0.5) for controls (P = .05). Adjusted models gave similar estimates; none of the interaction terms between intervention and patient characteristics were significant (P > .10 for all) (Table 4). When adjusted for missed navigator sessions, results did not differ from the intent-to-treat analysis.
Sensitivity Analyses and Secondary Outcomes
When practice selection was restricted to scores on “Well-child visits 0–15 months,” women in the intervention group chose practices with an average score of 2.6 stars (SD = 0.7) and controls’ practices averaged 2.4 stars (SD = 0.7) (P = .01), a relative difference similar to that found with the summary CQ score (Table 4). The combined CQ plus PE composite score also performed similarly to the separate CQ and PE scores (Table 4). When exploring the ordinal outcomes, we found that women in the intervention group had 1.38 times the odds (95% confidence interval [CI], 1.00–1.83) of choosing a practice with a CQ score above the median compared with controls in adjusted models. There was not a significant difference in PE scores when we used the ordinal outcome.
When we evaluated the impact of having an a priori preference for a practice at baseline, women in the intervention arm with no preference selected practices with higher CQ scores than women in the control arm (adjusted; 3.2 vs 3.0 stars; P = .0003). In contrast, there was no difference among those with a baseline preference for a practice (3.1 vs 3.1 stars; P = .99) (Table 3). Similar patterns were found for PE and combined CQ plus PE scores (Table 4).
Change in PAM and P-PAM Scores
In adjusted models, PAM scores increased by 2.3 points (95% CI, −1.62 to 6.29) in the intervention arm and by 3.2 points (95% CI, −0.81 to 7.2) in the control arm, a nonsignificant difference (P = .83). P-PAM scores dropped by 1.5 points (95% CI for change, −5.96 to 2.97) in the intervention arm and by 1.7 points (95% CI for change, −6.24 to 2.89) in the control arm, a difference that was also not statistically significant (P = .88).
Importance of Quality Data to Women When Choosing a Practice
The percentage of women who reported that online quality data mattered to any degree (mattered a lot, somewhat, or a little) increased from baseline to follow-up (53% to 69%) in the intervention group and declined among controls (53% to 17%). The direction and magnitude of these changes differed significantly between groups (P = .0001). Women in both the intervention and control arms who had reported that quality ratings mattered to them selected practices with higher mean CQ scores (3.2 vs 3.0 stars; P < .0001). However, when asked what the most important factors in their decision were, only 37 participants (5.1%) rated CQ performance as 1 of the most important factors, and 32 (4.4%) rated what other parents think of the practice (a PE measure) as 1 of the most important. In contrast, 222 participants (30.6%) rated knowing the pediatrician they chose as 1 of the most important factors, and 150 (20.7%) rated how close the practice is to home as 1 of the most important.
To our knowledge, this is the first randomized controlled trial to assess the impact of a patient navigator intervention to overcome barriers to using quality data when choosing a pediatric practice. We found that pregnant women in the intervention group chose practices with only slightly higher quality scores. Secondary analyses suggested that factors other than formal CQ measures may partly explain the overall modest effect observed.
Much time and effort has been spent on developing publicly reported quality measures, with only limited effects seen on patient behavior.6,27–29 Several studies have tried to elucidate the multiple potential barriers to use.9,11,12,27,28,30,31 Most have focused on hypothetical choice experiments or simulated scenarios for choosing an insurance plan or a hospital. For example, in a convenience sample of 303 lower-income adults, investigators found that simplified data were better understood by participants but that better understanding did not necessarily affect choice of hospitals.32 Two studies explored the impact of Consumer Assessment of Healthcare Providers and Systems data on a real-life choice of health plan in a population of employed adults, finding no significant effect.33,34 Two other studies assessed the effect of quality performance report cards on federal employees’ choice of health plan, each finding a small increase in the selection of higher-scoring plans, but the effect was limited primarily to employees choosing a health plan for the first time.35,36 The current study extends previous work by testing the efficacy of an in-person intervention to facilitate use of quality performance data by lower-income pregnant women making an actual selection of a pediatric practice.
We studied a clinical scenario that was, in many respects, well suited to using quality data, yet the intervention had only a modest impact. Although we found that women in the intervention group were more likely than those in the control arm to report that online measures of quality mattered in their decision, few women rated quality measures as one of the factors most important to them. This finding suggests that factors other than conventional measures of quality may be more important in choosing a pediatric practice in the population studied. A large percentage of women chose 1 of 5 practices, which could indicate that the attractiveness of local practices, potentially related to proximity or familiarity of the practices, may be hard to overcome. Finally, although there is ongoing debate about the relationship between PE measures and care quality,37–40 there is widespread agreement that PE is an important outcome in its own right, and these scores may be of greater interest to consumers.13 Although women did not explicitly state that the absence of PE data was a concern, the absence of these data for many practices may have diminished the overall impact of the intervention. Understanding the decisional frameworks women currently use when choosing a pediatric practice will help inform efforts to increase consumer use of quality performance scores.
This study’s strengths included the randomized trial design, a large sample size, low loss to follow-up, focus on a vulnerable population, use of a well-established nonprofit’s public reporting Web site, and testing of a real-life decision. However, the results should be considered in light of several limitations. First, this was a single-center study of 1 decision, potentially limiting generalizability to other populations and other decisions. Nevertheless, given the near ideal test case of choosing a pediatric practice during pregnancy, this study’s findings may be sobering to those who champion the use of quality data to affect consumer behavior. Second, women did not see a composite score when they viewed the Web site; it is possible that women focused on a subset of the components of the composite rather than overall practice performance, but our sensitivity analysis of a single measure was consistent with the findings of the main analysis. Third, desirability bias may have influenced responses to some questions, such as how the importance of quality performance data was rated, but women in both arms who reported that these data were important also chose higher-scoring practices, indicating that the data did matter to these women. Fourth, although navigators may not have been blinded when collecting follow-up data on weekends, this limitation would not have affected the internal validity of the primary study outcome (pediatric practice selected). Finally, HEDIS measures were originally created to provide information to purchasers on health plan quality. In recent years, HEDIS and similar measures have been made publicly available, providing consumers with direct access to performance data on quality at the health plan, medical group, and practice level. Although these data are imperfect, in most cases they are the only information (other than anecdotal information or Yelp-type ratings) available to consumers.
An intensive in-person patient navigator intervention to overcome barriers to consumer use of publicly available quality performance data had little impact on the decisions made by lower-income women choosing a pediatric practice for their newborn. Future efforts to engage consumers in use of publicly reported quality data should consider how competing factors might influence choice, how strategies to disseminate quality data fit into the process by which patients currently choose a physician, and what new measures may resonate best with parents.
We thank the Wesson Women’s Clinic Staff at Baystate Medical Center for their gracious acceptance of study staff’s presence in their clinic and their help facilitating recruitment for the study. We also thank MHQP for their advice and support during this study.
- Accepted June 20, 2016.
- Address correspondence to Sarah L. Goff, MD, The Center for Quality of Care Research and Department of Medicine, Baystate Medical Center/Tufts University School of Medicine, 759 Chestnut St, Springfield, MA 01199. E-mail:
This work was presented in part at the national meeting of AcademyHealth; June 2014; San Diego, CA and June 2015; Minneapolis, MN, and at the national meeting of the Pediatric Academic Societies; May 2014; Vancouver, BC.
FINANCIAL DISCLOSURE: Dr White consults for Actavis; the other authors have indicated they have no financial relationships relevant to this article to disclose.
FUNDING: Funded by a grant from the Agency for Healthcare Research and Quality (grant 5R21HS021864-02). Dr Goff is supported by the National Institute of Child Health and Human Development of the National Institutes of Health (NIH) under award K23HD080870. Dr Lagu is supported by the National Heart, Lung, and Blood Institute of the NIH under award K01HL114745. The funding agencies had no role in any aspect of study design, implementation, analyses, or manuscript review or approval. Funded by the National Institutes of Health (NIH).
POTENTIAL CONFLICT OF INTEREST: The authors have indicated they have no potential conflicts of interest to disclose.
COMPANION PAPER: A companion to this article can be found online at www.pediatrics.org/cgi/doi/10.1542/peds.2016-2447.
- Kolstad JT, Chernew ME
- Totten AM, Wagner J, Tiwari A, O’Haire C, Griffin J, Walker M
- Magee H, Davis L-J, Coulter A
- Hibbard JH, Peters E, Dixon A, Tusler M
- Uhrig JD, Short PF
- Raven MC, Gillespie CC, DiBennardo R, Van Busum K, Elbel B
- Shah LC, West P, Bremmeyr K, Savoy-Moore RT
- Pennarola BW, Rodday AM, Mayer DK, et al; HSCT-CHESS Study
- Massachusetts Health Quality Partners
- Weiss BD, Mays MZ, Martz W, et al
- Hibbard JH, Berkman N, McCormack LA, Jael E
- Hibbard JH, Greene J, Sofaer S, Firminger K, Hirsh J
- Hibbard JH, Stockard J, Tusler M
- Schlesinger M, Kanouse DE, Martino SC, Shaller D, Rybowski L
- Peters E, Dieckmann N, Dixon A, Hibbard JH, Mertz CK
- Farley DO, Elliott MN, Short PF, Damiano P, Kanouse DE, Hays RD
- Copyright © 2016 by the American Academy of Pediatrics