Public Perceptions of the Benefits and Risks of Newborn Screening
BACKGROUND: Growing technological capacity and parent and professional advocacy highlight the need to understand public expectations of newborn population screening.
METHODS: We administered a bilingual (French, English) Internet survey to a demographically proportional sample of Canadians in 2013 to assess preferences for the types of diseases to be screened for in newborns by using a discrete choice experiment. Attributes were: clinical benefits of improved health, earlier time to diagnosis, reproductive risk information, false-positive (FP) results, and overdiagnosed infants. Survey data were analyzed with a mixed logit model to assess preferences and trade-offs among attributes, interaction between attributes, and preference heterogeneity.
RESULTS: On average, respondents were favorable toward screening. Clinical benefits were the most important outcome; reproductive risk information and early diagnosis were also valued, although 8% disvalued early diagnosis, and reproductive risk information was least important. All respondents preferred to avoid FP results and overdiagnosis but were willing to accept these to achieve moderate clinical benefit, accepting higher rates of harms to achieve significant benefit. Several 2-way interactions between attributes were statistically significant: respondents were willing to accept a higher FP rate for significant clinical benefit but preferred a lower rate for moderate benefit; similarly, respondents valued early diagnosis more when associated with significant rather than moderate clinical benefit.
CONCLUSIONS: Members of the public prioritized clinical benefits for affected infants and preferred to minimize harms. These findings suggest support for newborn screening policies prioritizing clinical benefits over solely informational benefits, coupled with concerted efforts to avoid or minimize harms.
- DCE: discrete choice experiment
- FP: false-positive
- NBS: newborn screening
- SSI: Survey Sampling International
What’s Known on This Subject:
Infant screening is valued by members of the lay public, but how different benefits are independently valued, and whether harms are disvalued, is not known. Public expectations of screening can inform decisions about what diseases to screen for.
What This Study Adds:
The public values clinical benefits of screening and disvalues harms, with tolerance for harm proportional to clinical benefit. These findings support newborn screening policies prioritizing clinical benefits over solely informational benefits, coupled with concerted efforts to avoid or minimize harms.
Propelled by technological developments and parent and professional advocacy,1–4 newborn screening (NBS) programs have expanded markedly, fostering debate about the relative importance of the several outcomes of NBS (clinical improvements, early diagnosis, reproductive risk information)5–7 and how these potential benefits should be traded off against potential harms (false-positives [FPs], overdiagnosis of mild disease).8–10 Given the public interest in a careful balance between the benefits and burdens of programs that enroll large portions of the public and measurably affect the public’s health, there is a need to start “discussing screening with the public”11 to understand the nature of public expectations.
To date, empirical data provide limited insight about public expectations. Evidence suggests that infant screening is valued by invested stakeholders (parents of NBS-identified infants, clinicians)8–10,12–16 and members of the lay public.1,17 Yet, little is known about how the different benefits of screening are independently valued, and thus what types of benefits give warrant to NBS. Those who call for an expanded definition of benefit suggest that informational benefits may suffice,18,19 whereas others emphasize a hierarchy of benefits, with screening justified only by primary benefits, even though secondary benefits may be important.20,21 In addition, although early diagnosis is often identified as an important benefit of NBS, to avoid the so-called diagnostic odyssey and support life planning,22 preferences for such an outcome are not necessarily uniform.1 Finally, the burdens of screening are ignored in much of the literature on stakeholder attitudes,12–16 so we lack data on whether harms are actually and uniformly disvalued. Even where harms are probed, we lack insight into the willingness to trade between different benefits and between benefits and harms.8–10 To assess how the varied outcomes of NBS are valued independently and relative to others, we conducted a stated preference discrete choice experiment (DCE) to engage members of the public about the types of diseases they would recommend be screened for in newborns.
The merits of DCEs have led to increased application in health policy,23,24 and they offer particular advantages in measuring preferences for population screening programs. Designed to assess how preferences for one outcome are valued relative to another outcome, DCEs can measure how people rank the several benefits of NBS (eg, reproductive risk information relative to clinical benefits) and trade benefits for harms (eg, clinical benefits for affected infants relative to FP results in unaffected infants). In addition, DCEs can independently measure preferences for concurrent outcomes, such as those associated with early diagnosis (eg, early diagnosis of disease unto itself, and the clinical benefits that early diagnosis often yields). Furthermore, in recognition of the contested utility of outcomes such as early diagnosis, and the potential for misunderstanding of complex concepts such as overdiagnosis, newer DCE methods allow measurement of heterogeneity in the direction of preferences, given that some respondents may positively value an outcome whereas others negatively value it.
With approval from the Health Sciences Research Ethics Board at the University of Toronto, we conducted a national cross-sectional survey of Canadians on expectations of NBS using a DCE.
Members of the public were recruited through an Internet panel from Survey Sampling International (SSI), a company specializing in online data collection. Over 2 weeks (January 2013), SSI invited panelists; those who met Canadian population criteria25 for age, gender, and region of residence were eligible. To recognize time invested, SSI provided an incentive (eg, sweepstakes, prize drawings, or cash, as preferred) to eligible panelists who completed any of 3 questionnaire sections. Section completion required answers to all items, ensuring no missing data. We followed generally accepted guidance to estimate a target sample size of 1200, sufficient to permit subgroup analyses by participant characteristics and by exposure to the reasoning exercise.26
DCEs elicit preferences by asking individuals to choose between different options, each of which is described by a number of attributes. The assumption is that services or policies can be described by their attributes. People assign their preferences to attribute levels and choose the most preferred option from available alternatives. From people’s choices, indirect utility can be estimated.27,28
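As an illustration only, the link between attribute-level utilities and observed choices can be sketched with a standard multinomial (conditional) logit rule. The part-worth values and the attribute names below are hypothetical placeholders, not estimates from this study, and the study's own model is more general than this sketch.

```python
import math

# Hypothetical part-worth utilities (illustrative only, not study estimates).
PART_WORTHS = {
    "clinical_benefit": {"none": -0.9, "moderate": 0.1, "significant": 0.8},
    "reproductive_info": {"not available": -0.1, "available": 0.1},
}
NEITHER_UTILITY = -2.0  # hypothetical utility of the "screen for neither" option


def option_utility(option):
    """Sum the part-worths of the levels that describe one screening option."""
    return sum(PART_WORTHS[attr][level] for attr, level in option.items())


def choice_probabilities(utilities):
    """Logit rule: P(j) = exp(V_j) / sum_k exp(V_k)."""
    top = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in utilities]
    total = sum(weights)
    return [w / total for w in weights]


disease_a = {"clinical_benefit": "significant", "reproductive_info": "not available"}
disease_b = {"clinical_benefit": "moderate", "reproductive_info": "available"}
probs = choice_probabilities(
    [option_utility(disease_a), option_utility(disease_b), NEITHER_UTILITY]
)
print([round(p, 3) for p in probs])
```

Estimation runs this logic in reverse: given many observed choices, the part-worths are the values that make those choices most likely.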
The study team developed the questionnaire on the basis of previous qualitative research1,29 and a literature review.9,30–34 It was pretested by using cognitive interviews (n = 16 respondents recruited through online advertisements), then piloted (n = 87 respondents) through SSI. The survey was developed and pre- and pilot-tested in English, then translated into French.
The questionnaire began with an extensive training module (Supplemental Information 1; Training Module, section 1) to familiarize respondents with the attributes and levels used in the DCE. With a professional designer, we developed a strategy to clearly convey population screening concepts, including the types of severe, child-onset diseases typically screened for, the 2-step screening process (initial testing, confirmatory testing), the outcomes of screening for families of affected infants (early diagnosis, clinical outcomes of early treatment, reproductive risk information), and the unintended outcomes for other families (FP results, overdiagnosis). After each element was explained, a set of true/false quizzes assessed understanding, followed by real-time corrected answers (Supplemental Information 1). These items were summed to generate a measure of understanding (scored as 0–21). The questionnaire also measured selected attitudes and demographic characteristics.
For the DCE (section 2), we asked participants to imagine that they were “advising the government about the types of diseases to screen for in newborns.” In each choice set, participants were asked to choose which of 2 diseases they would prefer to screen for, or whether they would prefer that neither disease be screened (Fig 1; example choice set).35 The choice sets incorporated 5 attributes: earlier time to diagnosis (1 week to 4 years), clinical benefits of early treatment (none, moderate, significant), early reproductive risk information (available, not available), FP results (1–40 per affected infant), and overdiagnosed infants (0–2 per affected infant) (Table 1).
We did not include false-negative results as an attribute but explicitly noted that screening was designed to find almost all infants with a disease, such that these results were very rare. In the scenario provided alongside the DCE, we reiterated information about the types of rare diseases screened for, and that the small risk of false-negative results was to be held constant across choices (Supplemental Information 2).
To test the effect of being exposed to value-based reasons, half of the respondents were randomly assigned to a reasoning exercise in which they were asked to select the most important among 6 reasons for their selection for each choice set (eg, maximize health benefits for affected infants versus minimize harms to others; Supplemental Information 3).30 We examined error variance between groups to assess the effect of being exposed to the reasoning exercise.
Model Estimations and Data Analysis
The choice tasks were constructed by using SAS version 9.2 (SAS Institute, Cary, NC).36 Estimates from the pilot study informed the prior parameter values for the D-efficient experimental design. The design procedures reduced the total number of possible choices to 48 choice sets, which were grouped into 6 blocks of 8 choice tasks; each respondent was randomly assigned to 1 block.
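The blocking step described above (48 choice sets split into 6 groups of 8, with random assignment of respondents) can be sketched as follows. This is a simplified illustration: it does not reproduce the D-efficient design itself, and the function names, the shuffling rule, and the seeds are assumptions made for the example.

```python
import random


def make_blocks(n_choice_sets=48, n_blocks=6, seed=0):
    """Split choice-set ids 1..n_choice_sets into equally sized blocks."""
    rng = random.Random(seed)
    ids = list(range(1, n_choice_sets + 1))
    rng.shuffle(ids)  # assumption: simple random allocation of sets to blocks
    size = n_choice_sets // n_blocks
    return [sorted(ids[i * size:(i + 1) * size]) for i in range(n_blocks)]


def assign_block(n_blocks, rng):
    """Randomly pick the block (0-based index) that one respondent will answer."""
    return rng.randrange(n_blocks)


blocks = make_blocks()
respondent_rng = random.Random(2013)
assignments = [assign_block(len(blocks), respondent_rng) for _ in range(1200)]
print(len(blocks), [len(b) for b in blocks])
```

Each respondent then answers only the 8 tasks in the assigned block, which keeps the burden manageable while all 48 choice sets are still covered across the sample.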
The discrete response data were analyzed in Stata 12 (StataCorp, College Station, TX) by using an error components generalized multinomial logit model.37–39 The generalized multinomial logit model can account for the fact that each person completed 8 choices (ie, choices are not independent) and allows for heterogeneity of scale (ie, implying choice behavior is more variable for some than others) and heterogeneity in respondent preferences.38 The latter is important where outcomes may be positively valued by some and disvalued by others because of differences in preferences or misunderstanding. Results are presented as mean part-worth utilities, which estimate preferences for each level within an attribute. Absolute differences in parameter estimates between levels indicate the magnitude of preference for moving from one level to another, for example, the transition from moderate to significant health improvement.
Each attribute, statistically significant 2-way interactions between attributes, and the “neither testing” alternative (ie, that neither disease be screened for) were included in the model. Tests for interaction effects between participant characteristics and attribute levels showed no statistically significant influence on preferences; participant characteristics were not included in the final model. The neither-test alternative was coded such that if individuals had an average preference for NBS, the parameter for the alternative representing neither testing would be negative.23,40 Effects coding was used for categorical attributes. The attributes representing early diagnosis, the provision of reproductive risk information, rate of overdiagnosis, and rate of FPs were assigned normal heterogeneity distributions, which allows respondents to have positive or negative utility value. The attribute representing the clinical outcomes of early treatment and 2-way interaction effects were specified as fixed.41 We tested each continuous variable for nonlinearity by assessing the square of the variable, retaining those that were statistically significant in the final model.
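Two of the modeling choices described above lend themselves to a short sketch: effects coding of a 3-level categorical attribute, and the share of respondents with a negative preference implied by a normally distributed coefficient. The coefficient values are hypothetical, and the choice of "none" as the omitted level is an assumption made for the example.

```python
import math

# Effects coding for the 3-level clinical benefit attribute.
# Assumption: "none" is the omitted level, coded -1 on both columns.
EFFECTS_CODES = {
    "none": (-1, -1),
    "moderate": (1, 0),
    "significant": (0, 1),
}


def part_worths(beta_moderate, beta_significant):
    """Level utilities under effects coding; they sum to zero by construction."""
    betas = (beta_moderate, beta_significant)
    return {
        level: sum(code * beta for code, beta in zip(codes, betas))
        for level, codes in EFFECTS_CODES.items()
    }


def share_disvaluing(mean, sd):
    """P(beta < 0) when beta ~ Normal(mean, sd), i.e. Phi(-mean / sd)."""
    return 0.5 * (1.0 + math.erf(-mean / (sd * math.sqrt(2.0))))


pw = part_worths(0.1, 0.8)  # hypothetical coefficients
print(pw)
print(round(share_disvaluing(1.405, 1.0), 3))
```

Under this normal specification, a positive mean that is about 1.4 times the standard deviation corresponds to roughly 8% of respondents holding a negative preference.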
The benefit-harm trade-offs between the attributes of clinical benefit of early treatment and rates of overdiagnosis and FPs were estimated by using a compensating variation formula.42 The SEs surrounding the benefit-harm metrics were estimated by using the delta method.43 The relative importance of each attribute was calculated such that the importance values of the attributes add to 100%.44
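The two summary measures can be illustrated with a simplified sketch. Relative importance is computed here as each attribute's utility range divided by the sum of ranges, and the trade-off as a linear ratio of coefficients with a first-order delta-method standard error. This linear ratio is a stand-in chosen for clarity, not the compensating variation formula used in the analysis, and all numbers are hypothetical.

```python
import math


def relative_importance(part_worths):
    """Percent importance per attribute: utility range / sum of ranges * 100."""
    ranges = {attr: max(vals) - min(vals) for attr, vals in part_worths.items()}
    total = sum(ranges.values())
    return {attr: 100.0 * r / total for attr, r in ranges.items()}


def tradeoff_with_se(b_benefit, b_harm, var_benefit, var_harm, cov=0.0):
    """Harm units accepted per benefit step, r = -b_benefit / b_harm,
    with a delta-method standard error from the coefficient (co)variances."""
    r = -b_benefit / b_harm
    g_benefit = -1.0 / b_harm            # dr / d b_benefit
    g_harm = b_benefit / b_harm ** 2     # dr / d b_harm
    var_r = (g_benefit ** 2 * var_benefit
             + g_harm ** 2 * var_harm
             + 2.0 * g_benefit * g_harm * cov)
    return r, math.sqrt(var_r)


# Hypothetical part-worths per attribute (illustrative only).
HYPOTHETICAL = {
    "clinical_benefit": [-0.9, 0.1, 0.8],
    "early_diagnosis": [-0.3, 0.3],
    "overdiagnosis": [-0.2, 0.2],
    "false_positives": [-0.2, 0.15],
    "reproductive_info": [-0.1, 0.1],
}
importance = relative_importance(HYPOTHETICAL)
ratio, se = tradeoff_with_se(1.0, -0.05, 0.01, 0.0001)
print({k: round(v, 1) for k, v in importance.items()})
print(round(ratio, 2), round(se, 2))
```

Because the importance values are shares of the total utility range, they sum to 100% regardless of the scale of the underlying coefficients.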
The survey participation rate (ie, proportion of visitors to the invitation page who started the survey) was 94% (n = 2345).45 The survey was long and complex; thus, to minimize disengaged completion, respondents were rewarded by section and asked for permission to continue. Of those who started, 907 dropped out before completing the second section and 225 were excluded for quality reasons (eg, less than minimum completion times per section) for a 52% completion rate for section 2 (n = 1213). Most respondents (79.8%) completed the English-language survey.
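The completion figures reported above can be checked with simple arithmetic; the variable names are chosen for the example.

```python
started = 2345                   # respondents who started the survey
dropped_before_section_2 = 907   # left before completing section 2
excluded_for_quality = 225       # removed for quality reasons

completed_section_2 = started - dropped_before_section_2 - excluded_for_quality
completion_rate = completed_section_2 / started

print(completed_section_2, f"{completion_rate:.1%}")  # 1213 51.7%
```

The rate of 51.7% rounds to the 52% reported.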
Our sample was reflective of the Canadian population by age, gender, and region but was better educated and had a more narrowly distributed income than Canadian averages (P < .001). Understanding of screening concepts was high (mean score = 18.86/21, SD = 2.24) (Table 2).
Respondents who completed section 2 (n = 1213) were more likely to be female (P < .001), to be older (P < .01), and to score better in understanding (P < .01) than those who stopped after section 1 (n = 669). There was no difference in whether they had children or a family history of genetic disease.
Respondents randomly assigned to the reasoning exercise showed less error variance than those who were not exposed (Table 3). However, this difference was not statistically significant; thus, results are reported for the full sample.
On average, respondents positively valued NBS; opting out was chosen in only 2.8% of the scenarios. Among attributes for which we assessed preference heterogeneity, preferences across respondents were consistent for 3: all respondents positively valued reproductive risk information (100%), and all respondents disvalued FP results (100%) and overdiagnosis (100%). Preferences for earlier diagnosis were heterogeneous: although the average utility estimate was positive, 8% of respondents disvalued earlier diagnosis (92% positively valued this outcome). As seen in Table 3, every attribute had an influence on choice that differed significantly from zero. Among clinical outcomes, respondents showed a greater preference for the transition from no impact of screening on health to significant health improvement than for the transition from no impact to moderate health improvement (see Fig 2 for mean part-worth utilities depicted as a function of attributes).
Several of the assessed 2-way interactions between attributes were statistically significant. Specifically, respondents were willing to accept a higher FP rate where affected infants gained significant improvement in health; however, respondents preferred a lower FP rate where affected infants gained only moderate improvement in health. Similarly, respondents valued early diagnosis more highly where affected infants gained significant health improvement but less highly where health improvement was moderate. Taken together with the coefficient for the FP rate, the statistically significant positive coefficient for the square of the FP rate shows that although respondents strongly disvalued FP results for the first few infants, their dislike was moderated as the number of FP results increased.
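The effect of the squared FP term can be illustrated with a quadratic utility function. The two coefficients below are hypothetical and were chosen only to show the pattern: a negative linear term with a positive squared term yields a marginal disutility that shrinks as the number of FP results grows.

```python
B_FP = -0.05       # hypothetical linear coefficient (negative: FPs are disvalued)
B_FP_SQ = 0.0005   # hypothetical squared-term coefficient (positive)


def fp_utility(fp):
    """Utility contribution of fp false-positive results per affected infant."""
    return B_FP * fp + B_FP_SQ * fp ** 2


def fp_marginal_utility(fp):
    """Derivative of fp_utility: the utility change from one additional FP."""
    return B_FP + 2.0 * B_FP_SQ * fp


for fp in (1, 10, 20, 40):
    print(fp, round(fp_utility(fp), 3), round(fp_marginal_utility(fp), 3))
```

With these placeholder values, each additional FP result still lowers utility across the 1 to 40 range used in the survey, but by a progressively smaller amount.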
Relative Importance of Attributes
Importance scores represent the relative weight each attribute had on respondents’ choices. The clinical benefit of early diagnosis was the most important attribute, followed by early diagnosis itself, which was followed by the 2 harm attributes (overdiagnosis, FPs). Reproductive risk information was the least important attribute (Fig 3).
We estimated respondents’ willingness to trade the positively valued clinical benefits of NBS for affected infants against the negatively valued harms (FPs, overdiagnosis) for other infants and their families. As Table 4 shows, respondents were willing to make these trades, with the expected increase in tolerance for harms in exchange for greater clinical benefits.
Through a DCE with members of the public in Canada, we offer insight into how the multiple outcomes of NBS are valued, both independently and relative to each other. A first conclusion is that although the public highly valued NBS, with few opting out of the opportunity to screen, 100% of our respondents preferred to avoid the harms of screening. Furthermore, the willingness to tolerate burdens to unaffected infants and their families depended on the extent of clinical benefits for affected infants. Respondents were willing to trade higher numbers of infants exposed to the harms of FPs or overdiagnosis to achieve significant clinical benefits compared with moderate clinical benefits. In addition, the interaction between these attributes showed that the degree of tolerance for the incidence of harms was influenced by the degree of benefit. That is, whereas respondents were willing to accept a higher FP rate to achieve significant clinical benefits, they required a lower FP rate for moderate clinical benefits. A further conclusion concerns the relative and mixed preferences for some of the informational outcomes of NBS. Specifically, we show that one of the outcomes of NBS that is typically discussed among advocates of expansion as a benefit (ie, early knowledge of disease) is interpreted by a minority of respondents as a harm (ie, it is disvalued). Furthermore, respondents’ valuation of early diagnosis was clearly linked to the clinical benefits that it could support, as revealed by the statistically significant interaction between these attributes. Earlier diagnosis was valued more when combined with significant health improvements and less when combined with moderate health improvement, further calling into question the strength of preference for this outcome. Finally, reproductive risk information was the least important attribute, suggesting that this informational benefit in isolation might not be sufficient to warrant screening.
Our rigorous approach to engaging respondents, with a detailed training module that incorporated clear textual and visual depictions of screening concepts, educational quizzes, and feedback with corrected answers, reduces many sources of survey bias and gives us confidence in the quality of these data.24 However, several limitations must be acknowledged. A primary limitation relates to the ratios of FP and overdiagnosis cases to affected cases. We drew these ratios from the limited data available and consultation with experts.6,46–50 However, because of the risk of confusion in representing 2 ratios, we elected to retain the same denominator (ie, number of affected cases) for both; thus, the upper limit in our ratio of overdiagnosis to affected cases (2:1) may be an overestimate of the worst-case scenario. In addition, the introduction of ranges for FP and overdiagnosis results is likely to have had a strong framing effect; the statistical significance of the FP-squared variable in our model suggests that although respondents had a strongly negative reaction to small numbers of FP results, they became inured to this harm as the numbers increased, more easily accepting still more FP cases where larger numbers of FP cases were presented. Because of these several limitations, we do not place great interpretive weight on the trade-off values identified (ie, number of overdiagnosis cases traded off to achieve cases with clinical benefit), although we remain confident in the main findings of our study regarding the relative valuation of attributes. Second, we elected not to include false-negative results as an attribute because of the rarity of this outcome and the complexity of introducing an additional risk-based attribute. However, we explicitly noted that this rare event was to be held constant across the choices, so although we lack data on preferences for this outcome, we are confident that our data remain robust in its absence. 
Furthermore, although the French version of the survey was not pretested, post hoc analyses suggest no language effect on preferences. Finally, the study was conducted with an Internet panel of Canadian residents, who were significantly different from Canadian averages on some measured demographic characteristics, and who may also have differed on unmeasured characteristics, such as ethnicity, thus limiting the generalizability of these findings.
Limitations notwithstanding, these findings extend a limited literature on how the public appreciates and balances the benefits and harms of population screening51–53 and offer important insights into public values for the types of diseases that should be screened for in newborns. Our study aligns with existing literature in showing strong support for NBS1,13,14,54 but expands significantly on what is known about stakeholder expectations. Much of the existing empirical research has failed to attend to any harms,13–16,55 whereas studies that have explicitly considered harms (although not overdiagnosis) showed some acknowledgment by parents or members of the public but do not illuminate how harms and benefits should be traded off.1,9,54,56 These findings also add to a broader literature on attitudes toward population screening by exploring several complex harms.57–66 Specifically, that 100% of our respondents showed a statistically significant preference to avoid harms (FPs, overdiagnosis) is important, because misunderstanding of risk-based harms is common,67 and recent work exploring attitudes toward overdiagnosis in the context of breast cancer screening has shown considerable confusion as well as limited valuation.68–71 Furthermore, the identification of both positive and negative preferences for early diagnosis reinforces qualitative research that suggests the complexity of beliefs about early knowledge of disease in an infant, including concern about the risk of unwanted knowledge and negative consequences for the parent-child bond.1,9,54 The identification of a negative preference is an important corrective to the literature that identifies early knowledge as valuable in itself, by permitting family adjustment and planning, and averting difficult “diagnostic odysseys.”18,19,22 Finally, our finding that reproductive risk information was the least important attribute should factor into deliberations about the pursuit of reproductive benefit through NBS.18–21
These findings provide novel and important insight into what “wider and clear-eyed discussion[s] with the public about the magnitude of benefits and harms of screening”11 can generate. Our study shows that members of the public value clinical benefits for affected infants and are willing to accept harms to unaffected infants and their families in proportion to the clinical benefits that can be realized. Significantly, members of the public preferred to avoid the harms of screening, with tolerance for harms reduced where clinical benefits were more moderate. Furthermore, a small but meaningful minority of respondents preferred to avoid early knowledge, and support for this outcome remained linked to the benefits that early diagnosis might yield through clinical treatment. These findings suggest support for NBS policies prioritizing clinical benefits over solely informational benefits, coupled with concerted efforts to avoid or minimize harms.
We thank the study participants for the time and insights they provided for our study. We also thank Mr Dan Farrow of squarebracket.net for his design work.
- Accepted May 6, 2015.
- Address correspondence to Fiona A. Miller, PhD, Institute of Health Policy, Management, and Evaluation, University of Toronto, 155 College St, 4th floor, Toronto, ON, M5T 3M6, Canada.
Dr Miller led the study and led the development of the questionnaire items that are the focus of this manuscript and the drafts and revisions of the manuscript; she also helped develop the data analysis plan; Dr Hayeems, Dr Bombard, Ms Cressman, and Ms Barg were involved in study design and oversight, helped develop the questionnaire items that are the focus of this manuscript, reviewed initial data memos, and offered suggested revisions to versions of the manuscript; Dr Carroll, Dr Wilson, Dr Little, Dr Allanson, Dr Chakraborty, and Dr Giguère were involved in study design and oversight, reviewed initial data memos, and offered suggested revisions to versions of the manuscript; Dr Regier developed the experimental design and the data analysis plan, conducted the data analysis, reviewed initial data memos, and offered suggested revisions to versions of the manuscript; and all authors approved the final manuscript as submitted.
FINANCIAL DISCLOSURE: The authors have indicated they have no financial relationships relevant to this article to disclose.
FUNDING: Supported by an investigator-initiated research grant from the Canadian Institutes of Health Research (CIHR); the funder had no role in study design; the collection, analysis, and interpretation of data; the writing of the report; or in the decision to submit the article for publication. Dr Bombard was supported by a fellowship from the CIHR and is currently a CIHR New Investigator. Dr Little holds a Canada Research Chair in Human Genome Epidemiology. Dr Giguère is a senior research scholar with the Fonds de Recherche du Québec – Santé.
POTENTIAL CONFLICT OF INTEREST: Drs Miller, Hayeems, Carroll, Little, and Allanson have acted as advisors to newborn screening programs; Dr Chakraborty runs Newborn Screening Ontario; Dr Giguère runs Quebec’s newborn blood spot screening program; the other authors have indicated they have no potential conflicts of interest to disclose.
- Baily MA, Murray TH. Ethics and Newborn Genetic Screening: New Technologies, New Challenges. Baltimore, MD: Johns Hopkins University Press; 2009
- Pollitt RJ, Green A, McCabe C, et al. Neonatal screening for inborn errors of metabolism: cost, yield and outcome. Health Technol Assess. 1997;1(7):i–iv, 1–202
- Grosse SD, Boyle CA, Kenneson A, Khoury MJ, Wilfond BS
- Kerruish NJ
- Harris R
- Parsons EP, Clarke AJ, Hood K, Lycett E, Bradley DM
- Lipstein EA, Nabi E, Perrin JM, Luff D, Browning MF, Kuhlthau KA
- Koopmans J, Ross LF
- Alexander D, van Dyck PC
- Ryan M, Gerard K, Amaya-Amaya M
- Statistics Canada. Census of Population. 2010. Available at: http://www12.statcan.gc.ca/census-recensement/2011/rt-td/index-eng.cfm#tab5. Accessed March 21, 2013
- Orme B. Sample Size Issues for Conjoint Analysis Studies. Sawtooth Software Research Paper Series. Sequim, WA: Sawtooth Software; 1998
- McFadden D. Conditional logit analysis of qualitative choice behavior. In: Zarembka P, ed. Frontiers of Econometrics. New York: Academic Press; 1973
- Bombard Y, Miller FA, Hayeems RZ, et al
- Johri M, Damschroder LJ, Zikmund-Fisher BJ, Kim SY, Ubel PA
- Kim M, Blendon RJ, Benson JM
- Gu Y, Hole AR, Knox S
- Train KE
- Rosen HS, Small KA. Applied Welfare Economics With Discrete Choice Models. Cambridge, MA: National Bureau of Economic Research; 1981
- Louviere J, Hensher D, Swait J
- Coulm B, Coste J, Tardy V, et al; Dépistage Hyperplasie Congénitale des Surrénales France Study Group
- Schulze A, Lindner M, Kohlmüller D, Olgemöller K, Mayatepek E, Hoffmann GF
- Gigerenzer G, Mata J, Frank R
- Whitehead NS, Brown DS, Layton CM. Developing a Conjoint Analysis Survey of Parental Attitudes Regarding Voluntary Newborn Screening. Research Triangle Park, NC: RTI International; 2010
- Ghanouni A, Halligan S, Taylor SA, et al
- Waller J, Douglas E, Whitaker KL, Wardle J
- Hersch J, Jansen J, Barratt A, et al. Women’s views on overdiagnosis in breast cancer screening: a qualitative study. BMJ. 2013;346
- Hersch J, Jansen J, Barratt A, et al
- Copyright © 2015 by the American Academy of Pediatrics