BACKGROUND. Inborn errors of metabolism are a significant cause of morbidity and death among children. Inconsistencies in how individual states arrive at screening strategies, however, lead to marked variations in testing between states.
OBJECTIVE. To determine the cost-effectiveness of each component test of a multitest newborn screening program, including screening for phenylketonuria, congenital adrenal hyperplasia, congenital hypothyroidism, biotinidase deficiency, maple syrup urine disease, galactosemia, homocystinuria, and medium-chain acyl-CoA dehydrogenase deficiency.
METHODS. A decision model was used, with cohort studies, government reports, secondary analyses, and other sources. Discounted costs, quality-adjusted life-years (QALYs), and incremental cost-effectiveness ratios were measured.
RESULTS. All except 2 screening tests dominated the “no-test” strategy. The 2 exceptions were screening for congenital adrenal hyperplasia, which cost slightly more than $20000 per QALY gained, and screening for galactosemia, which cost $94000 per QALY gained. The screening test with the lowest expected cost was tandem mass spectrometry. The results found in our base-case analysis were stable across variations in nearly all variables. In instances in which changes in risks, sequelae, costs, or utilities did affect our results, the variation from base-case estimates was quite large.
CONCLUSIONS. Newborn screening seems to be one of the rare health care interventions that is beneficial to patients and, in many cases, cost saving. Over the long term, funding comprehensive newborn screening programs is likely to save money for society.
Inborn errors of metabolism are a significant cause of morbidity and death among children.1–4 Technology has advanced remarkably since screening began for phenylketonuria (PKU) >40 years ago.5 Every year ∼4 million dried blood spots are analyzed for metabolic and hematologic disorders and endocrinopathies.6 These diseases account for ∼3000 cases of potentially disabling or fatal outcomes each year.6 Although technology now exists to screen for a host of diseases, the diseases for which we screen and the manner in which we screen vary significantly according to state. These decisions should be based on disease prevalence; the costs of testing, treatment, and complications; and the effectiveness of timely intervention. Inconsistencies regarding how states arrive at these decisions, however, lead to marked variations in testing among states.
Tandem mass spectrometry (MS/MS) offers the ability to increase drastically the number of diseases for which we screen.6–8 More than 20 disorders of fatty acid oxidation and organic acidemia are already screened for routinely with MS/MS methods.9 The number of diseases screened for and the number of states using the technology are increasing rapidly.10
The significant start-up costs of MS/MS screening, however, have prevented some states from adopting it as part of their standard screening process.11 Many have pointed out that most of the diseases for which MS/MS methods screen have a relatively low prevalence.9,10 MS/MS screening can lead to false-positive or abnormal results for well children, yielding increased costs and unnecessary parental angst.12,13 In addition, some diseases are so severe that, even with an ideal screening program, many children would not be treated in time to prevent morbidity and death.
Other data argue for the merits of broader screening even in light of these issues. Although costs for wider screening can be significant, the costs saved through preventing hospitalizations for diagnosis or long-term disability are also significant.11 Studies show that even diseases without a consistently effective treatment can benefit from early diagnosis.7 Evidence indicates that, overall, screening reduces stress while improving outcomes.12
A more objective method for determining the usefulness of any new technology is cost-effectiveness analysis. A number of these analyses were conducted previously for other aspects of newborn screening14,15 or the use of MS/MS screening for a small number of disease processes.1,9,10 Currently, programs do not screen for biotinidase deficiency and galactosemia with MS/MS methods, but there is clear evidence that galactosemia could be detected with this technology.16
To date, no cost-effectiveness study has examined the prospect of incorporating broad MS/MS testing into a comprehensive newborn screening program. The objective of this study was to determine the cost-effectiveness of a comprehensive newborn screening program, including screening for PKU, congenital adrenal hyperplasia (CAH), congenital hypothyroidism (CH), biotinidase deficiency, maple syrup urine disease (MSUD), galactosemia, homocystinuria, and medium-chain acyl-CoA dehydrogenase (MCAD) deficiency, alone or in combination with diseases detectable with MS/MS methods.
We developed a decision model with decision analysis software (DATA 4.0; TreeAge Software, Williamstown, MA) to compare various strategies for screening for newborn diseases. We adopted the recommendations of the Panel on Cost-Effectiveness in Health and Medicine for conducting and reporting a base-case analysis.17 We adopted a societal perspective, which permits comparisons across different interventions, and expressed results in terms of discounted costs, quality-adjusted life-years (QALYs), and incremental cost-effectiveness ratios.
Decision Model Structure and Assumptions
To frame the model for analysis, we made a number of simplifying assumptions. We assumed that any child could have, at most, 1 of the conditions for which the tests screen. We used a multiplicative utility model, that is, for outcomes with >1 complication, we multiplied the quality adjustments (utilities) for all relevant complications together. We assumed the existence of a comprehensive screening program that ensures timely testing, result notification, and appropriate treatment of all positive screens. Finally, we assumed that MS/MS methods would be used to screen for multiple conditions.
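The multiplicative utility assumption can be illustrated with a minimal Python sketch. The function name and the utility values below are hypothetical and chosen only for illustration; they are not taken from Table 1.

```python
from math import prod

def combined_utility(utilities):
    """Combine per-complication utilities (each between 0 and 1) by multiplication."""
    for u in utilities:
        if not 0.0 <= u <= 1.0:
            raise ValueError(f"utility out of range: {u}")
    # An outcome with no complications keeps a utility of 1.0 (empty product).
    return float(prod(utilities))

# Hypothetical values: one complication weighted 0.8 and another weighted 0.5.
print(combined_utility([0.8, 0.5]))
```

Under this assumption, each additional complication can only lower the combined quality adjustment or leave it unchanged.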
The Decision Model
Figure 1 shows the decision model structure. The decision tree consisted of a decision node whose branches represent 8 screening tests, that is, PKU, CAH, CH, biotinidase deficiency, MSUD, galactosemia, homocystinuria, and MS/MS tests. We assumed that MS/MS would be used to detect PKU, biotinidase deficiency, MSUD, galactosemia, and homocystinuria, as well as MCAD deficiency. In a second analysis, we compared MS/MS directly with a screening panel that included PKU, biotinidase deficiency, MSUD, galactosemia, and homocystinuria detection with conventional methods.
Following each test branch is a chance node whose branches represent the 8 conditions that the child may have, ie, PKU, CAH, CH, biotinidase deficiency, MSUD, galactosemia, homocystinuria, or MCAD deficiency, or “no disease.” Each disease branch is followed by a subtree that represents potential sequelae of the disease. In Fig 1, final outcomes are shown only for homocystinuria. PKU was associated with a risk of mild, moderate, or severe developmental delay and/or seizure disorder. CAH could lead to early childhood death (average life expectancy: 1 year). CH could cause mild, moderate, or severe delay. Biotinidase deficiency could lead to any combination of seizure disorder, severe developmental delay, hearing loss, or blindness. MSUD could lead to any combination of severe or mild cerebral palsy (CP) or mild, moderate, or severe developmental delay. Galactosemia could lead to death in infancy or moderate or severe delay. Homocystinuria had a risk of mild, moderate, or severe delay and/or vision loss. MCAD deficiency could lead to early death (modeled as death at 1 year, on average) or CP and/or mild-to-severe developmental delay. Children with no disease face a risk of a false-positive newborn screening result and its attendant expenses.
The probability of each of the outcomes with screening, P(outcome|screen), was represented by the formula P(outcome|screen) = P(outcome) × [1 − (efficacy × sensitivity)], where P(outcome) is the probability of the outcome (eg, CP or delay), without screening or treatment, derived from the literature (see below), efficacy is the efficacy of early detection and treatment in preventing the outcome, and sensitivity is the sensitivity of the screening test for the disease.
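The formula above can be written as a short Python function. This is only a sketch: the function name and the input values are hypothetical, and the range checks are an added safeguard rather than part of the published model.

```python
def p_outcome_with_screening(p_outcome, efficacy, sensitivity):
    """P(outcome|screen) = P(outcome) * [1 - (efficacy * sensitivity)]."""
    for name, value in (("p_outcome", p_outcome),
                        ("efficacy", efficacy),
                        ("sensitivity", sensitivity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return p_outcome * (1.0 - efficacy * sensitivity)

# Hypothetical inputs: 50% untreated risk, 80% treatment efficacy, perfect sensitivity.
print(p_outcome_with_screening(0.5, 0.8, 1.0))
```

With a sensitivity of 0, no cases are detected and the function returns the untreated probability unchanged, which matches the intent of the formula.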
Data and Assumptions
Probabilities of sequelae, quality-adjusted survival rates, and costs for treatment were derived from various clinical and administrative data sources. We searched Medline (from January 1980 to November 2002) for studies with the terms “newborn screening” and any of the disease entities we were investigating. We also searched online databases for the same terms. We supplemented this strategy by hand-searching citation lists from articles identified in the electronic search and recent reviews and cost-effectiveness analyses of other newborn screening strategies.
The prevalence of each disease was estimated, when possible, from the National Newborn Screening Report.18 Supplemental information was taken from the American Academy of Pediatrics newborn screening fact sheets and secondary analyses.8,19 We determined the sensitivities and specificities of the various screening tests through national reports and other secondary analyses.10,13,18,20 Probabilities of various sequelae and the effectiveness of early intervention in preventing these sequelae were also estimated from the newborn screening fact sheets, secondary data analyses, and cohort studies of patients with the diseases in question.19,21–28 Base-case probabilities for major complications are summarized in Table 1.
Estimation of Life Expectancy
We estimated the average life expectancy of normal patients from the National Vital Statistics Reports published by the Centers for Disease Control and Prevention.29 We estimated the effect of disability on life expectancy through a cohort study on the subject.30
We adjusted life expectancy for quality of life by using health state utilities.17 We calculated QALYs by multiplying remaining years of life by the utility value for each health state. Utilities for most disabilities were drawn from a study assessing parents' utilities by using a standard gamble, adjusting for different severities when necessary.31 The study assessed utilities for outcomes of occult bacteremia and meningitis, which resemble closely outcomes of the diseases for which newborns are screened. Specifically, parent utilities were assessed for deafness and various degrees of developmental delay and CP. Respondents in the study were parents of children between 2 and 24 months of age who visited an urban emergency department. These were the most relevant utilities available for these outcomes. Utilities for blindness were taken from a study that developed an equation for estimating utility on the basis of visual acuity.32 We did not adjust utility if a disease state did not result in a disability that was thought to affect quality of life. Base-case estimates for utilities are summarized in Table 1.
The costs for the specific screening tests were derived from 2 sources, ie, a PriceWaterhouseCoopers analysis of newborn screening costs and a prior cost-effectiveness study of MS/MS screening.9,33 These costs incorporated amortized start-up costs and operating costs. The costs of treating selected diseases over the course of a lifetime were estimated with data from the Office of Technology Assessment analysis of strategies for newborn screening, data from a Washington State report on the cost-effectiveness of some screening strategies, and, in 1 case, expert opinion.21,34 Costs of treating sequelae of different diseases were estimated from sources including a Morbidity and Mortality Weekly Report on economic costs associated with mental retardation, CP, hearing loss, and visual impairment; an epidemiologic study of end-of-life care costs; and prior secondary analyses.14,21,35,36 All costs were adjusted to 2004 US dollars with an inflation calculator on the Web site of the Department of Labor, Bureau of Labor Statistics.37 For this study, costs were a complex mixture of medical and nonmedical costs, and we elected to adjust costs with the General Consumer Price Index, rather than the Medical Care Component Consumer Price Index. Base-case estimates for costs are summarized in Table 1.
To reflect the general preference for receiving health benefits and material goods sooner rather than later, we discounted all costs and health effects at an annual rate of 3%.
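A minimal Python sketch of discounting at 3% per year follows. The text does not state the timing convention, so the sketch assumes that the value is constant across years and that the first year is undiscounted; the function name and inputs are hypothetical.

```python
def discounted_total(annual_value, years, rate=0.03):
    """Sum a constant annual value (cost or utility) over `years`, discounted at `rate`.

    Assumption: year 0 is undiscounted and year t is divided by (1 + rate) ** t.
    """
    if years < 0:
        raise ValueError("years must be non-negative")
    return sum(annual_value / (1.0 + rate) ** t for t in range(years))

# Hypothetical example: discounted QALYs for 70 years lived at a utility of 0.9.
print(round(discounted_total(0.9, 70), 2))
```

The same function applies to a stream of annual costs, because costs and health effects are discounted at the same rate.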
Calculation of Incremental Cost-Effectiveness Ratios
For each treatment strategy, we calculated the expected total costs by multiplying the probability of each unique outcome by its associated costs and then summing these values over all possible outcomes. Total QALYs for each treatment were calculated in a similar manner. We then calculated the incremental cost-effectiveness of each screening strategy by dividing the difference in costs by the difference in QALYs.
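The calculation described above can be sketched in Python as follows. The function names, the string labels, and all numbers are hypothetical; the handling of dominance reflects the usual convention in cost-effectiveness analysis and is an assumption about how ties and negative differences would be reported.

```python
def expected_value(outcomes):
    """Probability-weighted sum over (probability, value) pairs for one strategy."""
    total_p = sum(p for p, _ in outcomes)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError(f"probabilities sum to {total_p}, expected 1")
    return sum(p * v for p, v in outcomes)

def compare(cost_new, qaly_new, cost_ref, qaly_ref):
    """Return 'Dominates', 'Dominated', 'Equivalent', or the incremental cost per QALY."""
    d_cost = cost_new - cost_ref
    d_qaly = qaly_new - qaly_ref
    if d_cost == 0 and d_qaly == 0:
        return "Equivalent"
    if d_cost <= 0 and d_qaly >= 0:
        return "Dominates"   # no more costly and no less effective
    if d_cost >= 0 and d_qaly <= 0:
        return "Dominated"   # no less costly and no more effective
    return d_cost / d_qaly   # incremental cost-effectiveness ratio

# Hypothetical strategy costs and QALYs, compared with a reference strategy.
print(expected_value([(0.25, 100.0), (0.75, 0.0)]))
print(compare(90.0, 10.5, 100.0, 10.0))
print(compare(30.0, 12.0, 10.0, 11.0))
```

A ratio is reported only when a strategy costs more and is more effective (or costs less and is less effective) than the reference; otherwise one strategy dominates the other.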
We used 1-way sensitivity analysis to identify important model uncertainties. All probabilities were tested across the full range from 0 to 1. Costs were analyzed from $0 to $1 million. To test the impact of changing multiple assumptions, we developed a “pessimistic” case analysis that biased the model against newborn screening. In the pessimistic case, we eliminated indirect costs from the estimated costs of developmental delay and CP.35 We eliminated the costs of developmental delay from any outcomes that included CP. We increased the cost of MS/MS to $20, the high end of published estimates. We increased the cost of evaluating false-positive screening test results to $1000. We used low-end estimates for the prevalence of MCAD deficiency (1 case per 18000 infants) and PKU (1 case per 20000 infants). We also used a low-end estimate of the risk of death resulting from CAH (3%).
The results of the base-case analysis are shown in Table 2. The first column of Table 2 shows the testing strategies in ascending order of average cost. The second column shows the average cost of each strategy. This includes the cost of the test and the average costs (per person screened) associated with all of the treatments and outcomes for the diseases represented in the model. The third column shows the difference in average costs between the testing strategy and not testing. A negative number means the strategy saves money, on average, over not screening. The fourth column shows the average effectiveness of each strategy in QALYs. The fifth column shows the difference in effectiveness between each strategy and not testing. A positive number means there is a gain in QALYs. The sixth column is the average cost-effectiveness ratio for each strategy, and the seventh column is the incremental cost-effectiveness ratio, the incremental cost per QALY gained. Tests marked as “Dominates” save money and improve outcomes relative to not testing.
All except 2 screening tests dominated the no-testing strategy. That is, these screening tests improved outcomes and reduced costs relative to not screening. The 2 exceptions are screening for CAH, which costs slightly more than $20000 per QALY gained, and screening for galactosemia, which costs approximately $94000 per QALY. The screening test with the lowest expected cost is MS/MS.
In addition to comparing each test with a no-testing strategy, we compared a panel of conventional tests for PKU, biotinidase deficiency, MSUD, galactosemia, and homocystinuria with MS/MS methods to screen for the same conditions and MCAD deficiency. The cost and effectiveness comparison is shown in Table 3. When used in this multiplex manner, MS/MS was less expensive and more effective than the comparable panel of conventional screening tests.
Because most screening tests were dominant over not screening, we conducted sensitivity analysis on all of the variables in the model, to determine at what levels the tests were no longer cost saving. The fourth column of Table 1 shows the threshold value for each variable, at which the corresponding screening strategy becomes more costly than not screening.
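A threshold search of this kind can be sketched in Python. The cost function below is a deliberately simplified, hypothetical stand-in for the decision model (test cost minus expected savings from detected cases), and the bisection routine and all numbers are illustrative assumptions.

```python
def incremental_cost(test_cost, prevalence, sensitivity, savings_per_case):
    """Toy net cost per child screened; negative values mean screening saves money."""
    return test_cost - prevalence * sensitivity * savings_per_case

def threshold(f, lo, hi, tol=1e-9):
    """Bisection for the input value at which f changes sign; None if it never does."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0:
        return lo
    if f_hi == 0:
        return hi
    if (f_lo > 0) == (f_hi > 0):
        return None  # the strategy is cost saving (or not) across the whole range
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        f_mid = f(mid)
        if f_mid == 0:
            return mid
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical inputs: $2 test, prevalence 1 in 10000, $100000 saved per detected case.
def net_cost(sensitivity):
    return incremental_cost(2.0, 1e-4, sensitivity, 100_000.0)

print(threshold(net_cost, 0.0, 1.0))
```

In this toy example, screening stops being cost saving once sensitivity falls below the returned value; the same search could be run over any single variable while the others are held at base-case values.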
In the base-case analysis, we assumed that the sensitivity of all of the screening tests was 100%. When we varied test sensitivity from 0% to 100%, the dominant strategies remained dominant to sensitivities well below those in the literature, as follows: PKU, 10%; CH, 67%; biotinidase deficiency, 13%; MSUD, 83%; homocystinuria, 37%; MS/MS, 27% (Table 4).
Because of the costs of evaluating false-positive results, the dominant testing strategies became more costly as specificity decreased. When the specificity of a screening test was lowered, a larger proportion of unaffected children had false-positive test results, which resulted in increased costs to rule out disease. When the specificity of a test was below a threshold value, some tests were no longer cost saving (Table 1); however, threshold values were below those reported in the literature.
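The relationship between specificity and false-positive costs can be sketched in Python. The function name and the input values are hypothetical, and the sketch assumes that every false-positive result incurs the same evaluation cost.

```python
def expected_false_positive_cost(prevalence, specificity, workup_cost):
    """Expected cost per child screened of evaluating false-positive results.

    Unaffected children (1 - prevalence) test positive at a rate of (1 - specificity).
    """
    for name, value in (("prevalence", prevalence), ("specificity", specificity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return (1.0 - prevalence) * (1.0 - specificity) * workup_cost

# Hypothetical inputs: compare two specificities at a $1000 evaluation cost.
for spec in (0.999, 0.99):
    print(spec, expected_false_positive_cost(1e-4, spec, 1000.0))
```

Because nearly all screened children are unaffected, even a small drop in specificity raises the expected evaluation cost per child roughly in proportion to the false-positive rate.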
The prevalence of a disease also affected its incremental cost. Biotinidase deficiency, CH, homocystinuria, MSUD, and PKU all became costly when the prevalence was decreased to values between 3 and 10 times lower than baseline values. When the prevalence of CAH was very high (3.5%), screening for CAH became cost saving. The cost of screening for galactosemia was not sensitive to prevalence, and the cost of MS/MS screening was not sensitive to the prevalence of MCAD deficiency.
The results were relatively insensitive to the rates at which individual sequelae occurred. Screening for CH was not cost saving if the rate of mild developmental delay without treatment was less than one tenth of baseline values or if the risk of severe delay was decreased by approximately one half. If the risk of blindness resulting from biotinidase deficiency was <40%, then screening for biotinidase deficiency alone was no longer cost saving. Screening for MSUD alone was not cost saving if, without treatment, the risk of severe CP was <14% or that of severe developmental delay was <34%. Because MS/MS screens for many conditions, it remained cost saving despite changes in any one of these parameters. Testing for galactosemia had a positive net cost in all sensitivity analyses; however, when the risk of neonatal death was >27%, the cost of screening fell below $50000 per QALY saved.
The cost savings with the screening tests depended in some cases on the effectiveness of treatment in preventing specific sequelae. Screening for CH was not cost saving if it was <67% effective in preventing severe delay. Screening for homocystinuria was not cost saving if treatment was <31% effective in preventing blindness. MSUD screening was not cost saving if treatment was <59% effective in preventing CP or <36% effective in preventing delay.
Our results were not sensitive to the costs of screening tests. Screening test costs that exceeded savings from disease costs were between 2 and 10 times reported costs of screening tests. Conversely, screening for galactosemia or CAH was not cost saving even if tests could be performed for free.
Lifetime costs of diseases and their sequelae are difficult to estimate. However, we found that the cost saving for most of these screening tests was either insensitive or sensitive at values significantly below reported rates.
Our pessimistic case analysis biased the model against screening. Under this set of assumptions, newborn screening with MS/MS methods was not cost saving relative to a conventional screening panel, and the screening panel was not cost saving relative to no screening (Table 5). The conventional screening panel cost $310.56 per QALY saved over not screening. Compared with this panel, MS/MS screening cost $4838.71 per QALY saved.
Newborn screening for PKU, CH, biotinidase deficiency, MSUD, and homocystinuria individually not only was cost-effective but actually was cost saving in our base-case analysis. Use of MS/MS to screen for PKU, biotinidase deficiency, MSUD, and homocystinuria, as well as MCAD deficiency, had even greater cost savings because of the multiplicity of conditions detected with a single test. This was true even when MS/MS was compared directly with a panel of available conventional tests for the same conditions. Screening for CAH had a net cost per QALY gained; however, the cost was less than the $50000 per QALY used conventionally as a benchmark for cost-effectiveness.38,39
At more than $90000 per QALY, screening for galactosemia generally would not be considered cost-effective. This is because identifying galactosemia early may prevent a neonatal death but doing so induces high lifetime costs because of sequelae. In our sensitivity analyses, we found that under no circumstances would screening for galactosemia be cost saving. However, if the risk of neonatal death resulting from galactosemia is as high as 27%, then screening for galactosemia is cost-effective according to conventional criteria. In addition, the marginal cost of screening for galactosemia with MS/MS or in a panel of newborn screens seems to be cost-effective.
Some results are sensitive to changes in the effectiveness of screening and therapy, which emphasizes the importance of an effective timely system for notification and follow-up evaluation of positive screening tests. Loss of effectiveness in many areas may nullify the cost savings and health benefits of the screen. Similarly, the quality of the screening tests is important, specifically in minimizing false-positive results. The sensitivity of our results to the specificity of tests and the cost of evaluating false-positive results emphasizes that false-positive results must be minimized to realize the cost savings of these screening strategies.
Our pessimistic case analysis was intended to eliminate biases favoring newborn screening. With these assumptions, newborn screening with conventional testing or MS/MS methods was not cost saving. However, the incremental costs per QALY were very low, compared with conventional health economic benchmarks.38,39
As with all cost-effectiveness analyses, our study is limited by the validity of our assumptions about risks, costs, and quality of life. For example, we assumed that data from small cohort studies would predict accurately the occurrence of sequelae in the population at large. We were forced to make some broad assumptions about lifetime costs that could not take into account changes in technology or care. It is also inherently difficult to estimate quality of life across different disease states, especially with respect to children. However, we found that incremental cost-effectiveness was relatively unaffected by even large changes in nearly all of our variables.
Another serious weakness of this study is the cost estimates for most of the conventional screening tests. We used estimates from PriceWaterhouseCoopers.33 That analysis merely compiled global estimates reported by various newborn screening programs. There was no effort to divide fixed and marginal costs, and programs were not consistent in which costs were included in their estimates. Nevertheless, these are the best estimates currently available. A more careful microeconomic analysis of screening program costs would be needed to improve these estimates; however, considering our sensitivity analyses, it seems unlikely that more carefully derived costs would show that newborn screening was not cost-effective.
Newborn screening seems to be one of the rare health care interventions that is beneficial to patients and, with some assumptions, cost saving. Over the long term, funding comprehensive newborn screening programs may save money for society. Even with pessimistic assumptions, newborn screening seems to be highly cost-effective. Although additional refinements of cost estimates may be justified, they are unlikely to change these conclusions materially. Cost-effectiveness analysis will continue to be a useful way to consider which screening tests to include.
This work was conducted under a subcontract from the American College of Medical Genetics (agreement 240-01-0038), with funding from the Maternal and Child Health Bureau, Health Resources and Services Administration. Dr Carroll is supported by a grant from the National Institutes of Health (K23 DK67879-01).
We thank Alex Kemper, MD, MPH, MS, Tracy Lieu, MD, MPH, and Scott Grosse, PhD, for their thoughtful comments on a draft of this manuscript.
- Accepted December 27, 2005.
- Address correspondence to Aaron E. Carroll, MD, MS, Riley Research 330, 699 West Dr, Indianapolis, IN 46074. E-mail:
The authors have indicated they have no financial relationships relevant to this article to disclose.
The views expressed in this article are those of the authors and do not necessarily represent the views of the Indiana University School of Medicine.
- Venditti LN, Venditti CP, Berry GT, et al. Newborn screening by tandem mass spectrometry for medium-chain acyl-CoA dehydrogenase deficiency: a cost-effectiveness analysis. Pediatrics. 2003;112:1005–1015
- American Academy of Pediatrics, Committee on Genetics. Newborn screening fact sheets. Pediatrics. 1989;83:449–464
- Guthrie R, Susi A. A simple phenylalanine method for detecting phenylketonuria in large populations of newborn infants. Pediatrics. 1963;32:338–343
- Schoen EJ, Baker JC, Colby CJ, To TT. Cost-benefit analysis of universal tandem mass spectrometry for newborn screening. Pediatrics. 2002;110:781–786
- Wood JC, Magera MJ, Rinaldo P, Seashore MR, Strauss AW, Friedman A. Diagnosis of very long chain acyl-dehydrogenase deficiency from an infant's newborn screening card. Pediatrics. 2001;108(1). Available at: www.pediatrics.org/cgi/content/full/108/1/e19
- Keren R, Helfand M, Homer C, McPhillips H, Lieu TA. Projected cost-effectiveness of statewide universal newborn hearing screening. Pediatrics. 2002;110:855–864
- Jensen UG, Brandt NJ, Christensen E, Skovby F, Norgaard-Pedersen B, Simonsen H. Neonatal screening for galactosemia by quantitative analysis of hexose monophosphates using tandem mass spectrometry: a retrospective study. Clin Chem. 2001;47:1364–1372
- National Newborn Screening and Genetics Resource Center. National Newborn Screening Report—2000. Austin, TX: National Newborn Screening and Genetics Resource Center; 2003. Available at: http://genes-r-us.uthscsa.edu/resources/newborn/00chapters.html. Accessed June 3, 2004
- American Academy of Pediatrics, Committee on Genetics. Newborn screening fact sheets. Pediatrics. 1996;98:473–501
- National Newborn Screening and Genetics Resource Center. Homocystinuria. In: National Newborn Screening Report—2000. Austin, TX: National Newborn Screening and Genetics Resource Center; 2003:1–11. Available at: http://genes-r-us.uthscsa.edu/resources/newborn/00/ch7_complete.pdf. Accessed June 23, 2004
- Washington State Department of Health. Least Burden and Cost Benefit Analysis, Newborn Screening for Metabolic Disorders, WAC 246-650; August 12, 2003. Available at: www.sboh.wa.gov/Meetings/Meetings_2003/2003-10_15/documents/Tab09-NBS_analysis.pdf. Accessed June 23, 2004
- Cruysberg JR, Boers GH, Trijbels JM, Deutman AF. Delay in diagnosis of homocystinuria: retrospective study of consecutive patients. BMJ. 1996;313:1037–1040
- Naughten ER, Jenkins J, Francis DE, Leonard JV. Outcome of maple syrup urine disease. Arch Dis Child. 1982;57:918–921
- Scriver CR. The Metabolic and Molecular Bases of Inherited Disease. 8th ed. New York, NY: McGraw-Hill; 2001
- Shield JP, Wadsworth EJ, MacDonald A, et al. The relationship of genotype to cognitive outcome in galactosaemia. Arch Dis Child. 2000;83:248–250
- Pollitt RJ, Leonard JV. Prospective surveillance study of medium chain acyl-CoA dehydrogenase deficiency in the UK. Arch Dis Child. 1998;79:116–119
- Bittles AH, Petterson BA, Sullivan SG, Hussain R, Glasson EJ, Montgomery PD. The influence of intellectual disability on life expectancy. J Gerontol A Biol Sci Med Sci. 2002;57:M470–M472
- PriceWaterhouseCoopers. Newborn Screening Programs: An Overview of Cost and Financing. New York, NY: PriceWaterhouseCoopers; 2002. Available at: www.marchofdimes.com/files/Final_PWC_NBS_Report2.pdf. Accessed June 3, 2004
- US Congress, Office of Technology Assessment. Data and methods used in OTA's cost-effectiveness analysis of strategies for newborn screening. In: Healthy Children: Investing in the Future. Washington, DC: US Government Printing Office; 1988:236–241. Publication OTA-H-345. Available at: www.wws.princeton.edu/cgi-bin/byteserv.prl/∼ota/disk2/1988/8819/881919.PDF. Accessed June 3, 2004
- US Department of Labor, Bureau of Labor Statistics. Inflation calculator. Available at: www.bls.gov/cpi. Accessed July 14, 2004
- Goldman L, Gordon DJ, Rifkind BM, et al. Cost and health implications of cholesterol lowering. Circulation. 1992;85:1960–1968
- Hirth RA, Chernew ME, Miller E, Fendrick AM, Weissert WG. Willingness to pay for a quality-adjusted life year: in search of a standard. Med Decis Making. 2000;20:332–342
- Copyright © 2006 by the American Academy of Pediatrics