BACKGROUND. The advent of tandem mass spectrometry has made it possible to test newborns for multiple conditions efficiently. It is not known how state newborn screening programs have changed screening practices in response to this technology and how it affects the number of false-positive test results.
METHODS. We obtained data from the National Newborn Screening and Genetics Resource Center regarding the screening practices for each of the 50 states, to determine the number of mandated disorders added to state newborn screening panels between 1995 and 2005. Combining these data with reported specificities from the literature and the number of births in each state, we estimated the number of infants who would have received false-positive results through screening with tandem mass spectrometry in 2005.
RESULTS. The average state mandated screening for 5 disorders in 1995 (range: 0–8 disorders). Wyoming was the only state that decreased its panel size over the next decade. Kansas and Texas were the only states that did not add disorders to their panels between 1995 and 2005; the average state added 19. Iowa, Minnesota, Mississippi, South Dakota, and Tennessee each added ≥40 disorders. Assuming that an individual test for a disorder had a specificity of 99.995%, we estimated that ∼2575 infants would have received false-positive results through screening with tandem mass spectrometry in 2005. If specificity was assumed to be 99.9%, then the number increased to >51000.
CONCLUSIONS. State newborn screening programs have expanded dramatically in the past decade. Because the benefit of such testing may be unclear in some cases and because the number of infants who may receive false-positive results and may be labeled falsely as having disease is potentially sizeable, a more cautious approach is needed.
In the United States, decisions about newborn screening are made by 50 different state legislatures. Shortly after Massachusetts began a mass voluntary screening program for phenylketonuria in 1962, other states passed legislation mandating newborn phenylketonuria screening.1 Currently, all states have laws mandating that every newborn be screened for selected inborn errors of metabolism and endocrine disorders (among the most common are phenylketonuria, congenital adrenal hyperplasia, galactosemia, homocystinuria, hypothyroidism, and maple syrup urine disease).2 However, because each state maintains the authority to decide which tests are mandated, the disorders screened vary from state to state.
Approximately 1 decade ago, the advent of tandem mass spectrometry made it possible for states to test newborns for multiple disorders more efficiently. Previous testing methods, such as chromatography and bacterial inhibition assays, were able to detect only 1 disorder at a time. Now, with a single drop of dried blood, a tandem mass spectrometer can detect numerous metabolites that accumulate with various inborn errors of metabolism.3,4 Tandem mass spectrometry is quick (∼1–2 minutes),5 relatively inexpensive ($10–$30),5,6 and readily scalable, because increasing the number of conditions screened does not change the time or price substantially.5
How the state newborn screening programs have changed their screening practices in response to the opportunities presented by this technology has not been evaluated systematically. In this article, we examine the newborn screening programs in each of the 50 states and we explore how the number of mandated disorders screened has changed in the past decade. We also consider the number of infants who would receive false-positive results and so might be labeled falsely as having disease, given the current number of mandated tests that are performed with tandem mass spectrometry. In light of recent calls for additional research on expanded newborn screening,7,8 we think that this article represents an important first step in understanding the potential scope of false-positive results with expanded newborn screening.
State Screening Panels
To determine how the number of disorders on state newborn screening panels has changed, we obtained data from the National Newborn Screening and Genetics Resource Center (formerly the Council on Regional Networks for Genetic Services). This organization monitors the screening practices of each state as part of a cooperative agreement between the Maternal and Child Health Bureau (part of the US Department of Health and Human Services) and the Department of Pediatrics at the University of Texas Health Science Center at San Antonio. We used the organization's National Newborn Screening Reports to determine the number of disorders in each state's newborn screening panel during 1995 and as of November 1, 2005. A full list of the disorders screened is available from the authors.
Our approach to counting the number of disorders screened was as follows: (1) only disorders screened with dried blood spots were counted (eg, hearing tests were excluded), (2) only universally mandated disorders were counted (ie, disorders screened on a voluntary basis, those offered only to select populations, and those detected as a byproduct of screening with multiple-reaction monitoring were excluded), (3) the entry “variant hemoglobins” was excluded from our count because the report did not indicate the number of variants screened, and (4) the counting method was consistent for 1995 and 2005. This latter criterion was necessary because some disorders were recorded as a single disorder in 1995 but as multiple disorders in 2005 (eg, hemoglobinopathies in 1995/sickle cell disease, S-β-thalassemia, and sickle-C disease in 2005; galactosemia in 1995/classic galactosemia, galactose epimerase deficiency, and galactokinase deficiency in 2005; and phenylketonuria in 1995/benign hyperphenylalaninemia, defects of biopterin cofactor biosynthesis and regeneration, and phenylketonuria in 2005). To make the count consistent across years, and because we were unable to determine from the 1995 National Newborn Screening Report which states screened for the additional disorders listed in 2005, we counted hemoglobinopathies, galactosemia, and phenylketonuria each as 1 disorder in both 1995 and 2005.
When states mandated screening for subtypes and variants, we counted these as individual disorders (eg, cobalamin A/B, cobalamin C/D, and methylmalonyl mutase were counted as 3 separate disorders). We did this because an infant has the potential to be diagnosed with any of these disorders, and this potential varies according to state.
Counting of Individual Tests
We counted each disorder as a single test unless multiple disorders were screened through detection of identical metabolites by tandem mass spectrometry. In such cases, we counted the group of disorders as 1 test. To determine which metabolites are evaluated in screening for the various disorders, we obtained metabolite screening profiles from the Pediatrix Screening Laboratory (www.pediatrix.com), a private laboratory that specializes in newborn screening with tandem mass spectrometry. Profiles not available from Pediatrix were obtained from states that screened for the particular disorder. We found the following disorders to be screened with the same metabolites (grouped according to identical metabolites): phenylketonuria, benign hyperphenylalaninemia, and defects of biopterin biosynthesis and regeneration; sickle cell disease, S-β-thalassemia, and sickle-C disease; classic galactosemia (transferase deficient), galactose epimerase deficiency, and galactokinase deficiency; tyrosinemia types I, II, and III; homocystinuria and hypermethioninemia; long-chain hydroxyacyl-CoA dehydrogenase deficiency and trifunctional protein deficiency; isovaleric acidemia and 2-methylbutyryl-CoA dehydrogenase deficiency; short-chain acyl-CoA dehydrogenase deficiency, isobutyryl-CoA dehydrogenase deficiency, and ethylmalonic encephalopathy; carnitine acylcarnitine translocase deficiency and carnitine palmitoyl transferase deficiency type 2; 3-hydroxy-3-methylglutaryl-CoA lyase deficiency, 3-methylcrotonyl-CoA carboxylase deficiency, β-ketothiolase deficiency, multiple carboxylase deficiency, and 3-methylglutaconic aciduria; argininosuccinate acidemia, citrullinemia type I, and citrullinemia type II; methylmalonic acidemias (including cobalamin A/B, cobalamin C/D, and methylmalonyl-CoA mutase) and propionic acidemia.
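The counting rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `count_tests` and the example panel are hypothetical, and the group list reproduces just three of the metabolite groups named in the text.

```python
# Groups of disorders screened through identical metabolites.
# Only a subset of the groups listed in the text is reproduced here.
SHARED_METABOLITE_GROUPS = [
    {"tyrosinemia type I", "tyrosinemia type II", "tyrosinemia type III"},
    {"homocystinuria", "hypermethioninemia"},
    {"isovaleric acidemia", "2-methylbutyryl-CoA dehydrogenase deficiency"},
]


def count_tests(mandated_disorders):
    """Count tests for a panel: each shared-metabolite group that appears
    in the panel counts as 1 test; every other disorder counts as 1 test."""
    remaining = set(mandated_disorders)
    tests = 0
    for group in SHARED_METABOLITE_GROUPS:
        if remaining & group:      # at least one member of the group is mandated
            tests += 1
            remaining -= group     # do not count the group's members again
    return tests + len(remaining)


# Hypothetical panel (not any state's actual list): 4 disorders, 3 tests,
# because the two tyrosinemia types share metabolites.
example_panel = [
    "tyrosinemia type I",
    "tyrosinemia type II",
    "homocystinuria",
    "medium-chain acyl-CoA dehydrogenase deficiency",
]
example_test_count = count_tests(example_panel)
```

Under this rule, a state can mandate more disorders than it performs tests, which is why the false-positive estimates are driven by the number of tests rather than the number of disorders.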
Specificity of Individual Tests
We performed a literature search for publications that provided specificities for individual tests performed with tandem mass spectrometry for newborn screening. We searched Medline from 1966 to May 1, 2005, using the following search strategies with limits to English-language publications: (1) the MeSH headings “spectrometry, mass, electrospray ionization” and “sensitivity and specificity” and “neonatal screening” and (2) the keywords “tandem mass spectrometry” and “specificity” and “newborn screening.” Our search strategy returned 54 citations. We then reviewed the bibliographies of all relevant articles to find additional studies of specificity.
We reported the “gold standard” for each specificity as laboratory confirmation if it entailed serum, urine, or enzymatic analysis or comparison with results of another laboratory technique (such as fluorometry) and as genetic confirmation if it involved DNA analysis. Our findings for the various individual tests are summarized in Table 1.
We also made use of population-based studies of expanded newborn screening. For each of these studies, an overall specificity was provided for all of the tests performed. To compute the expected specificity for an individual test, we raised the overall specificity to the reciprocal of the number of tests performed: $\text{individual test specificity} = (\text{overall specificity})^{1/\text{no. of tests}}$. Our findings on population-based specificities are summarized in Table 2.
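The imputation step can be illustrated as follows. This is a minimal sketch that assumes every test in a panel has the same specificity and that results are independent; the function name and the input values are hypothetical and are not taken from Table 2.

```python
def per_test_specificity(overall_specificity, n_tests):
    """Impute the specificity of a single test from the overall specificity
    of a panel of n_tests, assuming independent, equally specific tests."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return overall_specificity ** (1.0 / n_tests)


# Hypothetical inputs, for illustration only.
overall = 0.999   # overall specificity reported for a whole panel
n = 20            # number of tests in that panel
single = per_test_specificity(overall, n)
print(f"imputed per-test specificity: {single:.6f}")
```

Because the overall specificity is the product of the individual specificities under these assumptions, the imputed per-test value is always at least as high as the overall value.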
On the basis of the foregoing information, we used the following 3 specificities to estimate the expected number of false-positive results: 99.995%, our best-case scenario for specificity and the imputed specificity based on the largest reported cohort undergoing tandem mass spectrometric testing (362000 newborns in Australia); 99.95%, our intermediate scenario for specificity; and 99.9%, our worst-case scenario and the 25th percentile of reported individual test specificities but within the range of what might be expected in the real-world setting, given the differences that might exist between highly selected laboratories and those operating in the community.
Number of False-Positive Results
To estimate the number of expected false-positive tests in each state, we combined data on (1) the number of screening tests performed with tandem mass spectrometry that were mandated by the state in 2005, (2) the 3 scenarios for the specificity of an individual test (as described above), and (3) the number of births in each state (from the National Vital Statistics Reports preliminary birth data for 2004).9 The vast majority of the specificities from the literature that we used to estimate false-positive results (including the specificity for our best-case scenario) were determined from positive results that required either obtaining a repeat specimen from the newborn or referring the newborn for additional confirmatory testing.
For each state, we used the following equation: $\text{no. of false-positive results} = \left[1 - \left[1 - (1 - \text{prevalence})(1 - \text{specificity})\right]^{k}\right] \times (\text{no. of births in state})$, where $k$ represents the number of tests performed with tandem mass spectrometry and prevalence refers to the average prevalence for an individual disorder screened with tandem mass spectrometry. We calculated the average prevalence for an individual disorder (1 case per 111997 infants) from a pilot study of expanded newborn screening with tandem mass spectrometry in New England,10 where the overall prevalence for all disorders screened was 1 case per 4870 infants. We used the following equation to calculate the average prevalence for an individual disorder: $\text{average prevalence for individual disorder} = 1 - (1 - \text{overall prevalence})^{1/\text{no. of disorders}}$.
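The two equations above can be sketched in Python as follows. The per-disorder prevalence (1 case per 111997 infants) and the 3 specificity scenarios come from the text; the birth count and the number of tests are hypothetical placeholders, not data for any state, and the function names are introduced here for illustration.

```python
def average_prevalence(overall_prevalence, n_disorders):
    """Average prevalence of one disorder, given the overall prevalence
    across n_disorders disorders."""
    return 1.0 - (1.0 - overall_prevalence) ** (1.0 / n_disorders)


def expected_false_positives(births, k, specificity, prevalence):
    """Expected number of infants with at least one false-positive result
    among k independent tests."""
    # Probability that a single test is falsely positive for one infant.
    p_single = (1.0 - prevalence) * (1.0 - specificity)
    # Probability that at least one of the k tests is falsely positive.
    p_any = 1.0 - (1.0 - p_single) ** k
    return p_any * births


PREVALENCE = 1 / 111997          # average per-disorder prevalence from the text
HYPOTHETICAL_BIRTHS = 100_000    # assumption: not a real state's birth count
HYPOTHETICAL_K = 20              # assumption: not a real state's test count

# One estimate per specificity scenario (best case, intermediate, worst case).
estimates = {
    spec: expected_false_positives(HYPOTHETICAL_BIRTHS, HYPOTHETICAL_K, spec, PREVALENCE)
    for spec in (0.99995, 0.9995, 0.999)
}
for spec, count in estimates.items():
    print(f"specificity {spec:.5f}: about {count:.0f} infants")
```

For values this small, the estimate is close to births × k × (1 − specificity), so the count grows almost in proportion to both the number of tests and the false-positive rate per test.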
Because the most recent birth data are from 2004 and the numbers of births are increasing in most states, our estimates of the number of infants who receive false-positive results in 2005 are biased downward. Because the National Newborn Screening Reports published since the expansion of newborn screening do not contain data on the number of false-positive results, we were not able to compare our estimates with the actual number of recorded false-positive results.
The average state mandated screening for 5 disorders in 1995 (range: 0–8 disorders). Table 3 shows that only 1 state (Wyoming) decreased the number of disorders on its newborn screening panel between 1995 and 2005. Kansas and Texas were the only states that did not add any disorders over the decade. The average state added 19 disorders (median: 22 disorders). Nineteen states added ≥30 disorders during the period; 5 states (Iowa, Minnesota, Mississippi, South Dakota, and Tennessee) added ≥40. Figure 1 shows that the states with the most rapid growth in testing were generally in the coastal regions and the Midwest, whereas states in the Great Plains and Rocky Mountain regions displayed more modest growth.
With the best-case scenario for the specificity of an individual test (99.995%), we found that ∼2575 infants in the United States would receive false-positive results each year and so might be labeled falsely as having disease. With the intermediate scenario (specificity: 99.95%), we estimated that ∼25644 infants would receive false-positive results. With the worst-case scenario (specificity: 99.9%), the number increased to ∼51059 (Table 4).
The advent of tandem mass spectrometry has allowed state newborn screening programs to increase dramatically the number of mandated disorders screened in the past decade. We found that, from 1995 to 2005, the typical state nearly quadrupled the number of mandated disorders screened. This growth has profound consequences. Using commonly cited specificities for tandem mass spectrometry, we estimated that between 2575 and 51059 infants receive false-positive results each year, and so may be labeled falsely as having disease, through screening with tandem mass spectrometry. We refer to the infants who receive false-positive results as having the potential to be labeled falsely as having disease because studies suggest that some parents of these infants remain anxious about their child's health, perceive the child as unhealthy, and, as a consequence, treat the child differently even after a result is deemed a false-positive finding.11–16
Our study is limited because the literature on the specificity of newborn screening tests is incomplete. Specificity estimates are simply not readily available for every test now used in state newborn screening panels. Furthermore, frequently the available data are of questionable quality; the standard method may not be specified and/or the reporting may not be explicit (ie, specificity needed to be calculated with values that appeared in various passages in the text). We found no data that addressed the question of whether the probability of an infant receiving a false-positive result for one condition influences the probability of receiving a false-positive result for other conditions.
To deal with these data limitations, we were forced to make a number of simplifying assumptions. First, to deal with the incomplete (and probably unreliable) database on the specificity of individual tests, we assumed the same specificity (derived from reported specificities for tandem mass spectrometry) for every test in a state's panel. Although not all tests are performed with tandem mass spectrometry, the majority are. We addressed the uncertainty about the true specificities by testing a range of values that are plausible, on the basis of the available data. Given the limited data available, we were conservative in our choice of specificities for the best-case and intermediate scenarios, and we are confident that most of the individual specificities, if known, would lie either at or below these specificity estimates. We suspect that the true number of false-positive results, if known, would lie somewhere between our best-case and worst-case scenarios. Second, we assumed that the probabilities of receiving false-positive results for the various tests are independent. We are confident that our most conservative estimate, ∼2575 cases per year, can be viewed as a best-case scenario for the number of false-positive results that occur currently. However, as states continue to add more tests to their state panels, to enroll a majority of the population in pilot/voluntary programs, and to identify disorders as byproducts of multiple-reaction monitoring screening,2 our predictions will underestimate the number of false-positive results obtained.
We were unable, in the context of this study, to address specifically the consequences experienced by the parents of infants who might have a false-positive test. Generally, when an initial abnormal result is detected, the laboratory contacts the infant's physician, who then notifies the parents.17 The urgency and extent of follow-up testing depend on the abnormality. For some families, the journey may end with a repeated blood spot specimen that yields normal results. We suspect, however, that the journey from initially positive results to subsequently normal results is not an easy one for some families. The ultimate diagnosis may be difficult to achieve.18,19 Positive test results may initiate diagnostic cascades, when confirmatory testing produces an ambiguous result and one test stimulates another as more questionable abnormalities are found.20 Although some have argued that newborn screening will help to avoid this problem later in life,21 there is at least some reason to be concerned that it will increase the number of diagnostic cascades early in life.
Furthermore, studies suggest that false-positive results have sequelae even after subsequent testing confirms that the infant is normal. These studies report that more than one third of parents still have concerns about the health of their infant,11 that mothers report more stress on average,16 that the average parent-child relationship is more dysfunctional,16 and that infants are hospitalized more frequently for unrelated illnesses.16 Research suggests that certain steps might lessen the potential anxiety of families that is triggered by a false-positive result.16 It is possible that some parents are not well prepared to deal with the screening results. Studies have demonstrated that education of parents about false-positive results is lacking.22 Moreover, although many states do not require informed consent for newborn screening, a few do not even require parental notification before the initial specimen is obtained.17 As we proceed with newborn screening in this country, we must provide both sufficient and appropriate resources to families.
By focusing solely on false-positive results, we have not considered the infants who have benefited from screening. Although it is not clear that early detection decreases morbidity or mortality rates for all of the disorders being screened,23 some infants may benefit from expanded newborn screening. Furthermore, knowing how many infants are identified as having disease through expanded newborn screening would allow us to place the absolute number of false-positive results in a useful relative context, such as a positive predictive value.24 Unfortunately, recent National Newborn Screening Reports2,25 do not contain data on true-positive results with expanded newborn screening, and producing estimates is difficult because, for many conditions, prevalence data are unknown, vary widely, or are based heavily on case report findings.21
It is also important to note that, although some infants may benefit from screening, other infants may be diagnosed as having pseudodisease; that is, their positive test results are “true” in confirmatory testing, but the disease diagnosed will never become clinically apparent. These infants and their parents suffer the harms of unnecessary diagnosis (with its attendant psychological burden) and unnecessary treatment. Although the concept is relatively unfamiliar, pseudodisease has been reported in the context of newborn screening.26–28 Additional research regarding the natural history and phenotypic variation of some of these diseases is needed.
Given some of the uncertainties about benefits and the unintended side effects of false-positive tests and pseudodisease, we urge a more cautious approach to newborn screening. There is much we need to learn, and additional research is critical. As a first step, better estimates of actual specificities achieved in real-world settings are needed. We hope that all states will document both their true-positive and false-positive results. It also will be helpful to examine more closely the course of events after infants receive false-positive results. Their care accrues a cost to newborn screening that needs to be quantified adequately. Finally, research also should be directed toward the questions of benefits and pseudodisease, perhaps by making use of the natural experiment afforded by states with very different screening practices. Because newborn screening can be a double-edged sword, researchers and policymakers must be attentive to both benefit and harm. Undoubtedly, the tradeoffs will be challenging; we must be careful to avoid harming thousands of children inadvertently in an effort to help a few.
This study was supported by the Robert Wood Johnson Foundation through the Robert Wood Johnson Clinical Scholars Program and by a Research Enhancement Award from the Department of Veterans Affairs (grant 03-098).
We thank Tom Koepsell, MD, MPH, for his helpful suggestions and guidance regarding methods. We thank Wylie Burke, MD, PhD, for her thoughtful comments and suggestions regarding preparation of the manuscript. We thank Michael Glass, MS, Director of Newborn Screening for Washington State Department of Health, for his guidance regarding technical aspects of tandem mass spectrometry and newborn screening. We thank Donna Williams of the National Newborn Screening and Genetics Resource Center for her assistance in obtaining the National Newborn Screening Reports.
- Accepted February 23, 2006.
- Address correspondence to Beth A. Tarini, MD, University of Michigan Health System, General Pediatrics, 300 N Ingalls St, Room 6C11, Ann Arbor, MI 48109-0456.
The authors have indicated they have no financial relationships relevant to this article to disclose.
The views expressed herein do not necessarily represent the views of the Robert Wood Johnson Foundation, the Department of Veterans Affairs, or the US government.
Dr Tarini had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
- National Newborn Screening and Genetics Resource Center. National Newborn Screening Status Report, November 2005. Austin, TX: National Newborn Screening and Genetics Resource Center; 2005
- Bartlett K, Eaton SJ, Pourfarzam M. New developments in neonatal screening. Arch Dis Child Fetal Neonatal Ed. 1997;77:F151–F154
- Baylor Health Care System. Metabolic disease: about supplemental newborn screening. Available at: www.bhcs.com/MedicalSpecialties/MetabolicDisease/newbornscreening.htm#20. Accessed July 11, 2005
- Botkin JR. Research for newborn screening: developing a national framework. Pediatrics. 2005;116:862–871
- Zytkovicz TH, Fitzgerald EF, Marsden D, et al. Tandem mass spectrometric analysis for amino, organic, and fatty acid disorders in newborn dried blood spots: a two-year summary from the New England Newborn Screening Program. Clin Chem. 2001;47:1945–1955
- Sorenson JR, Levy HL, Mangione TW, Sepe SJ. Parental response to repeat testing of infants with “false-positive” results in a newborn screening program. Pediatrics. 1984;73:183–187
- Tymstra T. False-positive results in screening tests: experiences of parents of children screened for congenital hypothyroidism. Fam Pract. 1986;3:92–96
- Mandl KD, Feit S, Larson C, Kohane IS. Newborn screening program practices in the United States: notification, research, and consent. Pediatrics. 2002;109:269–273
- Schulze A, Lindner M, Kohlmuller D, Olgemoller K, Mayatepek E, Hoffmann GF. Expanded newborn screening for inborn errors of metabolism by electrospray ionization-tandem mass spectrometry: results, outcome, and implications. Pediatrics. 2003;111:1399–1406
- Carpenter K, Wiley V, Sim KG, Heath D, Wilcken B. Evaluation of newborn screening for medium chain acyl-CoA dehydrogenase deficiency in 275000 babies. Arch Dis Child Fetal Neonatal Ed. 2001;85:F105–F109
- American College of Medical Genetics. Newborn Screening: Toward a Uniform Screening Panel and System. Rockville, MD: American College of Medical Genetics; 2005
- Fant KE, Clark SJ, Kemper AR. Completeness and complexity of information available to parents from newborn-screening programs. Pediatrics. 2005;115:1268–1272
- National Newborn Screening and Genetics Resource Center. National Newborn Screening Report, 2000. Austin, TX: National Newborn Screening and Genetics Resource Center; 2003
- Lam WK, Cleary MA, Wraith JE, Walter JH. Histidinaemia: a benign metabolic disorder. Arch Dis Child. 1996;74:343–346
- Fingerhut R, Roschinger W, Muntau AC, et al. Hepatic carnitine palmitoyltransferase I deficiency: acylcarnitine profiles in blood spots are highly specific. Clin Chem. 2001;47:1763–1768
- Chace DH, Hillman SL, Van Hove JL, Naylor EW. Rapid diagnosis of MCAD deficiency: quantitative analysis of octanoylcarnitine and other acylcarnitines in newborn blood spots by tandem mass spectrometry. Clin Chem. 1997;43:2106–2113
- Chace DH, DiPerna JC, Kalas TA, Johnson RW, Naylor EW. Rapid diagnosis of methylmalonic and propionic acidemias: quantitative tandem mass spectrometric analysis of propionylcarnitine in filter-paper blood specimens obtained from newborns. Clin Chem. 2001;47:2040–2044
- Newborn Screening Committee. National Newborn Screening Report, 1995. Atlanta, GA: Council of Regional Networks for Genetic Services; 1999
- Copyright © 2006 by the American Academy of Pediatrics