BACKGROUND AND OBJECTIVE: Late preterm infants (LPIs) (gestation 34 weeks and 0 days to 36 weeks and 6 days) compared with full-term infants (FTIs) are at increased risk for mortality and short- and long-term morbidity. The objective of this study was to assess the neurodevelopmental outcomes in a longitudinal cohort study of LPIs from infancy to school age and determine predictive values of earlier developmental testing compared with school-age testing.
METHODS: We used generalized estimating equations to calculate the odds of school readiness in a nationally representative cohort of 4900 full-term and 950 late preterm infants. We generated positive and negative predictive values of the ability of the 24-month Mental Developmental Index (MDI) scores of the Bayley Short Form, Research Edition, to predict Total School Readiness Score (TSRS) at kindergarten age.
RESULTS: In multivariable analysis, late preterm infants had higher odds of worse TSRSs (adjusted odds ratio 1.52 [95% confidence interval 1.06–2.18], P = .0215). The positive predictive value of a child having an MDI of <70 at 24 months and a TSRS <5% at kindergarten was 10.4%. The negative predictive value of having an MDI of >70 at 24 months and a TSRS >5% was 96.8%. Most infants improved score ranking over the study interval.
CONCLUSIONS: LPIs continue to be delayed at kindergarten compared with FTIs. The predictive validity of having a TSRS in the bottom 5% given an MDI <70 at 24 months was poor. A child who tested within the normal range (>85) at 24 months had an excellent chance of testing in the normal range at kindergarten.
- BSF-R: Bayley Short Form, Research Edition
- BSID-II: Bayley Scales of Infant Development, Second Edition
- ECLS-B: Early Childhood Longitudinal Study, Birth Cohort
- FTI: full-term infant
- LPI: late preterm infant
- MDI: Mental Development Index
- TSRS: Total School Readiness Score
What’s Known on This Subject:
Late preterm infants, compared with full-term infants, have less proficiency in reading and math at school age, with increased need for individualized educational plans and special education services. They also have lower cognitive performance on standardized IQ exams.
What This Study Adds:
Late preterm infants have worse outcomes at school entry, and development is variable during the preschool years, so socioeconomic status, language spoken in the home, maternal education, maternal race, and being a late preterm infant have a large impact.
Late preterm infants (LPIs) make up nearly 75% of all preterm American births1 and 20% to 25% of NICU admissions.2 Compared with full-term infants (FTIs), LPIs are at increased risk for neonatal mortality and morbidity3–7 and have less proficiency in reading and math at school age,8 increased rates of cerebral palsy and mental retardation,9 lower cognitive performance on standardized IQ exams,10 and more teacher-reported behavior problems.11 As a result, LPIs have increased need for individualized educational plans and special education services.11,12
Prediction of which LPIs will experience potential future challenges requires longitudinal assessment from infancy to school age, to compare earlier developmental testing with school age outcomes and identify early signs for targeting with early intervention. Our previous work using the Early Childhood Longitudinal Study, Birth Cohort (ECLS-B), found that LPIs had worse neurodevelopmental outcomes than FTIs at 24 months.13 In this same cohort, we here assess the school readiness at kindergarten of LPIs compared with FTIs, hypothesizing that LPIs will be less school ready. We also assess the predictability of early developmental testing compared with school-age testing.
The ECLS-B is described in detail elsewhere.13 Briefly, the ECLS-B is a prospective cohort study of a nationally representative sample of children born in 2001 that focuses on early home and educational experiences. The ECLS-B was sponsored by the US Department of Education’s National Center for Education Statistics in the Institute of Education Sciences, in collaboration with several federal education and health policy agencies. The ECLS-B oversamples American Indians, Asians, Pacific Islanders, low birth weight infants, and twins.
The ECLS-B contains multiple sequential direct and indirect assessments of each child’s competencies, skills, and physical growth. Data were extracted from the birth certificate, derived from parental interviews (when the infants were 9 months, 2 years, and 6 years of age), and obtained from direct cognitive assessments. The children were tested in their homes by trained personnel. Testing and scoring were validated through in-person quality control visits and review of videotaped assessments.
The direct cognitive assessment was designed to be a broad measure of the child’s knowledge and motor skills. This assessment was adaptive in nature, meaning that each child began with a routing test that was followed by a second-stage test. Depending on the number of correct responses on the routing test, the second-stage test was 1 of 3 possibilities that varied according to the child’s developmental level. The direct cognitive assessment tested children’s skills in expressive language, early reading, and math. Testing was done in English or Spanish depending on the primary language indicated. No other languages were available.
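The two-stage routing described above can be sketched in a few lines. This is an illustration only: the function name, the form labels, and both cutoff values are hypothetical, because the text does not report the actual routing thresholds.

```python
def select_second_stage(routing_correct: int,
                        low_cutoff: int = 5,
                        high_cutoff: int = 10) -> str:
    """Pick 1 of 3 second-stage forms from the routing-test score.

    The cutoffs are placeholders for illustration, not ECLS-B values.
    """
    if routing_correct < 0:
        raise ValueError("routing_correct must be non-negative")
    if routing_correct < low_cutoff:
        return "low"      # easiest second-stage form
    if routing_correct < high_cutoff:
        return "middle"   # intermediate second-stage form
    return "high"         # most difficult second-stage form
```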
We studied infants of ≥34 weeks’ completed gestation who had complete cognitive assessments during their kindergarten year. We included twins and higher-order multiples and excluded infants who were inadequately assessed or not assessed because of a major congenital anomaly (Appendix 1), blindness, or deafness.
Direct Cognitive Assessment Measures
Total School Readiness Score
The primary outcome measure was the Total School Readiness Score (TSRS), a composite measure derived from the individual tests of the cognitive assessment battery conducted during the kindergarten year. Each of the tests in the cognitive battery was individually validated with field tests. The battery included reading, math, and expressive language testing. Each individual score for reading, math, and expressive language was weighted equally and then combined to arrive at the TSRS. Lower scores correlate with worse school readiness. Infants were considered significantly impaired if their TSRS was in the bottom fifth percentile, because scores this low represent a conservative measure of severe impairment.
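A minimal sketch of the composite and the impairment flag follows. It assumes that "weighted equally and then combined" means a simple sum of the three subscale scores, and it uses an unweighted nearest-rank percentile; the study itself applied survey weights, which are not reproduced here. All function names are hypothetical.

```python
import math


def total_school_readiness(reading: float, math_score: float, language: float) -> float:
    """Combine the three subscale scores with equal weights (assumed: a simple sum)."""
    return reading + math_score + language


def bottom_percentile_cutoff(scores: list[float], pct: float = 5.0) -> float:
    """Nearest-rank percentile: smallest score with at least pct% of scores at or below it."""
    if not scores:
        raise ValueError("scores must not be empty")
    ordered = sorted(scores)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def flag_significant_impairment(scores: list[float], pct: float = 5.0) -> list[bool]:
    """Flag each child whose composite falls at or below the bottom-percentile cutoff."""
    cutoff = bottom_percentile_cutoff(scores, pct)
    return [s <= cutoff for s in scores]
```

In a toy cohort of 20 children with distinct composites, only the lowest-scoring child would be flagged at the 5% threshold.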
The early reading constructs included English-language skills, oral skill, phonological awareness, letter and letter-sound knowledge, print conventions, word recognition, and vocabulary. Some items assessed children’s early writing skills. The reliability score for internal consistency of testing was 0.92.14
The primary mathematics constructs assessed included an understanding of numbers including cardinality, ordinality, quantity, operations, and estimation; measurement; the ability to compare objects by their attributes; geometry and spatial sense; and skills of collecting, organizing, reading, and representing data. The reliability score for internal consistency of testing was 0.92.14
Expressive Language Assessment
Expressive language was assessed by reading children stories using picture books and having the children retell the story to the examiner. Audiotapes of this test were sent to a central facility where specially trained coders scored them.
At 24 months, the Mental Developmental Index (MDI) of the Bayley Short Form, Research Edition (BSF-R), was used. The BSF-R was derived from the Bayley Scales of Infant Development, Second Edition (BSID-II) (mean 100 ± 15). Similar to the BSID-II, the BSF-R has a core set of items that are administered to all children and optional supplementary basal and ceiling item sets. The BSID-II generates a raw score that is converted to the standardized MDI. Item response theory modeling was used to estimate the BSID-II raw scores from the BSF-R. The overall reliability coefficient for the BSF-R MDI was 0.98.
The BSF-R was tested extensively to ensure that the psychometric properties of the BSID-II were maintained and that it successfully measured children’s abilities across the entire distribution. A score of <70 equates to significant developmental delay. The BSF-R was administered in the child’s home by trained personnel. Each administrator’s testing and scoring abilities were validated through in-person quality control visits and videotaped interviews.15
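The MDI bands used in this analysis can be written as a small classifier. The sketch below takes the <70 threshold from the text and the 70 to 84 band from the results; treating a score of exactly 85 as normal and the label strings themselves are assumptions made for illustration.

```python
def mdi_category(mdi: float) -> str:
    """Map a standardized MDI score (mean 100, SD 15) to a band."""
    if mdi < 70:
        return "significant delay"  # more than 2 SD below the mean
    if mdi < 85:
        return "70 to 84"           # between 1 and 2 SD below the mean
    return "normal range"           # assumed to include a score of exactly 85
```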
Detailed Maternal, Infant, and Home Characteristics
Maternal and infant descriptive characteristics were obtained from birth certificates and maternal surveys. Maternal race/ethnicity was characterized as white non-Hispanic, black non-Hispanic, or other/unknown. Adequacy of prenatal care was based on the Kotelchuck index (1994).16,17 This validated index uses the week of pregnancy that prenatal care was initiated, total number of prenatal visits, and length of gestation together in an algorithm where prenatal care is characterized as “inadequate,” “intermediate,” “adequate,” and “adequate plus.”
Infants were either LPI (completed gestation from 34 weeks and 0 days to 36 weeks and 6 days) or FTI (completed gestation ≥37 weeks). Birth weight in grams was categorized as <750, 750 to 1499, 1500 to 2499, 2500 to 2999, 3000 to 3499, 3500 to 3999, or ≥4000. Adjusting for the child’s gender and race/ethnicity, we defined small for gestational age as <10th percentile birth weight and large for gestational age as ≥90th percentile per Alexander et al.18 When infants were 9 months of age, mothers were asked whether they ever fed their children breast milk. Infants were described as being part of a multiple gestation (plurality) or singleton.
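The gestational age and birth weight groupings above translate directly into code. The following is a sketch with hypothetical function names; the cut points are those stated in the text, and the "excluded" label for infants below 34 weeks reflects the study's inclusion criterion.

```python
def gestation_group(weeks: int, days: int = 0) -> str:
    """Classify completed gestation as LPI (34w0d to 36w6d) or FTI (>=37w)."""
    if not 0 <= days <= 6:
        raise ValueError("days must be between 0 and 6")
    total_days = weeks * 7 + days
    if total_days < 34 * 7:
        return "excluded"  # below the 34-week inclusion threshold
    if total_days <= 36 * 7 + 6:
        return "LPI"
    return "FTI"


# Upper bounds (exclusive, in grams) paired with the category labels from the text.
BIRTH_WEIGHT_BINS = [
    (750, "<750"),
    (1500, "750-1499"),
    (2500, "1500-2499"),
    (3000, "2500-2999"),
    (3500, "3000-3499"),
    (4000, "3500-3999"),
]


def birth_weight_category(grams: float) -> str:
    """Return the birth weight category for a weight in grams."""
    if grams <= 0:
        raise ValueError("grams must be positive")
    for upper, label in BIRTH_WEIGHT_BINS:
        if grams < upper:
            return label
    return ">=4000"
```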
Households were identified as primarily English-speaking (see Appendix 2 for full list of languages). When the household income-to-size ratio was below the 2002 Census poverty threshold, the household was labeled “impoverished.”
All unweighted sample sizes included in this analysis were rounded to the nearest 50 to protect the confidentiality of respondents as specified in the restricted data license agreement. We used the weights provided in the ECLS-B manual to adjust for survey design and allow for accurate population estimates of testing scores. We used these weights in the SurveyFreq analysis to make appropriate adjustments to the standard errors.
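Two of the steps above, rounding unweighted counts and forming a weighted estimate, can be sketched as follows. The function names are hypothetical, rounding ties upward is an assumption (the text does not state a tie rule), and the design-based standard error adjustment performed by the survey procedure is not reproduced here.

```python
def round_to_nearest_50(n: int) -> int:
    """Round a non-negative unweighted count to the nearest 50 (ties round up)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return ((n + 25) // 50) * 50


def weighted_proportion(flags: list[bool], weights: list[float]) -> float:
    """Estimate a population proportion as the weighted share of flagged children."""
    if len(flags) != len(weights):
        raise ValueError("flags and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w for f, w in zip(flags, weights) if f) / total
```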
Maternal, Home, and Child Descriptive Characteristics
Maternal, home, and child descriptive characteristics were compared in bivariate analysis using the t test for 2-sample comparison of continuous data, χ2 analysis for multiple-sample comparison of categorical data, and analysis of variance for continuous variable multiple-sample comparisons. Maternal characteristics were compared so that each mother was represented once, avoiding undue weight being given to mothers of multiples.
For multivariable analysis, we used generalized estimating equation models to generate odds ratios and 95% confidence intervals. Conceptually similar to logistic regression, generalized estimating equation models differ because they account for clustering of data. Because multiples share genetic, in utero, and environmental factors related to neurodevelopmental outcomes, they are correlated and represent clustered data. To examine factors related to improvement in cognitive scores from early childhood to school age, multivariable analysis was used to examine factors associated with the presence or absence of improved cognitive outcomes, controlling for common factors (maternal race, education, marital status, prenatal care, primary language, impoverished household, gender, fetal growth, plurality, delivery type, gestational age, and any breast milk feeding) that are known to affect developmental outcomes.
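As a hedged sketch of the model form, a logistic generalized estimating equation for a binary outcome can be written as below. The notation is introduced here for illustration and does not appear in the original text: $i$ indexes the mother (the cluster), $j$ indexes the child within that cluster, $Y_{ij}$ is the binary outcome, and $\mathbf{z}_{ij}$ collects the covariates listed above. The working correlation structure is left unspecified because the text does not report it.

```latex
\operatorname{logit}\,\Pr\left(Y_{ij} = 1 \mid \mathrm{LPI}_{ij}, \mathbf{z}_{ij}\right)
  = \beta_0 + \beta_1\,\mathrm{LPI}_{ij} + \boldsymbol{\gamma}^{\top}\mathbf{z}_{ij},
\qquad
\mathrm{aOR}_{\mathrm{LPI}} = e^{\beta_1}
```

Under this form, the adjusted odds ratio for late preterm birth is the exponentiated coefficient on the LPI indicator, and its confidence interval comes from standard errors that allow for correlation among children of the same mother.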
Positive and negative predictive values of the ability of the MDI score at 24 months to anticipate the performance at kindergarten were calculated as a proportion for the total population and for subgroups of LPIs and FTIs.
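The calculation can be sketched from a 2-by-2 table in which "test positive" means an MDI <70 at 24 months and "outcome positive" means a TSRS in the bottom 5% at kindergarten. The function name and the counts in the usage note are hypothetical and are not study data.

```python
def predictive_values(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float]:
    """Return (PPV, NPV) from 2-by-2 table counts.

    tp: MDI <70 and TSRS in the bottom 5%
    fp: MDI <70 and TSRS above the bottom 5%
    fn: MDI not <70 and TSRS in the bottom 5%
    tn: MDI not <70 and TSRS above the bottom 5%
    """
    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("each test group must contain at least one child")
    ppv = tp / (tp + fp)  # share of test-positive children with the poor outcome
    npv = tn / (tn + fn)  # share of test-negative children without the poor outcome
    return ppv, npv
```

For example, with made-up counts of 10, 90, 30, and 870, the PPV is 10/100 and the NPV is 870/900. The same function can be applied separately to the LPI and FTI subgroups.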
All statistical tests were 2-tailed, and the level of significance was set at 0.05. All the analyses used SAS 9.1 statistical software (SAS Institute, Cary, NC).
At the kindergarten assessment, there were 5850 FTIs and LPIs, representing a 78% follow-up from the 24-month cohort. Fewer than 50 infants were excluded because of congenital anomaly or blindness. Another 300 infants were excluded because they lacked home assessments. Infants who were assessed at 24 months but not at kindergarten were more likely to be FTI (85%) and born to white, primarily English-speaking mothers who had a higher education and were married. The missing children were also more likely to have scored in the normal range or higher on the 24-month BSF-R. The mean age at assessment was equivalent for the 2 groups: 65.2 (SD 3.8) months for LPIs and 65.1 (3.8) months for FTIs (P = .1849).
Maternal and Child Characteristics
Maternal and infant descriptive characteristics are shown in Tables 1 and 2. Mothers of LPIs compared with FTIs were more likely to be black, have a high school degree, be unmarried, and meet criteria for being impoverished. They also had higher prenatal care utilization and required more cesarean deliveries. LPIs were more likely to be the product of a multiple gestation (plurality), not breastfed, and large for gestational age.
Direct Cognitive Assessment Results
TSRSs ranged from 31.5 to 285.8, with a mean of 173.2 (43.3). Reading scores ranged from 15 to 100, with a mean of 48.3 (15.6). Math scores ranged from 15.9 to 100, with a mean of 59.1 (15.6). Expressive language scores ranged from 0 to 100, with a mean of 67.1 (17.3).
LPIs compared with FTIs had lower mean TSRSs: 164.9 (43.5) and 175.2 (43.1), respectively. LPIs had significantly (P < .05) lower mean scores in all of the subscales of the TSRS, including reading, math, and expressive language scores.
In multivariable analysis, the adjusted odds ratios for LPIs were as follows: overall TSRS of <5% 5.25 (95% confidence interval 1.6–8.9; P = .0048), reading 1.43 (0.2–2.7; P = .0226), math 1.17 (0.3–2.1; P = .0131), and expressive language 0.09 (0.06–1.2; P = .0248).
The odds of an LPI compared with an FTI having a score in the bottom 5% for overall TSRS was 1.52 (1.06–2.18; P = .0215), and for reading, 1.76 (1.21–2.58; P = .0035). These observations were unlike the math and expressive language subscales (Fig 1).
If a child’s MDI score was <70 or 70 to 84 at 24 months of age, that child was at increased odds of having a TSRS in the bottom 5% (1.52 [1.14–1.90; P < .0001] and 1.05 [0.66–1.45; P < .0001]). If a child’s MDI score was <85, he or she was also more likely to have reading, math, and expressive language scores in the bottom 5%. Other factors that were independently associated with having scores in the bottom 5% at kindergarten were younger age at assessment, living in an impoverished household, primary language other than English, lower maternal education, and black maternal race. Being late preterm was an independent risk factor for having TSRS and reading score in the bottom 5%.
When comparing scores at 24 months and kindergarten (Table 3), 14.4% of LPIs and 9.3% of FTIs who had significant developmental delay at 24 months also scored <5% at kindergarten. The majority of LPIs improved score ranking between the 2 time points. The incidence of having TSRS <5% decreased as 24-month MDI score increased, so 98% of children who had a normal MDI at 24 months had a TSRS >5% at kindergarten.
The positive predictive value (ie, the probability that a child with an MDI of <70 at 24 months had a TSRS <5% at kindergarten) was 10.4% for the cohort, 9.3% for FTIs, and 14.4% for LPIs (Table 4). The negative predictive value (ie, the probability that a child with an MDI of >70 at 24 months had a TSRS >5% at kindergarten) was very high: 96.8% for the total cohort, 88.9% for LPIs, and 92.1% for FTIs.
In this nationally representative cohort of 5- to 6-year-olds born in 2001, we found that LPIs had significantly worse total school readiness, reading, math, and expressive language scores compared with FTIs. LPIs have higher odds of severe impairment, and those who have significant developmental delay at 24 months have increased odds of being severely impaired at school age. The predictive validity of having a TSRS in the bottom 5% given an MDI <70 at 24 months was poor. A child who tested within the normal range (ie, >85) at 24 months had an excellent chance of testing >5% at kindergarten. The majority of children improved in ranking between 24 months and kindergarten. In this cohort, late preterm children living in an impoverished household with primary language other than English, lower maternal education, and black maternal race were also at increased risk for TSRS <5% at kindergarten.
There are many reasons that LPIs are at increased risk for developmental delay and being less school ready. The late preterm brain is still developing and is potentially more susceptible to injury than the full-term brain,19,20 because much of its growth occurs under conditions that differ from those in the womb. Events occurring at this time involve the development of neurons and glia with organizational events at the cellular and molecular level.21
Interestingly, the majority of LPIs improved their score ranking between 24 months and kindergarten, demonstrating a poor positive predictive value for the MDI of the BSF-R. The basis for this change in scoring could not be ascertained from this study. We speculate that those children who had poor test scores at 24 months might have become eligible for early intervention services, which may have helped them improve over time. Those who continued to do poorly between the 2 time points were more likely to be characterized as having impoverished households, a primary language other than English, lower maternal education, and black maternal race, underscoring the strong effects of lower socioeconomic status on children’s development and their risk for worse development independent of gestational age, an effect that is well documented.22–27 Non-English-speaking families also tend to have reduced access to and utilization of high-quality pediatric health care, which identifies and helps with the effects of adverse physical and environmental exposures.28,29
Our results expand on the current literature in several ways. With this cohort, we were able to follow participants from infancy to school age and find deficiencies in LPIs compared with FTIs at 2 time points. Although there are higher odds of having worse school readiness when testing poorly at 24 months (MDI <70 as an independent risk factor for having poor school readiness), testing at 24 months is not a good predictor for school-age outcomes: the majority of those with scores <70 tested in the >5% range at kindergarten. Having an MDI >70 at 24 months, however, is a better predictor of having a TSRS >5%.
Although late preterm children have a greater risk for poorer performance at kindergarten compared with those who were full term, our results suggest that prediction of those who will not score well using the 24-month scores is poor. A study30 using the ECLS-B sample found that the patterns of development between 24 and 48 months were highly dynamic, with socioeconomic factors being very important predictors of development. Likewise, Romeo et al31 indicated a similar pattern in which LPIs had lower MDI scores compared with FTIs on the Bayley Scales of Infant Development at 12 and 18 months using uncorrected age. However, by 5 years of age, LPIs had IQs within the normal range.31,32
These results are consistent with other published literature in which LPIs are generally, but not always, found to be at a disadvantage for school. LPIs have been noted to have increased rates of learning difficulties at school age and require increased rates of special education.33 They also have increased risk of developmental delay and school suspension.33 LPIs compared with FTIs have increased odds of failure on first grade standardized testing including math, reading, and English language arts.34 Moderately preterm infants and LPIs have cognitive and emotional regulation difficulties that affect their functioning at school age, including slightly lower IQ scores and an increased incidence of attention, behavioral, and school problems compared with full-term children.35 They have also been found not to reach a good level of overall achievement in early school by teacher assessments using the Foundation Stage Profile.36
In contrast to reports of increased difficulties, 1 group out of Northern Ireland found equal testing scores for cognitive, language, and motor abilities between LPIs that required intensive care and those that did not, without FTI controls, in a homogeneous population.37 Gurka et al reported no differences between LPIs and FTIs in cognitive, achievement, behavioral/emotional, or social disability at 15 years of age.38 Likewise, a United States–based group found similar rates of learning disability and attention deficit/hyperactivity disorder diagnoses in LPIs and FTIs in a white, middle class community without ethnic and racial diversity.39
Our study cohort has the strengths of being a large nationally representative sample with robust follow-up at kindergarten and data collected from multiple sources including parents, birth certificates, and direct in-home assessments of children. The assessments were conducted by trained assessors with excellent quality control measures. Other measures were graded by a central grading institute, reducing the likelihood of bias.
Our study also has some potential limitations. First, we used the bottom 5% of testing at kindergarten because it is consistent with educational usage denoting substantial impairment. However, children at the 10% or 25% levels may also experience school difficulties. Using a less stringent definition of impairment would have increased the number of LPIs that met criteria for educational problems, but it is unclear whether children at these cutoffs would indeed have had such problems. Second, loss to follow-up may have influenced the results. However, the children who were not assessed at kindergarten were mostly FTIs born to educated, married, white mothers who scored in the normal range or higher on their 24-month testing, thus biasing the results toward a larger difference between LPIs and FTIs. Third, this was a secondary data analysis on a precollected database, so we were unable to assess severity of illness at birth, duration of neonatal hospitalization, neonatal morbidities, emergent versus elective deliveries, or usage of early intervention to further risk-stratify the population. Fourth, the testing performed has not been widely used in other research studies. Still, the questions were adapted from multiple well-known testing sources and validated before use, including field tests. Field staff was extensively trained to achieve reliable, standardized test administration. Because all children were tested with the same materials, the relative comparisons are still compelling. Last, the BSF-R was created for the ECLS-B and may not reflect a true MDI score from the BSID-II.
LPIs across multiple studies have more school-related challenges compared with FTIs. Because neurodevelopment is used as a measure of outcomes for different therapies as well as for parents to help guide decisions, it would be useful to have a measure that is precise and can predict long-term outcomes. Several authors have found poor predictability of MDI on longer-term outcomes.40,41 Without a reliable measure, developmental surveillance should be a priority for health care professionals involved with infants and children, as they are in a critical position to identify delays in development and facilitate intervention. These results also call for a prospective cohort study to better define risk factors for future school failure.
APPENDIX 1 Congenital Anomalies From Birth Certificate Data
Congenital anomalies from birth certificate data were anencephaly, spina bifida, hydrocephalus, microcephalus, other central nervous system anomaly, heart malformation, other circulatory/respiratory disorder, rectal atresia/stenosis, tracheo-esophageal fistula, omphalocele, gastroschisis, other gastrointestinal anomaly, malformed genitalia, renal agenesis, other urogenital anomaly, cleft lip/palate, polydactyly, syndactyly, adactyly, clubfoot, diaphragmatic hernia, other musculoskeletal anomaly, Down syndrome, other chromosomal anomaly, and other diagnosis without category.
APPENDIX 2 Languages Spoken
Languages spoken were English, Arabic, Chinese, Filipino, French, German, Greek, Italian, Japanese, Korean, Polish, Portuguese, Spanish, Vietnamese, African, East European, Native American, Sign Language, Middle Eastern, West European, Indian Subcontinent, Southeast Asian, Pacific Island, “cannot choose,” and “some other language (specify).”
- Accepted June 15, 2015.
- Address correspondence to Melissa Woythaler, Massachusetts General Hospital, Department of Newborn Medicine, 55 Fruit St, Founders 5, Boston, MA 02114. E-mail:
Dr Woythaler conceptualized and designed the study, conducted the analyses, and drafted the initial manuscript; Drs McCormick and Smith helped design the study and reviewed and revised the manuscript; Ms Mao helped design and conduct the statistical analysis; and all authors approved the final manuscript as submitted.
FINANCIAL DISCLOSURE: The authors have indicated they have no financial relationships relevant to this article to disclose.
FUNDING: No external funding.
POTENTIAL CONFLICT OF INTEREST: The authors have indicated they have no potential conflicts of interest to disclose.
- National Center for Health Statistics. Births: preliminary data for 2008. Centers for Disease Control and Prevention. Available at: www.cdc.gov/nchs/data/nvsr/nvsr58/nvsr58_16.pdf. Accessed January 16, 2013
- Celik IH, Demirel G, Canpolat FE, Dilmen U. A common problem for neonatal intensive care units: late preterm infants, a prospective study with term controls in a large perinatal center. J Matern Fetal Neonatal Med. 2013;26(5):459–462
- McCall E, Craig S, Neonatal Intensive Care Outcomes Research and Evaluation (NICORE) Group
- Raju TN, Higgins RD, Stark AR, Leveno KJ
- Huddy CL, Johnson A, Hope PL
- Lipkind HS, Slopen ME, Pfeiffer MR, et al. School-age outcomes of late preterm infants in New York City. Am J Obstet Gynecol. 2012;206(3):222.e1–222.e6
- Snow K, Derecho A, Wheeless S, et al
- National Center for Education Statistics
- Kinney H. The near-term (late preterm) human brain and risk for periventricular leukomalacia: a review. Semin Perinatol. 2006;30(2):81–88
- Kumanyika S, Grier S. Targeting interventions for ethnic minority and low-income populations. Future Child. 2006;16(1):187–207
- Duke NK
- Blendon RJ, Buhr T, Cassidy EF, et al
- Quigley MA, Poulsen G, Boyle E, et al
- Roberts G, Anderson PJ, Doyle LW, Victorian Infant Collaborative Study Group
- Copyright © 2015 by the American Academy of Pediatrics