OBJECTIVES: Using a large, racially diverse US dataset, we aimed primarily to: (1) fit and validate sex-specific birth weight and head circumference for gestational age charts for infants born at 22 to 29 weeks’ gestation; and (2) fit race-specific birth weight and head circumference for gestational age charts.
METHODS: We used data collected between 2006 and 2014 on 183 243 singleton infants without congenital malformations with gestational age between 22 weeks, 0 days and 29 weeks, 6 days from 852 US members of the Vermont Oxford Network. For the sex-specific charts, the final sample included 156 587 infants. From these 156 587, we abstracted a subset of 47 005 infants to fit sex-specific charts separately for white, black, and Asian infants. For all charts, we applied quantile regression models to predict infants’ birth weight and head circumference percentiles from gestational age expressed in days.
RESULTS: We successfully validated the overall sex-specific charts. Over most of the gestational age range, black infants, either girls or boys, had the lowest predicted birth weight as compared with white and Asian infants for many percentiles.
CONCLUSIONS: We fitted and validated new sex-specific charts using a recent, large, and racially diverse dataset. Future steps include using these charts to examine associations of weight and head circumference at birth with mortality and morbidity.
- BW: birth weight
- CI: confidence interval
- GA: gestational age
- GAMLSS: generalized additive models for location, scale, and shape
- HC: head circumference
- IUGR: intrauterine growth restriction
- LGA: large for gestational age
- LMP: last menstrual period
- NICHD: National Institute of Child Health and Human Development
- QR: quantile regression
- SGA: small for gestational age
- VON: Vermont Oxford Network
What’s Known on This Subject:
Birth weight and head circumference for gestational age charts for preterm infants, particularly those born at the lower end of extreme prematurity, have been limited by small study samples. No recent race-specific charts are available for preterm infants.
What This Study Adds:
The new sex- and race-specific birth weight and head circumference charts for infants born between 22 weeks, 0 days and 29 weeks, 6 days of gestation, fitted using a recent, large US dataset, provide a reference for assessing size at birth.
Preterm infants, particularly those born at lower gestational ages, have high risks of mortality, morbidities, and neurodevelopmental impairment.1,2 Growth restriction, usually defined as small for gestational age (SGA), further raises these risks among preterm infants,3,4 making it important to accurately identify SGA infants. Anthropometric charts assessing birth weight (BW) for gestational age (GA) have been published to improve on previous charts.5–7 However, there are no recent charts for infants born at 22 weeks’ gestation, and recent charts for 23-week infants are based on small sample sizes.7 Additionally, although the argument has been made that sex-specific charts are needed given birth-size sex differences,7 too few infants of each race were available to develop race-specific charts. On average, BW racial differences are as large as sex differences; girl infants are 95 g lighter than boy infants, and white and Hispanic infants are 90 g larger than black infants.8
Our primary aim was to fit and validate up-to-date sex-specific charts to assess weight and head circumference (HC) at birth for GA among infants born 22 to 29 weeks’ gestation. Our secondary aim was to fit race-specific charts to assess BW and HC for GA using a large, current, racially diverse US dataset. We also compared our charts to the currently used charts, specifically the Olsen charts.7
We used cross-sectional data collected by 852 centers with NICUs located in the United States or Puerto Rico and participating in the Vermont Oxford Network (VON) Very Low Birth Weight Database between January 1, 2006 and December 31, 2014. Eligible infants had a GA between 22 weeks, 0 days and 29 weeks, 6 days or a BW 401 to 1500 g. The committee for human research at the University of Vermont approved use of the de-identified VON research repository for this analysis.
Definitions of Study Variables
GA in weeks and days as defined in the VON manual of operations was determined using obstetrical measures based on last menstrual period (LMP) and prenatal ultrasound in the maternal chart or, if unavailable, a neonatologist’s estimate based on postnatal physical examinations.9 BW was recorded from labor and delivery or, if unavailable, weight on admission to the neonatal unit was used.9 HC was recorded on the day of birth or, if unavailable, the first HC measurement the day after was used.9 Race/ethnicity was recorded primarily as selected by the mother and secondarily by birth certificate or medical record review according to federally funded study guidelines. This definition was adopted in 2012 and, as such, race-specific charts are based on years 2012 to 2014. The prenatal care indicator was defined as positive if the mother received any prenatal obstetrical care before admission.9
Centers self-reported their NICU level of care.10,11 Hospital regions were grouped as: New England, Middle Atlantic, East North Central, West North Central, South Atlantic, East South Central, West South Central, Mountain, Pacific, and Puerto Rico.9
Small for gestational age (SGA) and large for gestational age (LGA) were defined as BW below the 10th or above the 90th percentile, respectively, of sex-specific BW for GA.
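The SGA and LGA definition can be sketched as a small classifier. This is an illustrative sketch only: the function name, the "AGA" label for infants between the two cutoffs, and the cutoff values in the usage note are assumptions, not values taken from the charts.

```python
def classify_size_for_ga(bw_g, p10_g, p90_g):
    """Return 'SGA', 'LGA', or 'AGA' for one infant.

    bw_g  : observed birth weight in grams
    p10_g : chart 10th percentile for the infant's sex and GA (grams)
    p90_g : chart 90th percentile for the infant's sex and GA (grams)
    """
    if bw_g < p10_g:
        return "SGA"  # below the 10th percentile
    if bw_g > p90_g:
        return "LGA"  # above the 90th percentile
    return "AGA"      # assumed label for everything in between
```

For example, with hypothetical cutoffs of 500 g and 800 g, `classify_size_for_ga(450, 500, 800)` falls in the SGA group.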
Implausible BW and HC values were defined as values below or above given cutoffs. These were determined for different GAs using 3 alternative methods: (1) Tukey’s rule,12 (2) a rule based on 4 SDs from the median, and (3) a rule based on data sparsity.13 Rules 1 and 2 produced particularly low, sometimes negative, cutoffs for BW. Thus, rule 3 was preferred.
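To see why rules 1 and 2 can give very low cutoffs, the sketch below computes both on a small made-up sample. The multiplier `k = 1.5` for Tukey's rule, the use of the sample SD, and the birth weights themselves are assumptions for illustration; the data-sparsity rule (rule 3) is not reproduced here because its details are not given in the text.

```python
import statistics

def tukey_cutoffs(values, k=1.5):
    """Tukey fences (Q1 - k*IQR, Q3 + k*IQR); k=1.5 is an assumed default."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def sd_cutoffs(values, n_sd=4):
    """Median minus/plus n_sd sample standard deviations."""
    med = statistics.median(values)
    sd = statistics.stdev(values)
    return med - n_sd * sd, med + n_sd * sd

# Made-up birth weights (grams) for a single GA stratum, illustration only.
bw = [420, 480, 510, 530, 560, 590, 610, 650, 700, 760, 820, 900]
low_t, high_t = tukey_cutoffs(bw)
low_s, high_s = sd_cutoffs(bw)
print(f"Tukey: {low_t:.0f} to {high_t:.0f} g; 4 SD: {low_s:.0f} to {high_s:.0f} g")
```

In this toy sample both lower cutoffs fall far below the smallest observed weight, so neither would flag any implausibly low value.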
Our study sample was restricted to inborn, singleton infants with GAs between 22 weeks 0 days and 29 weeks 6 days inclusive with no congenital malformations (n = 183 243). We excluded infants with unknown gender (30), missing (71) or implausible (744) BW, missing (24 706) or implausible (860) HC, missing hospital length-of-stay (78), or who were hospitalized >1 year (565). Overall, 26 656 (14.5%) infants were excluded, leaving 156 587 infants for our main analyses. For comparing our charts with the Olsen charts,7 we additionally excluded infants who died before hospital discharge (22 834, 12.5%), resulting in 133 753 infants. As shown in Supplemental Table 4, infants who died were more likely to be growth-restricted in utero than those surviving to hospital discharge. All analyses described below were stratified by sex.
For our primary aim, 60% and 40% of the observations in the selected sample (n = 156 587) were randomly assigned to, respectively, a training or a validation dataset. The former provided data to fit the growth charts and assess their goodness of fit; the latter provided data, independent from the training dataset, to select the final model among the best-fitting models. The same process was repeated for the sample (n = 133 753) that excluded deaths for comparison with the Olsen charts.7 For our secondary aim, the selected sample (n = 156 587) was restricted to infants with available information on race/ethnicity (n = 53 087). Race-specific charts were fitted only for groups with an adequate sample size. Therefore, we confined our analysis to white (50.3%), black (34.3%), and Asian (3.9%) infants; Hispanics constituted only 12.5%. In contrast, we did not use data on 6082 (11.5%) infants, of whom 0.93% were classified as American Indian or Alaska Native, 0.48% as Native Hawaiian or other Pacific Islander, and 10.0% as other race. Finally, we tested whether BW for GA differed significantly by race using a likelihood ratio test.14
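A random 60%/40% assignment of this kind can be sketched as follows. The function name, the fixed seed, and the shuffle-then-cut approach are assumptions; the exact randomization procedure used in the study is not described, so the resulting group sizes need not match the reported ones.

```python
import random

def split_train_validation(records, train_fraction=0.6, seed=2016):
    """Shuffle a copy of `records` and cut it into (training, validation)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]
```

Because all analyses were stratified by sex, such a split would plausibly be applied to girls and boys separately.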
We assumed that BW and HC percentiles vary smoothly with GA and investigated 2 alternative modeling approaches: one based on generalized additive models for location, scale, and shape (GAMLSS),15 which includes the LMS model16 as a special case; the other based on quantile regression (QR).17 Nine percentiles were included in the analysis: third, fifth, 10th, 25th, 50th, 75th, 90th, 95th, and 97th.
We considered the following GAMLSS: (model 1) the LMS model16 with 3 parameters, 1 accounting for skewness; (model 2) the power exponential model15 with 4 parameters, 2 accounting for skewness and kurtosis; and the following QR models: (model 3) QR with smoothing B-splines18 and (model 4) transformation-based QR.19
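For reference, the LMS model (model 1) is usually written as below. This is the standard textbook form rather than a formula quoted from the study; the symbols y (BW or HC), t (GA), and z_α (the standard normal quantile for probability α) are notation introduced here.

```latex
% L(t): Box-Cox power (skewness), M(t): median, S(t): coefficient of variation
z =
\begin{cases}
  \dfrac{\left( y / M(t) \right)^{L(t)} - 1}{L(t)\, S(t)}, & L(t) \neq 0, \\[2ex]
  \dfrac{\ln\left( y / M(t) \right)}{S(t)}, & L(t) = 0,
\end{cases}
\qquad
C_{100\alpha}(t) = M(t) \left[ 1 + L(t)\, S(t)\, z_{\alpha} \right]^{1 / L(t)}
\quad (L(t) \neq 0).
```

Under this model the z-scores are assumed to be standard normal, which is the assumption the worm plots and normality tests check.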
Models 1, 2, and 3 were fitted using different levels of smoothing. Model 4 was fitted using different transformation families (Box-Cox, Aranda-Ordaz, and Geraci-Jones). During this stage, we identified the 2 best-fitting models, one among GAMLSS and one among QR models, by evaluating the goodness of fit tests,14,20 worm plots,21 Shapiro-Wilk test of normality of the z-scores, and the mean squared differences between predicted (from the model) and conditional sample percentiles. Models 1 and 4 (with Geraci-Jones transformation) outperformed all the others when fitting BW for GA curves, either for girls or boys. Similarly, models 1 and 4 (with Aranda-Ordaz transformation) performed best when fitting curves for HC.
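One of the criteria above, the mean squared difference between predicted and conditional sample percentiles, can be sketched as follows. The nearest-rank percentile definition, the grouping of observations by GA, and all names are assumptions made for illustration; the study does not specify these details.

```python
import math

def sample_percentile(values, p):
    """Nearest-rank sample percentile (p in 0-100) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

def mean_squared_difference(observed_by_ga, predicted_by_ga, p):
    """Average over GA groups of (predicted - sample percentile) squared.

    observed_by_ga : dict mapping GA (e.g. in days) to observed values
    predicted_by_ga: dict mapping the same GA keys to the model's
                     predicted p-th percentile
    """
    diffs = [
        (predicted_by_ga[ga] - sample_percentile(vals, p)) ** 2
        for ga, vals in observed_by_ga.items()
    ]
    return sum(diffs) / len(diffs)
```

A smaller value indicates that the fitted curve tracks the empirical percentile more closely across GAs.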
In the validation step, the performance of models 1 and 4 was compared using the validation dataset by the criteria described above (worm plots, mean squared differences, z-scores) and additional criteria as follows. Fitted percentile curves for either BW or HC were used to calculate SGA and LGA rates in the validation dataset, overall and by GA intervals (22–25, 26–27, and 28–29 weeks). For the latter, 100 × (1 − 0.05/3)% = 98.3% confidence intervals (CIs) were also calculated.
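The 98.3% level follows from dividing the 5% error rate across the 3 GA intervals (a Bonferroni adjustment). The sketch below computes such an interval for an SGA or LGA rate. The normal-approximation (Wald) interval is an assumption, since the text does not say which interval method was used, and the counts in the usage note are hypothetical.

```python
from statistics import NormalDist

def bonferroni_wald_ci(events, n, alpha=0.05, n_comparisons=3):
    """Return (confidence level, lower, upper) for a proportion.

    The per-interval error rate is alpha / n_comparisons, giving a
    confidence level of 1 - alpha / n_comparisons.
    """
    adjusted_alpha = alpha / n_comparisons
    level = 1 - adjusted_alpha
    z = NormalDist().inv_cdf(1 - adjusted_alpha / 2)  # two-sided quantile
    rate = events / n
    half_width = z * (rate * (1 - rate) / n) ** 0.5   # Wald approximation
    return level, max(0.0, rate - half_width), min(1.0, rate + half_width)
```

Calling `bonferroni_wald_ci(100, 1000)` for a hypothetical 100 SGA infants out of 1000 gives a 98.3% interval that can be checked against the expected 10%.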
We conducted an analysis to assess the sensitivity of the final fitted percentile curves to specific exclusion criteria (ie, exclusion of implausible values of BW and HC and of infants with missing values) by using multiple imputation by chained equations.22 The matrix feeding into the imputation algorithm consisted of BW, HC, GA, Apgar scores at 1 and 5 minutes, hospital level of care, NICU type, region, and prenatal care indicator.
The final sample size (n = 156 587) consisted of 93 951 observations for the training dataset and 62 636 observations for the validation dataset. Race-specific charts included data on 47 005 infants. Table 1 shows overall means and SDs of BW and HC and GA-specific sample sizes by sex as well as by sex and race.
Model 4 was selected as the final model to produce the charts for US girls and boys. BW and HC for GA-predicted percentiles and charts for girls and boys are presented in Fig 1 and Table 2 (percentiles and charts for model 1 are shown in Supplemental Table 5 and Supplemental Fig 5).
Figure 2 shows SGA and LGA rates by GA interval and 98.3% CIs as calculated from the validation sample. The expected proportion of SGA and LGA is 10% in both cases. All SGA rates were close to 10%, or their CIs included the expected proportion. This was also true for LGA rates, except for girls at 26 to 27 weeks’ gestation showing an LGA rate above the expected 10% (Fig 2). These results were not sensitive to imputation of missing and implausible values (data not shown).
Figure 3 compares our BW for GA charts that excluded deaths (predicted percentiles and charts for girls and boys excluding deaths are provided in Supplemental Fig 6 and Supplemental Table 6) with the Olsen charts.7 In general, the Olsen percentiles were shifted upwards for both girls and boys except at 23 weeks for the 10th percentile, where the 2 charts overlapped. This is also shown in Table 3 using the validation data with both sexes combined, where the Olsen SGA rates were higher than our SGA rates by 4.6%, 6.5%, and 5.9% for the 3 groups of 22 to 25, 26 to 27, and 28 to 29 weeks’ gestation, respectively. Comparison with our charts that did not exclude deaths additionally shifted our percentiles below Olsen’s (Supplemental Fig 7).
Figure 4 compares race-specific predicted percentiles of BW for GA. For most GAs and percentiles, white and Asian infants had higher predicted BW than black infants for either girls or boys. BW curves for white and Asian infants overlapped or crossed at several GAs. BW for GA in girls differed by race at the 95th (P = .007) and 97th (P < .001) percentiles. For boys, there were differences at the third (P = .030), 25th (P < .001), 75th (P = .007), and 97th (P = .030) percentiles (data not shown). Supplemental Tables 7 and 8 show race-specific percentiles of BW and HC for GA in girls and boys, respectively, and charts of predicted race-specific percentiles are displayed in Supplemental Figs 8, 9, 10, and 11.
We defined new BW and HC for GA sex-specific charts for infants at 22 to 29 weeks’ gestation using data for >156 000 infants from the VON database. In addition, we developed sex- and race-specific charts for white, black, and Asian infants. Over most of the GA range, black infants had the lowest predicted BW as compared with white and Asian infants for many percentiles. In contrast, BW predictions for Asian and white infants overlapped frequently across different GAs. Our SGA rates were still lower after excluding deaths than those calculated using Olsen’s charts for all GAs except at 23 weeks, in which case our rates and Olsen’s rates overlapped. In 2014, infants born between 22 and 29 weeks’ gestation based on LMP estimates constituted only 1% (41 672) of all US births.25 Data collected by VON on these infants represented 88% of all 22- to 29-week US births. Our sample is representative of the US racial distribution of births. Using national estimates to examine GA distribution based on LMP, black infants are overrepresented at these lower GAs (black infants at 22–29 weeks, 31.7% [US] vs 34.3% [VON]) compared with their overall representation at all GAs among US births (black infants overall, 16%).25
Determining the inclusion and exclusion criteria is an important step in creating weight for GA charts. Two types of charts exist: reference and standard charts. Standard charts are prescriptive, define how a population should grow under optimal environmental and health conditions, and are based on low-risk pregnancies. Reference charts are descriptive, include both low-risk and high-risk pregnancies, and indicate growth in a particular place and time.26,27 There are numerous BW for GA reference charts (at least 26) comparing growth of newborns against the general population.28 Our proposed charts are reference charts. However, BW for GA charts for preterm infants differ from ultrasound-based fetal weight charts. Preterm infants are smaller than in-utero infants given the pathologic conditions that resulted in their preterm birth.29–31 As such, charts for preterm infants might still underestimate the true intrauterine growth restriction (IUGR) prevalence.32 Despite their limitations, these charts are widely used in hospitals to provide a reference for expected BW at different GAs.
Two groups recently published fetal growth standards using ultrasound to measure fetal anthropometry. The INTERGROWTH-21st Project recruited low-risk pregnant women from countries with a diverse population mix, pooling ultrasonographic measurements to construct a single fetal growth standard assuming no differences between the populations.33,34 The INTERGROWTH-21st group also published sex-specific BW for GA standards at 33 to 42 weeks’ gestation, because too few women gave birth before 33 weeks given the low-medium risk group of recruited pregnant women.35 In the United States, on the other hand, the National Institute of Child Health and Human Development (NICHD) fetal growth study followed low-risk women and defined race/ethnic-specific charts arguing for significant differences in fetal growth among non-Hispanic white, non-Hispanic black, Hispanic, and Asian infants.36 In our study, we used a more recent federal definition of race37 to examine differences between white, black, and Asian infants. We found some differences in the predicted percentiles of BW for GA among the examined groups. How these differences translate to associations with outcomes should be examined in future studies.
Of particular relevance to preterm infants is SGA status assessment, a proxy for IUGR. Serial fetal ultrasonography is the gold standard for IUGR diagnosis but SGA status is often used in settings where sonographic assessment of intrauterine growth is not readily available. Infants born both preterm and SGA have the highest risk of adverse outcomes; therefore, it is essential to identify them for secondary and tertiary prevention of mortality and adverse outcomes.38,39 In a study using VON data, SGA status (using the 1993 US Center for Health Statistics natality database) among infants born at 25 to 30 weeks' gestation was associated with increased odds of neonatal mortality, necrotizing enterocolitis, and respiratory distress syndrome.3 In another study using NICHD data, SGA status (using Olsen’s norms) among infants <27 weeks was associated with increased odds of mortality, postnatal growth failure, and the combined outcome of death or neurodevelopmental impairment.4
We compared our charts to Olsen’s because the latter are relatively recent and commonly cited in US studies of extremely preterm infants.4,40 Olsen’s charts used data from the Pediatrix database (1998–2006) after excluding in-hospital deaths to generate sex-specific charts for infants 23 to 41 weeks’ gestation (n = 11 377 for infants 23–29 weeks’ gestation) admitted to the NICU.7 Overall, our sex-specific charts predicted lower SGA rates compared with Olsen’s even after excluding deaths. Examining the percentiles of infants who died shows that they are more likely to be growth restricted than infants who survived. To fit and validate our models, we used GA in number of days instead of rounding down to the last completed week as is commonly done. Our approach preserves more information in the data. Indeed, as also outlined in the NICHD workshop on periviable births, the division between one week and the next is arbitrary because it does not represent continuous growth and development.41 Rounding GAs to whole weeks implies that 2 infants, one assigned to a given week and the other to the week after, can differ in GA by as little as 1 day or by as much as 13 days.41 Using number of days rather than whole weeks is particularly important for infants born at 22, 23, or 24 weeks’ gestation because each additional day of gestation can have a meaningful impact on clinical outcomes and on the decision to initiate active treatment in hospitals.42
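The arithmetic behind the 1-day and 13-day figures can be checked with a short sketch. The function names are illustrative and not part of the study.

```python
def ga_to_days(weeks, days):
    """Convert GA in completed weeks plus days to total days."""
    if not 0 <= days <= 6:
        raise ValueError("days must be between 0 and 6")
    return 7 * weeks + days

def completed_weeks(total_days):
    """Round GA down to completed weeks, as week-based charts do."""
    return total_days // 7

# Infants assigned to adjacent weeks (22 and 23) after rounding down:
closest = ga_to_days(23, 0) - ga_to_days(22, 6)   # smallest possible gap
farthest = ga_to_days(23, 6) - ga_to_days(22, 0)  # largest possible gap
print(closest, farthest)
```

The same conversion gives the limits of the study range: 22 weeks, 0 days is day 154 and 29 weeks, 6 days is day 209.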
Yet, preliminary analyses on our data showed that the proportion of infants born at GAs equal to multiples of 7 was higher than expected by chance, probably as a result of rounding during data collection at participating hospitals. This partly explains the poorer performance of prediction methods based on local smoothing, including the LMS approach, which produced curves with a “bumpy” appearance. We therefore reexamined our data: first rounding all observations in the training dataset down to the last completed week, then predicting BW for GA percentiles using the LMS approach, and finally calculating SGA rates using the validation dataset. The newly developed LMS curves appeared more regular (less “bumpy”) but now shifted upward because rounding down places heavier infants in the same GA group as infants with a smaller number of gestational days. Indeed, SGA rates calculated from the LMS charts fitted on rounded GAs were closer to those obtained from the Olsen charts and, in general, much above 10%.
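A simple check for this kind of heaping on whole weeks is sketched below. It assumes that, absent rounding, about 1 in 7 births would fall on an exact week boundary; this is only a rough benchmark, and the function name is hypothetical. The test actually used in the preliminary analyses is not described in the text.

```python
def heaping_ratio(ga_days):
    """Observed share of GAs on exact week boundaries, divided by 1/7.

    ga_days: iterable of gestational ages in days.
    A ratio well above 1 suggests rounding to whole weeks.
    """
    ga_days = list(ga_days)
    on_boundary = sum(1 for d in ga_days if d % 7 == 0)
    return (on_boundary / len(ga_days)) * 7
```

For GAs spread evenly over days 154 to 209 the ratio is 1; values well above 1 would point to rounding at data collection.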
Although our charts are based on a large sample representative of the entire US 22 to 29 weeks’ gestation preterm population, our study has some limitations. We had no data on birth length for GA or on syndromic conditions that impair fetal growth. And although the GA definition is standardized across the VON sites, we had no information on which method was used for GA assessment for each infant. Early ultrasound (accuracy ±5 to ±7 days based on trimester) ideally in the first trimester is the gold standard for GA assessment.43 GA assessment based on LMP date has lower accuracy (±14 days) given varying cycle lengths in women, ovulation/conception timing, and recall error.44 The accuracy of newborn examination (±13 days for Dubowitz examination) depends on the complexity of the score used and the examiner’s skill level.43 More accurate methods/biomarkers to assess GA, fetal growth, and maturity, particularly among extremely preterm infants, are needed. Interpreting the significance of the differences in the racial charts requires caution. Race is a complex construct with phenotypic heterogeneity.45 However, this definition is used for self-identification, census, and research purposes.37
We developed and validated BW and HC for GA sex-specific charts for infants born at 22 to 29 weeks’ (or 154–209 days’) gestation. Although we excluded deaths only for the purpose of comparing our charts with Olsen’s, we recommend using the charts that included deaths when assessing BW and HC at birth because survival status is unknown at that point in time. We will use this large VON dataset to assess whether a sex-specific or a race-specific SGA definition better predicts neonatal mortality and morbidities. Findings from subsequent studies should be informative in developing predictive models for outcomes, especially for infants at the periviable gestation of 22 to 25 weeks.41
We thank our medical and nursing colleagues and the infants and their parents who agreed to take part in this study. Participating centers are listed in Supplemental Table 9.
- Accepted September 20, 2016.
- Address correspondence to Nansi S. Boghossian, PhD, MPH, Department of Epidemiology and Biostatistics, Arnold School of Public Health, University of South Carolina, Columbia, SC, 29208. E-mail:
FINANCIAL DISCLOSURE: Dr. Horbar and Ms Morrow are employees of Vermont Oxford Network and Dr. Edwards receives salary support from Vermont Oxford Network. The other authors have indicated they have no financial relationships relevant to this article to disclose.
FUNDING: No external funding.
POTENTIAL CONFLICT OF INTEREST: The authors have indicated they have no potential conflicts of interest to disclose.
COMPANION PAPER: A companion to this article can be found online at www.pediatrics.org/cgi/doi/10.1542/peds.2016-3128.
- Horbar JD, Carpenter JH, Badger GJ, et al
- Stoll BJ, Hansen NI, Bell EF, et al; Eunice Kennedy Shriver National Institute of Child Health and Human Development Neonatal Research Network
- De Jesus LC, Pappas A, Shankaran S, et al; Eunice Kennedy Shriver National Institute of Child Health and Human Development Neonatal Research Network
- Vermont Oxford Network
- American Academy of Pediatrics Committee on Fetus and Newborn
- Stark AR; American Academy of Pediatrics Committee on Fetus and Newborn
- Joseph KS, Kramer MS, Allen AC, Mery LS, Platt RW, Wen SW
- Koenker R
- Geraci M, Jones MC
- van Buuren S, Groothuis-Oudshoorn K
- R Foundation
- Geraci M
- Centers for Disease Control and Prevention
- Cooke RW
- Ehrenkranz RA
- Papageorghiou AT, Ohuma EO, Altman DG, et al; International Fetal and Newborn Growth Consortium for the 21st Century (INTERGROWTH-21st)
- Villar J, Papageorghiou AT, Pang R, et al; International Fetal and Newborn Growth Consortium for the 21st Century (INTERGROWTH-21st)
- Villar J, Cheikh Ismail L, Victora CG, et al; International Fetal and Newborn Growth Consortium for the 21st Century (INTERGROWTH-21st)
- Humes KR, Jones NA, Ramirez RR
- Kozuki N, Katz J, Christian P, et al; Child Health Epidemiology Reference Group Preterm Birth–SGA Working Group
- Boghossian NS, Hansen NI, Bell EF, et al; Eunice Kennedy Shriver National Institute of Child Health and Human Development Neonatal Research Network
- Raju TN, Mercer BM, Burchfield DJ, Joseph GF Jr
- Kaufman JS, Cooper RS
- Copyright © 2016 by the American Academy of Pediatrics