# Estimating Overweight Risk in Childhood From Predictors During Infancy

## Abstract

**OBJECTIVE:** The aim of this study was to develop and validate a risk score algorithm for childhood overweight based on a prediction model in infants.

**METHODS:** Analysis was conducted by using the UK Millennium Cohort Study. The cohort was divided randomly by using 80% of the sample for derivation of the risk algorithm and 20% of the sample for validation. Stepwise logistic regression determined a prediction model for childhood overweight at 3 years defined by the International Obesity Task Force criteria. Predictive metrics *R*^{2}, area under the receiver operating curve (AUROC), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated.

**RESULTS:** Seven predictors were found to be significantly associated with overweight at 3 years in a mutually adjusted predictor model: gender, birth weight, weight gain, maternal prepregnancy BMI, paternal BMI, maternal smoking in pregnancy, and breastfeeding status. Risk scores ranged from 0 to 59 corresponding to a predicted risk from 4.1% to 73.8%. The model revealed moderately good predictive ability in both the derivation cohort (*R*^{2} = 0.92, AUROC = 0.721, sensitivity = 0.699, specificity = 0.679, PPV = 38%, NPV = 87%) and validation cohort (*R*^{2} = 0.84, AUROC = 0.755, sensitivity = 0.769, specificity = 0.665, PPV = 37%, NPV = 89%).

**CONCLUSIONS:** Using a prediction algorithm to identify at-risk infants could reduce levels of child overweight and obesity by enabling health professionals to target prevention more effectively. Further research needs to evaluate the clinical validity, feasibility, and acceptability of communicating this risk.

- 95% CI: 95% confidence interval
- AUROC: area under the receiver operating curve
- HCP: health care professional
- IOTF: International Obesity Task Force
- MCS: Millennium Cohort Study
- NPV: negative predictive value
- OR: odds ratio
- PPV: positive predictive value

#### What’s Known on This Subject:

Several risk factors for both overweight and obesity in childhood are identifiable during infancy.

#### What This Study Adds:

A simple risk algorithm can be used to quantify risk of overweight in children. It can be used to help identify at-risk infants in a clinical setting to facilitate targeted intervention.

In the United Kingdom in 2010, ∼3 in 10 boys and girls (aged 2 to 15) were classed as either overweight or obese.^{1} Rapid weight gain during infancy is associated with obesity between 6 and 8 years of age^{2}^{–}^{4} and in later life,^{5}^{–}^{7} and although estimates vary, between 25%^{8} and 33%^{2} of infants gain weight rapidly during the first 6 months after birth. There is evidence that weight at 5 years of age is a good indicator of the future health of a child^{9} and that obesity during childhood increases the risk of adult obesity. This has a clearly measurable impact on physical and mental health and on quality of life, and it generates considerable direct and indirect costs.^{10} Thus, there is a compelling rationale for identifying those infants at greatest risk.

UK health policy suggests primary prevention and evidence-based interventions are important.^{11}^{,}^{12} However, there is little guidance for health care professionals (HCPs) to support identification of infants at risk for developing childhood obesity. In the United Kingdom, health visitors and their team members deliver the Healthy Child Program^{13} to parents of children younger than 5 years old. Studies have revealed that members of the health visiting team lacked guidance around identifying and intervening with infants who gain weight rapidly^{14} and have low levels of knowledge about obesity risk.^{15} The US Institute of Medicine has introduced early childhood obesity prevention guidance^{16} suggesting that HCPs should undertake regular growth monitoring and consider obesity risk factors during infancy. A recent systematic review^{17} has identified early-life risk factors of overweight in childhood thus offering the potential to develop a useful tool to identify infants at risk for obesity. Therefore, the aim of this study was to develop and validate a risk score algorithm for overweight in childhood based on predictors identified in the first year by using a large and contemporary British birth cohort.

## Methods

### Participants

The Millennium Cohort Study (MCS) is a contemporary prospective birth cohort in the United Kingdom. Full details of the data collection and sampling design are provided elsewhere.^{18} The study cohort analyzed data from 18 296 singleton infants aged 6 months to 12 months at the first interview. Preterm infants, multiple births, infants with congenital malformations, and specific medical conditions (diabetes, renal disease) were excluded from the analysis because these children have potentially different growth trajectories. The mean age of infants at the first interview was 9.2 months (SD 0.53). Children at follow-up (second interview) ranged from 31.9 months to 51.8 months of age with a mean age of 37.7 months (SD 2.5). The analysis was restricted to 13 513 singleton children who had complete anthropometric data at follow-up. The sample was divided into 2 cohorts: 80% of the sample was randomly selected to a derivation cohort for the development of the risk algorithm while the remaining 20% was used to validate the risk algorithm.
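The random 80%/20% division described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `split_cohort`, the seed, and the use of integer child identifiers are all assumptions, and the text does not say how the randomization was implemented.

```python
import random


def split_cohort(ids, derivation_fraction=0.8, seed=2013):
    """Randomly partition ids into (derivation, validation) lists.

    The seed is arbitrary and only makes the sketch reproducible.
    """
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * derivation_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical cohort of 1000 children identified by integers.
derivation, validation = split_cohort(range(1000))
```

Every child lands in exactly one of the two lists, so the validation cohort plays no part in fitting the model.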

### Outcome Measure

The primary clinical outcome for childhood overweight at 3 years was defined by the International Obesity Task Force (IOTF)^{19} gender and age-specific cutoffs corresponding to an adult BMI ≥25 kg/m^{2} (girls: ≥18.02 kg/m^{2}; boys: ≥18.41 kg/m^{2}). The outcome at 3 years was chosen as it was the mean age at follow-up where a diagnosis of overweight in early childhood could be made (there is no standard definition of overweight in children younger than 2 years of age).^{19}
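As a rough illustration of the outcome definition, the sketch below classifies a child as overweight by using only the two cutoffs quoted above. The function and constant names are hypothetical, and the full IOTF tables (which vary with age) are not reproduced here.

```python
# IOTF overweight cutoffs at 3 years quoted in the text (kg/m^2).
IOTF_OVERWEIGHT_CUTOFF_AGE3 = {"girl": 18.02, "boy": 18.41}


def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def is_overweight_age3(weight_kg, height_m, sex):
    """True if BMI meets or exceeds the sex-specific cutoff."""
    return bmi(weight_kg, height_m) >= IOTF_OVERWEIGHT_CUTOFF_AGE3[sex]
```

Because the cutoff is lower for girls, the same weight and height can be classified differently by sex.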

### Risk Factors

Predictors in early-life were based on questions obtained from the first parent interview when infants were between 6 and 12 months. Predictors were selected based on a comprehensive systematic review^{17} conducted by the research team on infant risk factors of overweight in childhood. Risk factors that were identified in Weng et al^{17} as significantly associated with overweight in childhood were considered a priori. In total, 33 potential predictor variables were investigated across several categories (Supplemental Table 6). The majority of variables were presented in the MCS as categorical. However, several variables were dichotomized or categorized for logistic regression. Infant birth weight was categorized in quintiles, and rapid weight gain was defined as weight gain >0.67 SD change in weight-for-age *z* score in the infant’s first year. This definition of rapid weight gain has been commonly used in other studies^{6}^{,}^{20}^{,}^{21} and can be interpreted as crossing centile lines on a growth chart. Maternal prepregnancy BMI and paternal BMI were classified in categories: <18.5 kg/m^{2}; 18.5 to <25 kg/m^{2}; 25 to <30 kg/m^{2}; or ≥30 kg/m^{2}.
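The two categorizations described above might be coded as in the following sketch. The function names and category labels are illustrative assumptions, and the weight-for-age *z* scores are taken as already computed from a growth reference.

```python
RAPID_GAIN_THRESHOLD_SD = 0.67  # threshold stated in the text


def rapid_weight_gain(z_birth, z_first_year):
    """True if the weight-for-age z score rose by more than 0.67 SD."""
    return (z_first_year - z_birth) > RAPID_GAIN_THRESHOLD_SD


def bmi_category(bmi):
    """Map a parental BMI (kg/m^2) to the four categories used in the study."""
    if bmi < 18.5:
        return "<18.5"
    if bmi < 25:
        return "18.5 to <25"
    if bmi < 30:
        return "25 to <30"
    return ">=30"
```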

### Statistical Analysis

Univariate logistic regression was used to test the association between each potential predictor variable and overweight in childhood. Because the variables were categorical, the likelihood ratio test was used to assess the significance of individual predictor variables. Variables were considered statistically significant if the likelihood ratio *P* value was <.05. Predictor variables that were significant in the univariate analyses were included in a mutually adjusted model. Stepwise regression analysis, which can optimize the model by maximizing independence among predictors, was used to determine the best predictor model for overweight at 3 years.

### Derivation and Validation of the Risk Algorithm

The risk prediction algorithm was developed and validated by using methods established in previous studies.^{22}^{–}^{25} Using the mutually adjusted predictor model, we created an algorithm based on the relative strengths of β coefficients from logistic regression. A risk score was devised by assigning integer values to variable categories. All β coefficients in the adjusted model were divided by the β with the smallest value to obtain the relative strengths of each category. The value rounded to the nearest whole number was the assigned score. Reference categories were assigned an integer value of 0.
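The scoring rule above (divide each β by the smallest β, then round) can be sketched as follows. The β values here are illustrative only: they are natural logarithms of a few odds ratios quoted in the Results, not the full set of coefficients behind Table 3, so the resulting integers should not be read as the published scores. The sketch also assumes all coefficients are positive.

```python
import math


def integer_scores(betas):
    """Divide each beta by the smallest beta and round to the nearest integer.

    Assumes every beta is positive (reference categories score 0 and are
    omitted). Python's round() resolves exact .5 ties to the even integer.
    """
    smallest = min(betas.values())
    return {name: round(b / smallest) for name, b in betas.items()}


# Illustrative subset: beta = ln(OR) for odds ratios quoted in the Results.
example_betas = {
    "female": math.log(1.15),
    "birth weight >=3.81 kg": math.log(1.63),
    "rapid weight gain": math.log(4.15),
    "smoking in pregnancy": math.log(1.33),
    "never breastfed": math.log(1.25),
}
scores = integer_scores(example_betas)
```

In this subset, rapid weight gain receives by far the largest integer, which is consistent with its being described as the strongest risk factor.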

Once integer values were assigned to each of the variable categories, a total risk score was calculated for each individual within the derivation cohort. The total risk score was regressed against the overweight outcome by using logistic regression. The β coefficient from this regression analysis was used to derive the predicted risk of overweight by using the following function where *e* is the base of the natural logarithm, β is the regression coefficient, *X* is the total risk score, and *Y* is the regression constant:

$$\text{Predicted risk} = \frac{1}{1 + e^{-(\beta X + Y)}}$$
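For illustration, the standard logistic function with coefficient β, total score *X*, and constant *Y* can be written in a few lines. The article does not report β or *Y* in the text, so the values below are back-calculated from the two endpoints given in the Results (a score of 0 corresponding to 4.1% and a score of 59 to 73.8%). They are an assumption for this sketch, not published estimates.

```python
import math


def predicted_risk(score, beta, constant):
    """Logistic function: 1 / (1 + e^-(beta * score + constant))."""
    return 1.0 / (1.0 + math.exp(-(beta * score + constant)))


def logit(p):
    """Log-odds of a probability p."""
    return math.log(p / (1.0 - p))


# Back-calculated (not reported) parameters from the stated endpoints:
# score 0 -> 4.1% and score 59 -> 73.8%.
constant = logit(0.041)
beta = (logit(0.738) - constant) / 59  # about 0.07 per score point
```

Under these assumed parameters, each additional score point raises the log-odds of overweight by roughly 0.07.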

The risk score algorithm was applied to all individuals within the validation cohort. The predictive capability of the risk score was evaluated by plotting the total risk score against observed and expected risk for both the derivation and validation cohorts. Observed risk was calculated as the true proportion of those considered overweight at follow-up corresponding to each risk score. Model-fit was assessed by *R*^{2} from the regression of observed risks against the predicted risk scores. Discrimination was evaluated by the area under the receiver operating curve (AUROC).^{24} Additional predictive metrics of sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were provided.
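The calibration check described above can be sketched as two steps: compute the observed proportion overweight at each score, then take *R*^{2} from a regression of those proportions on the scores. The sketch assumes a simple linear regression, for which *R*^{2} equals the squared Pearson correlation. The data are invented for illustration.

```python
from collections import defaultdict


def observed_risk_by_score(scores, outcomes):
    """Proportion of children overweight (outcome 1) at each risk score."""
    totals = defaultdict(int)
    cases = defaultdict(int)
    for s, y in zip(scores, outcomes):
        totals[s] += 1
        cases[s] += y
    return {s: cases[s] / totals[s] for s in sorted(totals)}


def r_squared(xs, ys):
    """R^2 of a simple linear regression of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)


# Invented toy data: total risk score and overweight outcome per child.
toy_scores = [10, 10, 20, 20, 30, 30, 30, 30]
toy_outcomes = [0, 0, 0, 1, 1, 1, 1, 0]
observed = observed_risk_by_score(toy_scores, toy_outcomes)
fit = r_squared(list(observed.keys()), list(observed.values()))
```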

## Results

### Overall Study Population

At the 3-year follow-up from birth, 23.4% of all children in the derivation cohort were considered overweight. The mean BMI at follow-up was 16.85 kg/m^{2}. There were approximately equal representations of boys and girls in the entire sample. Deprived groups were overrepresented where 60% of children were from households that earned £20 800 ($33 292 USD) or less. Most children in the sample (83%) were from white ethnic backgrounds. Both the derivation and validation cohorts were similar in characteristics (Supplemental Table 7).

### Predictor Variables

The unadjusted regression analysis identified associations between 16 potential predictors and childhood overweight (Table 1 and Supplemental Table 8). Seven predictors were included in a multivariate model (Table 2). Girls had 15% higher odds of overweight than boys (odds ratio [OR] = 1.15 [95% confidence interval (CI): 1.02–1.29], *P* = .024). Infants in the highest quintile of birth weight (≥3.81 kg) had 63% higher odds of overweight than infants in the lowest quintile (<2.93 kg) (OR = 1.63 [95% CI: 1.33–1.98], *P* < .001). Infants who experienced rapid weight gain during the first year after birth had 4.15 times the odds of overweight of infants who had not (OR = 4.15 [95% CI: 3.64–4.73], *P* < .001). Children of mothers who were overweight (OR = 2.98 [95% CI: 1.60–3.47], *P* < .001) or obese (OR = 2.35 [95% CI: 1.60–3.47], *P* = .001) before pregnancy were more likely to be overweight than children of mothers who were underweight. Children of fathers who were obese (OR = 1.98 [95% CI: 1.00–3.96], *P* = .053) were more likely to be overweight than children of fathers who were underweight. Children of mothers who smoked during pregnancy had 33% higher odds of overweight than children of mothers who had not smoked (OR = 1.33 [95% CI: 1.15–1.55], *P* < .001). Infants who were never breastfed in the first year had 25% higher odds of overweight than children who were breastfed (OR = 1.25 [95% CI: 1.09–1.42], *P* = .001).

### Risk Score Algorithm

The integer values of the risk algorithm are given in Table 3. According to integer values, the strongest risk factors were rapid weight gain, infant birth weight ≥3.81 kg, maternal prepregnancy BMI from 25 kg/m^{2} to 30 kg/m^{2}, maternal prepregnancy BMI ≥30 kg/m^{2}, and paternal BMI ≥30 kg/m^{2}. Other risk factors were assigned relatively smaller integer values. The total risk score ranged from a minimum of 0 to a maximum of 59 (interquartile range: 17–35). The predicted probability of overweight ranged from 4.1% to 73.8% (Table 4). The risk scores were separated into quintiles corresponding to observed frequencies of predicted risks, providing pragmatic risk categories (Table 4).

### Validation

Total risk scores were plotted against observed and predicted risks of overweight for both the derivation (Fig 1) and validation (Fig 2) cohorts. Observed risks trended well with predicted risks in both cohorts. When observed risk was regressed against the risk scores, high *R*^{2} values were seen in both the derivation (*R*^{2} = 0.92) and validation cohorts (*R*^{2} = 0.84), suggesting a good model fit. The AUROC for the derivation and validation cohort was 0.721 and 0.755, respectively (Table 5). This means there was a 72% to 76% probability that the predicted risk score was higher in children diagnosed as overweight than in children who were not overweight.
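The probabilistic reading of the AUROC given above can be made concrete by comparing every overweight child with every non-overweight child and counting how often the former has the higher score. The sketch below counts ties as half, a common convention that is assumed here rather than taken from the article. The scores are invented.

```python
def auroc(scores_cases, scores_noncases):
    """Probability that a random case outscores a random non-case.

    Ties count as half a win (an assumed convention).
    """
    wins = 0.0
    for c in scores_cases:
        for n in scores_noncases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(scores_cases) * len(scores_noncases))


# Invented risk scores for overweight and non-overweight children.
toy_auroc = auroc([30, 25, 40], [10, 25, 20, 35])
```

A value of 0.5 would mean the score ranks children no better than chance, and 1.0 would mean every overweight child outscores every non-overweight child.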

Additional metrics of sensitivity, specificity, PPV, and NPV were provided in Table 5, which evaluated how well the algorithm predicted high risk infants (defined as infants who obtained a risk score ≥25 corresponding to the top 2 quintiles of predicted risks in Table 4). The sensitivity of the risk algorithm for predicting high risk infants was 0.699 for the derivation cohort and 0.769 for the validation cohort, whereas the specificity was 0.676 for the derivation cohort and 0.665 for the validation cohort. Using the study prevalence of overweight, the PPV for overweight at 3 years was 38% for the derivation cohort and 37% for the validation cohort. The NPV was 87% for the derivation cohort and 89% for the validation cohort.
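The four metrics above follow from a 2 × 2 table of high-risk status (score ≥25) against the observed outcome. The sketch below shows the arithmetic on invented data; the function name and the toy scores are assumptions, and only the threshold of 25 comes from the text.

```python
HIGH_RISK_THRESHOLD = 25  # cutoff for "high risk" stated in the text


def screening_metrics(scores, outcomes, threshold=HIGH_RISK_THRESHOLD):
    """Sensitivity, specificity, PPV, and NPV for score >= threshold."""
    tp = fp = fn = tn = 0
    for s, y in zip(scores, outcomes):
        flagged = s >= threshold
        if flagged and y:
            tp += 1  # high risk and overweight
        elif flagged:
            fp += 1  # high risk but not overweight
        elif y:
            fn += 1  # low risk but overweight
        else:
            tn += 1  # low risk and not overweight
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Invented toy data: ten children with a risk score and an outcome.
toy_scores = [30, 26, 25, 40, 10, 20, 24, 5, 35, 12]
toy_outcomes = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
metrics = screening_metrics(toy_scores, toy_outcomes)
```

Unlike sensitivity and specificity, PPV and NPV depend on the prevalence of overweight in the sample, which is why the text notes that the study prevalence was used.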

## Discussion

The growing prevalence of childhood overweight has warranted exploration into risk prediction models to aid prevention strategies. Although a recent risk model^{26} derived risk equations to predict childhood obesity at birth with good statistical validity (AUROC: 0.7–0.85), the risk algorithm described in this study identifies children between 6 and 12 months at risk for overweight. It, therefore, is able to incorporate the effects of rapid weight gain, which is the strongest marker of overweight and obesity in childhood.^{6}^{,}^{7}^{,}^{27} Additionally, it may be more effective and acceptable to communicate overweight risk to parents of infants at 6 and 12 months when the rapid weight gain is manifested. A US study revealed that positive parental changes occurred when a physical marker is visible such as a diagnosis of childhood overweight or perceiving the child’s weight as a health problem.^{28} There are 2 other risk prediction models^{27}^{,}^{29} with moderately good levels of predictability (AUROC: 0.7–0.8), which both included weight gain during the first year to predict childhood overweight or obesity. However, a significant advantage of the current study is that the variables used in the model were based on a comprehensive systematic review^{17} by the research team and were identified as being strongly associated with overweight risk in childhood.

The overweight outcome in this study was defined by IOTF criteria where the prevalence of overweight at 3 years was 23.4%. This is similar to the UK national estimate (>85th percentile^{30} based on UK growth charts) where 22.6%^{1} of children aged 4 to 5 were overweight. Using IOTF criteria results in a higher PPV due to a more stringent definition (≈ > 90th percentile based on UK growth charts). Applying the algorithm in this study on the validation cohort of 1715 children would identify 686 infants who achieved a high risk score and would subsequently be given intervention. Assuming intervention was 100% effective, this would avert 253 cases (PPV = 37%) of childhood overweight while 433 children would be misclassified as high risk. For the 1029 children who did not achieve a high risk score, only 114 children would be misclassified as low risk and become overweight (NPV = 89%). This level of accuracy in the PPV may not be ideal but may nevertheless yield preventive benefits.
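The worked example above can be checked with a few lines of arithmetic. The four input counts are taken from the paragraph; the variable names are illustrative, and the remaining cells of the 2 × 2 table are derived by subtraction.

```python
# Counts stated in the text for the validation cohort.
validation_n = 1715
high_risk = 686          # children with a high risk score
true_positives = 253     # high risk and overweight
low_risk_missed = 114    # low risk but overweight

# Derived cells of the 2 x 2 table.
false_positives = high_risk - true_positives   # 433
low_risk = validation_n - high_risk            # 1029
true_negatives = low_risk - low_risk_missed    # 915

ppv = true_positives / high_risk   # share of high-risk children who become overweight
npv = true_negatives / low_risk    # share of low-risk children who do not
```

Rounded to whole percentages, these ratios reproduce the PPV of 37% and the NPV of 89% quoted in the paragraph.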

First, early identification should serve to enhance the effectiveness of obesity interventions by targeting “at risk” children from a young age.^{31} The American Academy of Pediatrics has suggested that identification and referral for treatment during early childhood yields greater success in treatment.^{31} Observational evidence has shown that a younger age of the child during parental lifestyle interventions is significantly associated with better long-term outcomes.^{32}

Second, both parents and HCPs underestimate obesity risk, indicating that identification may be useful. Studies^{33}^{–}^{36} have consistently revealed that the majority of parents were not aware of their child’s obesity risk; however, this was not usually due to the inability of parents to identify the weight status of their children but rather their perception of what was considered healthy weight. Studies have also revealed that clinicians only diagnose overweight or obesity in 1.1% to 31% of all overweight children, leading to suboptimal levels of advice given and referrals to appropriate interventions.^{37}^{–}^{39} HCPs are wary of approaching clients about overweight and obesity, and they may be reluctant to identify risk because of the impact on the client-professional relationship.^{15} Considering that rates of overweight or obesity diagnoses are suboptimal in current practice, a risk assessment tool even with moderate sensitivity and specificity would identify children who would otherwise have been missed. Because of its high NPV, the proposed model would accurately exclude from targeted intervention a number of children who would not become overweight. This could potentially maximize resource allocation to prioritize infants at the greatest risk. However, the negative consequences of misidentification, such as the adverse effects of potentially stigmatizing parents, need to be evaluated.

Third, the benefits of intervention may outweigh the risks of intervening unnecessarily. Although the risks of intervening unnecessarily need to be considered, they may be overstated. Recent studies^{28}^{,}^{40} have revealed that counseling on weight status at an early age is significantly associated with encouraging positive parental lifestyle change. Additionally, the postnatal interventions that are recommended to reduce obesity risk should have very few deleterious effects as they focus on parental support, nutritional modification, healthy eating, and breastfeeding.^{41}^{–}^{44}

Finally, this study has revealed the importance of the prenatal and preconception environment. High prepregnancy BMI is linked to intrauterine exposures of early overnutrition and programming, which may have a lasting influence by determining body composition.^{45}^{,}^{46} Paradoxically, maternal smoking in pregnancy is associated with in utero growth restriction but also increases the later risk of childhood obesity.^{17}^{,}^{47} It is suggested that infants of mothers who had smoked in pregnancy often exhibit rapid postnatal weight gain.^{48}^{,}^{49} Smoking in pregnancy may also be a proxy for other social and lifestyle characteristics including poor dietary choices and socioeconomic status.^{50} It is important to investigate whether addressing these potentially modifiable risk factors such as maternal smoking in pregnancy and high infant birth weight can reduce the risk of obesity in children and reduce the burden of intervention in the postnatal period.

There were several limitations regarding the study design and sampling in the MCS. The sampling in the MCS overrepresents more deprived communities and ethnic minorities. Nearly 60% of children were from families with incomes of £20 800 ($33 292 USD) or less, and 17% were from ethnic minority families. BMI is known to systematically underestimate or overestimate adiposity in certain ethnic groups because of its association with height.^{51} Although BMI is highly correlated with direct measurements of adiposity, it is also influenced by lean body and bone mass. Another limitation of the study design was that maternal prepregnancy BMI was self-reported and therefore was subject to recall bias. There was no clinical validation of these measures or linkages to previous health records. Finally, because there was only a single anthropometric assessment in the first year, infant growth was calculated by using a single cutoff to define rapid weight gain. Thus, the shape of the growth curves could not be taken into account.

There were also limitations of the analysis. Although the stepwise regression approach can minimize dependence of variables in the model, the indirect effects and path structure of the prediction variables were not determined. Prepregnancy weight is strongly associated with in utero overnutrition and high birth weight. In turn, both high birth weight and maternal prepregnancy BMI were associated with childhood overweight. Although this does not detract from the predictive score, it is important to note that these factors are not independent. A further limitation is the unknown extent to which the model can predict longer-term outcomes, given the study’s relatively short follow-up. Future research could examine the accuracy by using studies with longer follow-up durations.

## Conclusions

A risk algorithm based on several easily observable risk factors in the first year predicted childhood overweight. Using a prediction algorithm to identify at-risk infants could reduce levels of child obesity by enabling health professionals to target prevention more effectively. However, further research needs to evaluate the clinical validity, feasibility, and acceptability of communicating this risk.

## Acknowledgments

We thank Professor Aloysius Niro Siriwardena, Dr Barrie Edmonds, and Fiona Eve for their contributions, and NHS Nottinghamshire County PCT for its support of the work.

## Footnotes

- Accepted May 22, 2013.
- Address correspondence to Cris Glazebrook, PhD, Division of Psychiatry, Institute of Mental Health, University of Nottingham Innovation Park, Nottingham NG7 2TU, United Kingdom. E-mail: Cris.Glazebrook{at}nottingham.ac.uk
- **FINANCIAL DISCLOSURE:** The authors have indicated they have no financial relationships relevant to this article to disclose.
- **FUNDING:** Supported by the National Institute for Health Research Collaboration for Leadership in Applied Health Research and Care – Nottinghamshire, Derbyshire, and Lincolnshire (NIHR CLAHRC-NDL).

## References

- Statistics on Obesity
- Hui LL, Schooling CM, Leung SS, et al
- Stettler N, Zemel BS, Kumanyika S, Stallings VA
- Baird J, Fisher D, Lucas P, Kleijnen J, Roberts H, Law C
- Ong KK, Loos RJF
- Monteiro PO, Victora CG
- Ekelund U, Ong K, Linné Y, et al
- Gardner DS, Hosking J, Metcalf BS, Jeffery AN, Voss LD, Wilkin TJ
- Dixon JB
- National Institute for Health and Clinical Excellence
- National Audit Office. Tackling Child Obesity - First Steps. London, UK: House of Commons; 2006
- Shribman S, Billingham K
- Redsell SA, Swift JA, Nathan D, Siriwardena AN, Atkinson P, Glazebrook C
- Institute of Medicine
- Weng SF, Redsell SA, Swift JA, Yang M, Glazebrook CP
- Millennium Cohort Study First
- Cole TJ, Bellizzi MC, Flegal KM, Dietz WH
- Beyerlein A, Ness AR, Streuling I, Hadders-Algra M, von Kries R
- Cook NR
- Hippisley-Cox J, Coupland C
- Pepe MS, Longton G, Janes H
- Rhee KE, De Lago CW, Arscott-Mills T, Mehta SD, Davis RK
- Levine RS, Dahly DL, Rudolf MC
- Centers for Disease Control and Prevention
- Haemer M, Cluett S, Hassink SG, et al
- Reinehr T, Kleber M, Lass N, Toschke AM
- He M, Evans A
- Etelson D, Brand DA, Patrick PA, Shirali A
- Warschburger P, Kröller K
- Jeffery AN, Voss LD, Metcalf BS, Alba S, Wilkin TJ
- Patel AI, Madsen KA, Maselli JH, Cabana MD, Stafford RS, Hersh AL
- Riley MR, Bass NM, Rosenthal P, Merriman RB
- Hernandez RG, Cheng TL, Serwint JR
- Johnston BD, Huebner CE, Anderson ML, Tyll LT, Thompson RS
- Talvia S, Räsänen L, Lagström H, et al
- Johnson Z, Molloy B, Scallan E, et al
- Oken E, Gillman MW
- Catalano PM, Tyzbir ED, Allen SR, McBean JH, McAuliffe TL
- Jacobson JL, Jacobson SW, Sokol RJ
- Karaolis-Danckert N, Buyken AE, Kulig M, et al
- Johnson RK, Wang MQ, Smith MJ, Connolly G
- Nightingale CM, Rudnicka AR, Owen CG, Cook DG, Whincup PH

- Copyright © 2013 by the American Academy of Pediatrics