The immediate challenge arising from endemically high rates of childhood obesity is to develop, test, and implement effective treatments that reduce its negative consequences on future health and costs. However, there remain no ready “fixes,” as most trials report at most small BMI benefits despite substantial intervention input.
Given the opportunity costs of ineffective interventions, public health providers should rightly query value for money if interventions deliver change that barely exceeds natural history and/or is insufficient for health benefit. However, such judgments require knowledge of the natural history of BMI change in community samples of overweight and obese children participating in research studies. Unfortunately, regression to the mean and biases favoring resolution such as the Hawthorne effect seem no less likely in obesity than with any other chronic disease.
For example, the most recent Cochrane systematic review of childhood obesity treatment [1] included lifestyle trials comparing 2 or more interventions head-to-head, as well as trials comparing interventions with true controls. The former generally concluded that both interventions were equally effective, on the basis of mean reductions occurring within both intervention groups. In contrast, those studies with true controls tended to reveal that the interventions were ineffective, even though average BMI reductions were similar to those observed in the comparative trials. Thus, the meta-analysis of behavioral interventions in trials with true controls revealed only a small favorable between-group effect of −0.06 in BMI z score (95% confidence interval [CI]: −0.12 to −0.01) [1].
To define intervention success, one should also understand what BMI change equates to improved health. Although this question has no definitive answer, and to some extent depends on the baseline value, it is suggested that BMI z score (considered by some a better marker of fat loss than change in BMI, weight, or weight z score) should fall by at least 0.5 to 0.6 to be confident of reductions in fat and cardiovascular risk in obese children [2]. Because early effects (eg, at 6 months) in many trials are no longer evident by 12 months [1], many also consider that BMI improvement should be sustained for at least 12 months from baseline for the intervention to be considered effective.
To illustrate the dangers of interpreting group changes in BMI without a control group, we summarize BMI z score changes in community samples of overweight and/or obese children in 4 Australian studies, comprising Australia’s largest observational children’s cohort study and our own 3 recent randomized trials. Specifically, we report the change in BMI z scores both continuously and categorically (percentage revealing any, ≥0.25, and ≥0.5 reductions in BMI z score) over a 15-month period from baseline, to ensure 12 full months of follow-up. We hypothesized that children in both study types would show reductions in mean BMI z score in keeping with regression to the mean, with larger reductions in trials due to additional biases (eg, Hawthorne effect, selection bias). All were approved by human ethics committees, and parents provided written informed consent.
We drew on person-level data for the overweight/obese participants in waves 1 to 4 of the kindergarten cohort of the nationally representative Longitudinal Study of Australian Children (LSAC) [3], a broadly focused observational study that recruited 4983 4- to 5-year-olds in 2004 and then remeasured their height and weight every 2 years. We also pooled person-level data from our 3 randomized trials (LEAP 1, 2002 [4]; LEAP 2, 2005/2006 [5]; HopSCOTCH, 2009–2011 [6]) of lifestyle interventions nested within BMI surveillance studies in family practice; all had null results, with minimal evidence of differences in average BMI z score reduction between treatment and control groups (Fig 1, top grouping). LEAP 1, LEAP 2, and HopSCOTCH were approved by The Royal Children’s Hospital Human Ethics Committee (21019, 25006, and 280178) and LSAC by the Australian Institute of Family Studies Ethics Committee.
All participants were aged 5 to 10 years, and overweight or obese by International Obesity Task Force cut points at baseline (LSAC, n = 1816; LEAP 1, n = 146; LEAP 2, n = 242; HopSCOTCH, n = 77; retention, 89%–94%). In all studies, children were weighed at baseline and follow-up in light clothing to the nearest 50 g on digital scales and measured to the nearest 0.1 cm with a rigid stadiometer. Follow-up occurred ∼24 and 15 months later in LSAC and the trials, respectively. BMI was age- and gender-standardized by using Centers for Disease Control and Prevention reference values. Change in BMI z score was rescaled to the expected change had each child been followed for 15 months. We deemed z score changes of more than 2 SD implausible, resulting in exclusion of 2 children. For the observational data, change in BMI z score was calculated where any 2 consecutive waves of person-level data were available. Trial data were analyzed as a single pooled data set. Analyses used Stata 12.1 (Stata Corp, College Station, TX).
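The rescaling and categorization steps described above can be sketched as follows. This is a minimal illustration, not the authors' Stata code: it assumes the rescaling was a simple linear proration of the observed change to 15 months, that the exclusion rule means an absolute rescaled change greater than 2 z score units, and that exclusion was applied after rescaling. The names `Measurement`, `rescaled_change`, and `summarize` are hypothetical.

```python
from dataclasses import dataclass

TARGET_MONTHS = 15.0   # common follow-up interval used for comparison
MAX_ABS_CHANGE = 2.0   # assumption: "more than 2 SD" read as > 2 z score units
EPS = 1e-9             # tolerance so exact-threshold values are not lost to rounding


@dataclass
class Measurement:
    baseline_z: float      # BMI z score at baseline
    followup_z: float      # BMI z score at follow-up
    months_elapsed: float  # actual time between the two measurements


def rescaled_change(m: Measurement, target_months: float = TARGET_MONTHS) -> float:
    """Prorate the observed change linearly to the target interval (assumed method)."""
    raw_change = m.followup_z - m.baseline_z
    return raw_change * target_months / m.months_elapsed


def summarize(measurements: list[Measurement]) -> dict[str, float]:
    """Mean change plus the percentage showing any, >=0.25, and >=0.5 reductions."""
    changes = [rescaled_change(m) for m in measurements]
    kept = [c for c in changes if abs(c) <= MAX_ABS_CHANGE]  # drop implausible changes
    n = len(kept)

    def pct_reduced_by(threshold: float) -> float:
        return 100.0 * sum(c <= -threshold + EPS for c in kept) / n

    return {
        "n": n,
        "excluded": len(changes) - n,
        "mean_change": sum(kept) / n,
        "pct_any_reduction": 100.0 * sum(c < 0 for c in kept) / n,
        "pct_reduced_0.25": pct_reduced_by(0.25),
        "pct_reduced_0.5": pct_reduced_by(0.5),
    }
```

Under this assumed proration, a child whose z score fell by 1.0 over 24 months would be counted as a 0.625 reduction over 15 months, so the observational and trial samples are compared over the same interval.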
The overweight and obese children in the observational and trial samples had similar baseline BMI z scores (Table 1). In all 6 continuous analyses (overweight, obese, and overweight/obese groups for the observational and intervention studies separately), mean BMI z score reduced slightly over time. Mean changes were smaller in the observational (−0.05 to −0.10) than the intervention (−0.12 to −0.14) studies, but individuals varied markedly. In the observational study, ∼62% showed any reduction in BMI z score, whereas 16% fell by at least 0.25 and 4% by at least 0.5. The equivalent figures for the intervention trials were 68% (any reduction), 30% (≥0.25), and 11% (≥0.5).
Strengths of our analyses include the large harmonized samples of overweight and obese children. Factors militating against participation bias in LSAC include its breadth and absence of obesity focus. However, we acknowledge the lack of body composition data regarding actual adiposity, and we were not able to consider other important health outcomes such as fitness and self-esteem.
Thus, in both the intervention and observational studies, there was invariably a reduction in mean BMI z score. This would be expected by regression to the mean, given selection on the basis of high BMI, with additional factors such as the Hawthorne effect likely influencing the slightly larger reductions in the trials. These largely artifactual changes in BMI should be expected but appear often to be overlooked.
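To make the regression-to-the-mean argument concrete, the following is a small simulation sketch in which no child receives any treatment and no child's underlying z score changes. Each observed z score is modeled as a stable underlying value plus independent noise (short-term fluctuation and measurement error) at each visit. All parameters here (`reliability`, `cutoff`, sample size, seed) are hypothetical choices for illustration and are not estimates from the 4 studies.

```python
import random
import statistics


def simulate_mean_changes(n=20000, reliability=0.8, cutoff=1.0, seed=0):
    """Return (mean change among children selected for high baseline z,
    mean change among all children), with no true change in anyone."""
    rng = random.Random(seed)
    # With underlying variance 1, reliability = 1 / (1 + noise variance).
    noise_sd = ((1.0 - reliability) / reliability) ** 0.5
    selected_changes = []
    all_changes = []
    for _ in range(n):
        underlying_z = rng.gauss(0.0, 1.0)                  # stable over time
        baseline = underlying_z + rng.gauss(0.0, noise_sd)  # observed at baseline
        followup = underlying_z + rng.gauss(0.0, noise_sd)  # observed at follow-up
        change = followup - baseline
        all_changes.append(change)
        if baseline >= cutoff:  # enrolled only because baseline was high
            selected_changes.append(change)
    return statistics.mean(selected_changes), statistics.mean(all_changes)


selected_mean, overall_mean = simulate_mean_changes()
print(f"selected on high baseline: {selected_mean:+.2f}")
print(f"whole population:          {overall_mean:+.2f}")
```

In this sketch the group selected on a high baseline shows a clearly negative mean change while the whole population shows essentially none, which is the pattern an uncontrolled before-and-after comparison could mistake for a treatment effect.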
These data raise interesting questions about the efficacy reported by current tertiary obesity clinics. For example, a pooled sample from 175 German-speaking tertiary pediatric obesity clinics achieved a mean reduction of −0.15 BMI z score after a median of 1.2 years (interquartile range, 0.9–2.2; see lower part of Fig 1) [7]. The natural tendency for high BMI z scores to fall somewhat could even be heightened in tertiary clinics because one might expect (and our analyses confirm) larger natural improvements in treatment than in observational studies, and in obese than in overweight children.
We conclude that accounting for the natural tendency for BMI in groups selected on the basis of overweight to fall a little could prevent implementation of interventions that could never hope to deflect the course of the obesity epidemic. Although unlikely to be harmful to individuals [4–6], scaling even low-intensity interventions to the population is not only costly but may subsequently impede more effective interventions because of competing workforce demands and/or the challenges of decommissioning services that may erroneously be perceived as beneficial. These are not trivial issues.
Three recommendations arise. Firstly, it seems not only ethical but necessary to adequately control community-based intervention trials. We have not found this to be particularly costly, or to be unpalatable to participants. In the absence of controlled comparisons, we suggest researchers and policy makers should use other means to estimate attributable fractions and actual health gain over and above that expected without intervention. Secondly, for services where traditional controlled trials may be impossible, designs could be used such as multiple baselines (providing a good mechanism for assessing the extent of regression to the mean) and stepped wedge (incorporating controls into service roll-out). If clinical services could be resourced to conduct continuous networked research, then incremental treatment improvements could be measured over long time horizons (an exceptionally successful approach in other conditions such as childhood cancer). Finally, a proportion of obese children in every study naturally achieve substantial improvement that would be expected to confer important health benefit; research could focus on how body set-point change is achieved in this unusual subgroup.
We thank all the parents and children who took part in all 4 studies.
- Accepted October 16, 2014.
- Address correspondence to Melissa Wake, MD, Centre for Community Child Health, Royal Children’s Hospital, Flemington Road, Parkville 3052, Australia.
Dr Wake conceptualized the article, and drafted and revised the manuscript; Dr Clifford helped conceptualize the article, conducted analyses, and drafted, reviewed, and revised the manuscript; Ms Jachno helped conceptualize the article, conducted analyses, and reviewed and revised the manuscript; Ms Lycett and Ms Baldwin helped conceptualize the article, conducted a review of the literature, prepared the data for analyses, and reviewed and revised the manuscript; Dr Sabin provided critical input, and reviewed and revised the manuscript; Dr Carlin helped conceptualize the article, provided critical input, oversaw the analyses, and reviewed and revised the manuscript; and all authors approved the final manuscript as submitted.
FINANCIAL DISCLOSURE: Dr Wake was supported by National Health and Medical Research Council (NHMRC) Population Health Career Development award 546405 and Senior Research Fellowship 1046518; Dr Sabin was supported by NHMRC Postdoctoral Training Fellowship 1012201; the other authors have indicated they have no financial relationships relevant to this article to disclose.
FUNDING: LEAP 1 was funded by a grant from the Australian Health Ministers’ Advisory Council for Priority Driven Research (AHMAC PDR 2991/15), LEAP 2 by NHMRC project grant 334309, and HopSCOTCH by NHMRC project grant 491212. This article uses unit record data from Growing Up in Australia: The Longitudinal Study of Australian Children, which is conducted in partnership between the Department of Social Services, the Australian Institute of Family Studies, and the Australian Bureau of Statistics. Research at the Murdoch Childrens Research Institute is supported by the Victorian Government’s Operational Infrastructure Support Program. The findings and views reported are those of the authors and should not be attributed to the Department of Social Services, the Australian Institute of Family Studies, or the Australian Bureau of Statistics. Funding bodies had no influence on the design and conduct of the study; collection, management, analysis, and interpretation of the data; nor the preparation, review, or approval of the manuscript.
POTENTIAL CONFLICT OF INTEREST: The authors have indicated they have no potential conflicts of interest to disclose.
- Oude Luttikhuis H, Baur L, Jansen H, et al
- Hunt LP, Ford A, Sabin MA, Crowne EC, Shield JP
- Soloff C, Lawrence D, Johnstone R
- Wake M, Baur LA, Gerner B, et al
- Wake M, Lycett K, Clifford SA, et al
- Copyright © 2015 by the American Academy of Pediatrics