Abstract
CONTEXT: Research regarding the protective effects of early physical activity on depression has yielded conflicting results.
OBJECTIVE: Our objective was to synthesize observational studies examining the association of physical activity in childhood and adolescence with depression.
DATA SOURCES: Studies (from 2005 to 2015) were identified by using a comprehensive search strategy.
STUDY SELECTION: The included studies measured physical activity in childhood or adolescence and examined its association with depression.
DATA EXTRACTION: Data were extracted by 2 independent coders. Estimates were examined by using random-effects meta-analysis.
RESULTS: Fifty independent samples (89 894 participants) were included, and the mean effect size was significant (r = –0.14; 95% confidence interval [CI] = –0.19 to –0.10). Moderator analyses revealed stronger effect sizes in studies with cross-sectional versus longitudinal designs (k = 36, r = –0.17; 95% CI = –0.23 to –0.10 vs k = 14, r = –0.07; 95% CI = –0.10 to –0.04); using depression self-report versus interview (k = 46, r = –0.15; 95% CI = –0.20 to –0.10 vs k = 4, r = –0.05; 95% CI = –0.09 to –0.01); using validated versus nonvalidated physical activity measures (k = 29, r = –0.18; 95% CI = –0.26 to –0.09 vs k = 21, r = –0.08; 95% CI = –0.11 to –0.05); and using measures of frequency and intensity of physical activity versus intensity alone (k = 27, r = –0.17; 95% CI = –0.25 to –0.09 vs k = 7, r = –0.05; 95% CI = –0.09 to –0.01).
LIMITATIONS: Limitations included a lack of standardized measures of physical activity; use of self-report of depression in majority of studies; and a small number of longitudinal studies.
CONCLUSIONS: Physical activity is associated with decreased concurrent depressive symptoms; the association with future depressive symptoms is weak.
- CI: confidence interval
- MDD: major depressive disorder
- PA: physical activity
Research interest in the health and psychological benefits of exercise has grown exponentially over recent years. Evidence suggests that physical activity may ameliorate depressive symptoms, supporting the use of exercise as part of a comprehensive treatment plan for major depressive disorder (MDD).1,2 The reverse association has also been demonstrated: decreased physical activity (PA), as well as increased sedentary behaviors, confers vulnerability for developing depressive symptoms.3–5 To date, studies have investigated whether increased PA may also protect individuals against the development of MDD, and findings from observational studies are promising.3,6–8 However, the age range of participants in these studies has been wide, research has been conducted principally in adult populations, and results have been conflicting.9–11 Thus, using the current state of the literature for the purpose of clinical decision-making is challenging. A meta-analysis is warranted to resolve discrepancies in the literature and to examine the suggestion that the largest magnitude of protective effect may be found at younger ages,12 which would in turn provide support for a potential preventative role of physical activity in the development of depression.
Two recent systematic reviews13,14 have reported that increased PA is associated with fewer depressive symptoms. However, only 1 review focused on the child and adolescent age group,13 and neither review conducted a meta-analytic synthesis of the data, which can provide a powerful estimate of the mean effect size across studies. Compared with adult participants, in which the investigation of risk factors is confounded by years of the allostatic load of depression (exposure to depressive symptoms and their associated physiologic strain)15 and comorbid cardiometabolic disease,16 studies of children and adolescents enable the examination of the relationship between PA and depressive symptoms at their most nascent. To our knowledge, this is the first study to conduct a meta-analytic review of the protective effect of PA on depression and, as such, is the first to describe the magnitude of this association. Also, previous systematic reviews have not explored the potential moderating role of sex in the association between PA and MDD, although a stronger effect for females has been suggested in several independent studies.4,17,18 Understanding if the association between PA and MDD is sex-specific is relevant for the elucidation of potential underlying mechanisms of association.
The objective of this meta-analysis was to investigate the potential preventative effect of child and adolescent PA on depression. Several variables have been linked to differences in effect size; thus, we examined whether between-study differences were observed for child age, sex, and social risk.19–21 We also examined whether heterogeneity in effect sizes could be explained by variation in study methodology (eg, methods of assessing physical activity and depression), as well as study quality (eg, longitudinal versus cross-sectional). Clarifying the role of these factors in systematic differences in effect sizes is important for the design and implementation of targeted and effective public health prevention programs.
Methods
Search Strategy
Published studies on PA and depression in children and adolescents were identified by searching Social Sciences Abstracts, International Bibliography of the Social Sciences, Scopus, SportDiscus, CBA Abstracts, Physical Education Index, Sociological Abstracts, and PsycINFO electronic databases for potential articles through October 2015. The search was limited to English language articles published between 2005 and 2015 using the keywords (“child*,” or “teen*,” or “adolesc*,” or “youth*,” or “infant,” or “infancy,” or “baby,” or “babies”) AND (“depress*”), AND (“sedentary behavio*” or “recreation” or “physical activity” or “leisure activity” or “exercise” or “fitness” or “sport*”). This search strategy yielded 3147 nonduplicate articles.
Study Inclusion and Exclusion Criteria
Titles and abstracts of the articles were reviewed to identify studies that met the inclusion criteria. Articles were selected for the current study based on the following criteria. (1) Cross-sectional study of PA and depression collected during childhood or adolescence (<18 years). (2) Longitudinal study of PA collected during childhood or adolescence (<18 years). (3) The constructs measured were PA (eg, energy expenditure) and depressive symptoms. Studies that measured broader, nonspecific constructs of either PA (eg, participation in extracurricular activities) or of depression (eg, psychological distress) were excluded. Because numerous standardized, validated, and accessible measures of depression among youth are widely available, studies that assessed the outcome of depression by using a nonvalidated measure were excluded. Only 1 study22 needed to be excluded because it assessed depression by using a single self-report item with no demonstrated psychometric properties. In contrast to the depression literature, fewer standardized and validated measures exist for assessing physical activity. Thus, no validity criterion was applied to the measure of PA. However, use of a validated versus nonvalidated PA measure was examined as a moderator to determine whether this measurement characteristic explained between-study heterogeneity. (4) The study statistic could be transformed into an effect size (eg, correlations, odds ratios, means/SDs, and/or P values). (5) The full-text article was available and written in English. Studies in which PA was used as an intervention were not included in the current study.
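Criterion (4) depends on converting different reported statistics into a common metric. The article does not state which conversion formulas the software applied, so the following is only a minimal sketch that assumes the commonly used textbook conversions (pooled-SD standardized mean difference, the logit method for odds ratios, and the d-to-r conversion with a group-size correction). The function names are hypothetical.

```python
import math

def d_from_means(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference (Cohen's d) using the pooled SD."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var)

def d_from_odds_ratio(odds_ratio):
    """Logit method (assumed here): d = ln(OR) * sqrt(3) / pi."""
    return math.log(odds_ratio) * math.sqrt(3) / math.pi

def r_from_d(d, n1, n2):
    """Convert d to a correlation; `a` corrects for unequal group sizes."""
    a = (n1 + n2) ** 2 / (n1 * n2)
    return d / math.sqrt(d**2 + a)

# Hypothetical study: odds ratio of 0.75 for depression in active vs
# inactive youth, with 500 participants per group.
r = r_from_d(d_from_odds_ratio(0.75), 500, 500)
print(round(r, 3))
```

An odds ratio below 1 maps to a negative correlation, which matches the direction of the effect sizes reported in this review.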
Multiple results often emerge from a single dataset. If the same participants were used across multiple publications, only 1 study was included in the meta-analysis to ensure independence of effect sizes. A protocol was developed so that each sample of participants was only represented once in the meta-analysis. First, if a single dataset presented both cross-sectional and longitudinal analyses, we selected the study with longitudinal data because this study design was underrepresented in our analyses. Second, if multiple publications emerged from a single cross-sectional dataset, we selected the publication with the largest sample size and most comprehensive data extraction information.
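The selection protocol above can be sketched as a simple rule: within each dataset, prefer a longitudinal publication, and otherwise take the largest sample. This is an illustration only. The record fields and function name are hypothetical, and the "most comprehensive data extraction information" criterion is a judgment call that is not encoded here.

```python
from dataclasses import dataclass

@dataclass
class Publication:
    title: str
    dataset: str   # hypothetical identifier for the underlying cohort
    design: str    # "longitudinal" or "cross-sectional"
    n: int         # sample size

def select_one_per_dataset(publications):
    """Keep one publication per dataset: longitudinal first, then largest n."""
    chosen = {}
    for pub in publications:
        rank = (pub.design == "longitudinal", pub.n)
        best = chosen.get(pub.dataset)
        if best is None or rank > (best.design == "longitudinal", best.n):
            chosen[pub.dataset] = pub
    return list(chosen.values())

# Hypothetical records, not studies from this review.
pubs = [
    Publication("A1", "cohort_a", "cross-sectional", 5000),
    Publication("A2", "cohort_a", "longitudinal", 1200),
    Publication("B1", "cohort_b", "cross-sectional", 300),
    Publication("B2", "cohort_b", "cross-sectional", 900),
]
selected = select_one_per_dataset(pubs)
print([p.title for p in selected])
```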
Multiple samples or groups often exist within a particular study. For example, some studies present results separately for boys and girls within a sample. In such cases, effect sizes for both of these nonoverlapping samples were calculated and entered into the meta-analysis separately.
Data Extraction
All articles that met inclusion criteria were coded by using a standard coding form to collect information on study and sample characteristics. Several moderator variables were collected to explain effect size variability across studies. Moderator variables were divided into categorical moderators (sex, social risk [ie, low income, minority, or involved in child protective services], PA type, PA validated measure, depression measure type, study design, and country) and continuous moderators (age at PA/depression, time between assessments, and publication year). Some studies reported data stratified by level of PA. In such cases, data for the group with the greatest PA were used in the analysis. This was done to remain consistent with our primary objective. Data extraction was performed by 2 independent coders (DK and MC). Discrepancies were resolved through discussion, and consensus scores were entered into the final dataset.
Data Analysis
Effect sizes were calculated and analyzed by using Comprehensive Meta-Analysis version 3.0 software.23 Effect sizes were calculated directly from information provided in each study. When provided, adjusted effect sizes were included. All effect sizes were transformed into correlations for the purpose of reporting mean effect sizes. Pooled effect size estimates were based on a random-effects model. We assessed overall heterogeneity of the mean effect size by using the Q statistic and by calculating the I2 statistic. The Q statistic is a test of the null hypothesis that all studies share a common effect size, and the I2 statistic examines the proportion of the variation across studies that is due to heterogeneity rather than chance, expressed as a percentage. General guidelines for the interpretation of the I2 are as follows: 25%, 50%, and 75% indicate low, moderate, and high heterogeneity, respectively.24 Categorical moderator analyses were conducted by using Q statistics,25,26 whereas the significance of each continuous moderator was assessed by using meta-regressions.27 Finally, we examined publication bias by using funnel plots and Egger’s test.
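To make the quantities named above concrete, the following sketch pools correlations under a random-effects model and reports Q and I2. It assumes the DerSimonian-Laird estimator applied to Fisher z-transformed correlations, which is a common default. The article does not describe the internal computations of the software used, so this is an illustration and not the authors' procedure. The input data are hypothetical.

```python
import math

def random_effects_pool(studies):
    """Pool correlations with a DerSimonian-Laird random-effects model.

    studies: list of (r, n) pairs, one per independent sample.
    Returns the pooled r, its 95% CI, Q, I2 (as a percentage), and tau^2.
    """
    # Fisher z-transform stabilizes the variance: var(z) = 1 / (n - 3).
    z = [math.atanh(r) for r, _ in studies]
    v = [1.0 / (n - 3) for _, n in studies]

    # Fixed-effect weights are needed for Q and for the tau^2 estimate.
    w = [1.0 / vi for vi in v]
    z_fixed = sum(wi * zi for wi, zi in zip(w, z)) / sum(w)
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, z))
    df = len(studies) - 1

    # Between-study variance (method of moments), truncated at zero.
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (vi + tau2) for vi in v]
    z_re = sum(wi * zi for wi, zi in zip(w_re, z)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {
        "r": math.tanh(z_re),
        "ci": (math.tanh(z_re - 1.96 * se), math.tanh(z_re + 1.96 * se)),
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }

# Hypothetical (r, n) pairs, not data from the included studies.
toy = [(-0.25, 400), (-0.05, 1200), (-0.15, 250), (-0.30, 90)]
result = random_effects_pool(toy)
print(result)
```

Because the random-effects weights include the between-study variance, large studies dominate the pooled estimate less than they would under a fixed-effect model.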
Study Quality
To assess the quality of studies, a 7-point quality assessment tool was created based on those implemented in previous meta-analyses of observational studies.28,29 The tool evaluated the articles based on the following 7 criteria: (1) having a defined sample, (2) having a representative sample, (3) rater blinding, (4) report of relevant MDD and PA data, (5) adequate sample size, (6) statistical adjustment for covariates, and (7) a validated PA measure. Articles were given a score of 0 (“No”) or 1 (“Yes”) for each of the abovementioned criteria and summed to give a total score out of 7.
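The scoring rule described above amounts to summing 7 yes/no ratings. A minimal sketch follows. The item keys are hypothetical labels for the 7 criteria, and the example ratings are invented for illustration.

```python
QUALITY_ITEMS = (
    "defined_sample",
    "representative_sample",
    "rater_blinding",
    "relevant_mdd_and_pa_data",
    "adequate_sample_size",
    "covariate_adjustment",
    "validated_pa_measure",
)

def quality_score(ratings):
    """Sum the 0/1 ratings across the 7 criteria (total score out of 7)."""
    missing = [item for item in QUALITY_ITEMS if item not in ratings]
    if missing:
        raise ValueError(f"unrated items: {missing}")
    return sum(1 for item in QUALITY_ITEMS if ratings[item])

# Hypothetical ratings for one article: every criterion met except blinding.
example = {item: True for item in QUALITY_ITEMS}
example["rater_blinding"] = False
print(quality_score(example))
```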
Results
Our electronic search of 7 databases yielded 3147 nonduplicate articles. On review of the titles and abstracts, 87 articles met inclusion criteria and full articles were retrieved. A total of 40 studies with 50 independent samples (89 894 participants) met the inclusion criteria and were included in analyses. Figure 1 presents a flowchart of the review process.
Figure 1. PRISMA flow diagram of the literature search used to identify studies for analysis of physical activity and depression.
Study and Sample Characteristics
Study Characteristics
As detailed in Table 1, 14 studies were longitudinal and 36 studies were cross-sectional. Sample sizes ranged from 55 to 14 594. Child age at the time of the assessment of PA ranged from 8 to 19 years. With respect to PA measures, 15 studies examined the frequency of activity only, 7 studies examined the intensity of the activity, and 27 examined a combination of frequency and intensity. With respect to the assessment of depression, 4 studies measured depressive symptoms by using interview methodology, whereas 46 studies used self-report questionnaires. The overall burden of depressive symptoms in studies that used a depression self-report measure was low (see Table 1). A clinical diagnosis of MDD was reported at follow-up for the 4 longitudinal samples that measured depressive symptoms by using a standardized interview. An MDD diagnosis was made in 5% to 13% of participants across these studies at follow-up.6,30,31 Although several studies specifically noted the absence of antidepressant medication use among participants, the large majority of studies did not include information regarding the use of medications.
Table 1. Independent Samples Included in the Meta-analysis of Physical Activity and Depression
Study Quality
Validated measures of PA were used in 19 out of 36 (53%) cross-sectional studies and in 10 out of 14 (71%) longitudinal studies, as indicated in Table 1. The mean study quality score was 4.9 (SD = 0.9) out of 7. For cross-sectional studies, the mean percentage of participants with complete data was 96.6% (range: 68%–100%). For longitudinal studies, the mean rate of attrition between time points was 13.8% (range: 0.04%–30%). Additional detail regarding individual study- and item-level quality assessment scoring is summarized in Supplemental Table 6.
Overall Measure of Effect Size
A significant mean effect size for the association between PA and depression was found (r = –0.14; 95% confidence interval [CI] = –0.19 to –0.10) (Fig 2), suggesting that children’s PA is negatively associated with depressive symptoms. The funnel plot revealed asymmetry (Fig 3), and Egger’s test suggested that the asymmetry was significant (P < .01). Using the trim and fill analysis, the adjusted pooled effect size estimate was r = –0.06 (95% CI = –0.11 to –0.01). Statistically significant heterogeneity between the studies was found (Q = 1767.95; P < .0001; I2 = 95.23), and potential moderator analyses were explored, including demographic, measurement, and study design factors. The results of all moderator analyses are presented in Tables 2 and 3, and significant moderators are discussed in detail below.
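For readers unfamiliar with Egger’s test, the sketch below shows the usual regression formulation: each study’s standardized effect (effect divided by its SE) is regressed on its precision (1/SE), and an intercept that differs from zero indicates funnel plot asymmetry. This is a generic illustration with hypothetical inputs, not a reproduction of the analysis reported here. A P value would come from a t distribution with k – 2 degrees of freedom, which is omitted to keep the example free of external libraries.

```python
import math

def egger_test(effects, ses):
    """Egger's regression: regress effect/SE on 1/SE and test the intercept.

    Returns (intercept, SE of intercept, t statistic with k - 2 df).
    Assumes the studies do not all share the same SE.
    """
    k = len(effects)
    if k < 3:
        raise ValueError("need at least 3 studies")
    x = [1.0 / s for s in ses]                  # precision
    y = [e / s for e, s in zip(effects, ses)]   # standardized effect
    x_bar, y_bar = sum(x) / k, sum(y) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Residual variance and the standard error of the OLS intercept.
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    resid_var = sse / (k - 2)
    se_int = math.sqrt(resid_var * (1.0 / k + x_bar**2 / sxx))
    t = intercept / se_int if se_int > 0 else float("nan")
    return intercept, se_int, t

# Hypothetical effects (Fisher z) in which less precise studies show
# systematically different effects, which produces a nonzero intercept.
ses = [0.05, 0.10, 0.15, 0.20, 0.25]
noise = [0.001, -0.001, 0.001, -0.001, 0.0]
effects = [-0.1 + 2 * s + e for s, e in zip(ses, noise)]
intercept, se_int, t = egger_test(effects, ses)
print(intercept, se_int, t)
```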
Figure 2. Forest plot of the overall mean effect size, as well as the effect size for each study included in the analysis. Observed effect sizes (r) and 95% CIs are indicated for each study included in the meta-analysis. The black diamond, located at the bottom of the forest plot, indicates the overall mean effect size. Inserting an average effect size across all stratified groups for studies that categorized PA into strata had no effect on the overall mean effect size (r = –0.14; 95% CI = –0.18 to –0.10).
Figure 3. Funnel plot of the meta-analysis of included studies. The y-axis on the funnel plot represents the SE, and the x-axis is the effect size. Observed studies are indicated by open circles. The white diamond represents the observed mean effect size, and the black diamond represents the adjusted mean effect size.
Examination of Potential Effect Modifiers in the Association of Physical Activity and Depression: Categorical Variables
Examination of Potential Effect Modifiers in the Association of Physical Activity and Depression: Continuous Variables
Effect sizes were stronger in samples using cross-sectional designs (k = 36, r = –0.17; 95% CI = –0.23 to –0.10) compared with those using longitudinal designs (k = 14, r = –0.07; 95% CI = –0.10 to –0.04), in which a weak inverse relationship between physical activity and future depressive symptoms was found. Similarly, studies that used interview-based MDD measures demonstrated weaker effect sizes compared with those that used questionnaires (k = 4, r = –0.05; 95% CI = –0.09 to –0.01 vs k = 46, r = –0.15; 95% CI = –0.20 to –0.10). Stronger effect sizes were also observed in samples with no known risks (k = 44; r = –0.15; 95% CI = –0.21 to –0.10) compared with samples with social risk (eg, low income) (k = 6; r = –0.05; 95% CI = –0.09 to –0.01). Effect sizes were stronger in samples examining a combination of PA frequency and intensity (k = 27; r = –0.17; 95% CI = –0.25 to –0.09) compared with intensity alone (k = 7; r = –0.05; 95% CI = –0.09 to –0.01). Finally, stronger effect sizes were found in studies that used validated (k = 29, r = –0.18; 95% CI = –0.26 to –0.09) versus nonvalidated PA measures (k = 21, r = –0.08; 95% CI = –0.11 to –0.05).
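The subgroup contrasts above are tested with a between-groups Q statistic. The sketch below illustrates the idea by using only the summary values printed in the text: each subgroup SE is back-calculated from its 95% CI on the Fisher z scale, and Q-between is the weighted sum of squared deviations of the subgroup means from their pooled mean. Because the inputs are rounded, the result is an approximation for illustration and is not the authors' reported test. The function names are hypothetical.

```python
import math

def z_and_se_from_ci(r, lo, hi):
    """Recover the Fisher z estimate and its SE from r and a 95% CI."""
    z = math.atanh(r)
    se = (math.atanh(hi) - math.atanh(lo)) / (2 * 1.96)
    return z, se

def q_between(groups):
    """Q-between for subgroup means given as (r, ci_low, ci_high) tuples.

    Returns (Q, degrees of freedom).
    """
    zs, ws = [], []
    for r, lo, hi in groups:
        z, se = z_and_se_from_ci(r, lo, hi)
        zs.append(z)
        ws.append(1.0 / se**2)
    z_bar = sum(w * z for w, z in zip(ws, zs)) / sum(ws)
    q = sum(w * (z - z_bar) ** 2 for w, z in zip(ws, zs))
    return q, len(groups) - 1

def p_value_df1(q):
    """Upper-tail chi-square probability, valid for 1 degree of freedom only."""
    return math.erfc(math.sqrt(q / 2.0))

# Cross-sectional vs longitudinal subgroup means as printed in the text.
cross_sectional = (-0.17, -0.23, -0.10)
longitudinal = (-0.07, -0.10, -0.04)
q, df = q_between([cross_sectional, longitudinal])
print(q, df, p_value_df1(q))
```

With only 2 subgroups, Q-between equals the square of the z statistic for the difference between the 2 subgroup means, so the P value can be read from a chi-square distribution with 1 degree of freedom.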
Longitudinal Studies
Because there were significant differences in effect sizes between cross-sectional and longitudinal studies, and because longitudinal associations may provide insight into the directionality of associations, we performed a set of subanalyses with longitudinal studies only to more explicitly examine the magnitude of the association, as well as the between-study variability, for studies assessing a baseline metric of physical activity and its association with later depressive symptoms. There were 14 studies involving 15 926 participants that reported on longitudinal associations between PA and depression. Five studies6,8,30,64,66 reported on depression-related covariates, including baseline depressive symptoms, number of weeks depressed during the preceding year, body dissatisfaction, social support, self-efficacy, history of childhood trauma or stressful life events, and medication status (Table 1).
The mean effect size for the longitudinal association between PA and depression was r = –0.07 (95% CI = –0.10 to –0.04). Statistically significant heterogeneity between studies was found (Q = 59.25; P < .0001; I2 = 77.52) and potential moderator analyses were explored (Tables 4 and 5). However, because the number of studies for several subgroups was small (eg, there were only 2 studies with social risk), the results of these moderator analyses should be interpreted with caution (Table 5).
Examination of Potential Categorical Effect Modifiers in Studies With Longitudinal Associations Between Physical Activity and Depression
Examination of Continuous Moderators in Studies With Longitudinal Associations Between Physical Activity and Depression
Discussion
This systematic review and meta-analysis of 50 samples involving 89 894 participants found that a greater PA level was associated with fewer depressive symptoms, although not with decreased diagnoses of MDD. This association was stronger for cross-sectional studies than for longitudinal studies, in which the mean effect size was significant, but weak. The nature of the PA was also associated with the presence of depressive symptoms, in that PA of increased frequency and intensity was more strongly associated with decreased depressive symptoms compared with PA that was defined by intensity of activity alone.
Significant effect sizes were observed for studies that examined depressive symptomatology by using questionnaire measures and were considerably stronger than those of studies assessing MDD by using interview measures. Indeed, the majority of studies in this meta-analysis employed self-report inventories to assess depressive symptoms (n = 46) rather than diagnostic interviews (n = 4), which are considered to be the gold-standard measure for MDD. Self-report measures are frequently used in research studies due to their ease of administration, low cost, minimal time requirement, and low patient response burden. These measures are useful screening tools; however, self-report instruments are limited by their inability to confirm the presence or absence of an MDD diagnosis. That increased PA was more highly associated with decreased depressive symptoms in this meta-analysis, as compared with an MDD diagnosis, is a critical finding. This finding suggests that individuals who are at risk for more severe, syndromal-level symptom burden, impairment, and associated poor health outcomes may not respond to the potential preventative effects of PA. Although it is possible that these results may also reflect the relative methodological limitations associated with the examination of a dichotomous versus a continuous variable, our findings are consistent with previous data reporting that MDD severity is distinguished from subsyndromal depressive symptoms by its decreased sensitivity to prevention strategies, greater association with cardiovascular risk factors and health outcomes, and greater treatment resistance.67–70
Increased PA was more strongly associated with decreased depressive symptoms in cross-sectional studies compared with longitudinal studies, where the effect size was small. Cross-sectional studies are limited in their ability to probe causality, because the temporal relationship between variables cannot be determined. Thus, it is possible that the cross-sectional studies included in this meta-analysis are actually indicative of the reverse association of PA and depression: that children and adolescents with increased depressive symptoms are less likely to participate in PA. Indeed, amotivation, pessimism, and anhedonia associated with the depressed state have been reported to lead to decreased PA among adult populations.71 In contrast, longitudinal studies provide insight into the direction of the association and, in the present meta-analysis, demonstrated a weak inverse relationship between PA and future depressive symptoms measured 2 to 17 years later, suggesting that PA has a weak but positive association with future mood.
Studies that included a measure of both increased PA frequency and intensity demonstrated stronger associations with depressive symptoms than those that used measures of intensity alone. This finding is consistent with other systematic reviews examining the role of PA as an intervention for depressed adults.1 Currently, some clinical guidelines recommend the inclusion of 45 minutes of moderately intense exercise at least 3 days per week in the treatment of MDD among adults.72 In contrast, guidelines for general health promotion by the Canadian Pediatric Society73 and American Academy of Pediatrics74 recommend that children and adolescents get at least 60 minutes of moderate to vigorous PA daily to maintain general health. As such, the findings from the current study support the inclusion of both the PA frequency and intensity components in the Canadian Pediatric Society and American Academy of Pediatrics recommendations with respect to the benefit to depressed mood. Many hypotheses regarding the mechanism by which PA may lead to improved mood have been theorized, including via antiinflammatory effects, increased growth factors leading to neural plasticity, neuroendocrine effects on the hypothalamic-pituitary-adrenal axis and insulin sensitivity, and improvements in self-efficacy.75–77 However, neither the pathophysiological pathways themselves nor whether they are specific to mood state are known. These factors are important for determining rational prevention versus treatment strategies, gaining insight into the etiology of depression, and for research into novel treatments for depression for medically ill populations and those unable to participate in PA.
Studies that examined the association of PA with depression in samples of higher social risk (eg, low income, minority, or involved in child protective services) reported weaker effect sizes than those of lower-risk groups. Socioeconomic status and its associated risk factors (eg, disadvantaged neighborhoods) explain a significant proportion of the variance in childhood psychopathology, including depression.78 Because children in high–social risk environments may be exposed to many more risk factors for depression, including lower socioeconomic status,79 increased PA may have relatively less influence with respect to the proportion of the variance in depression it explains when compared with children of lower social risk.80 Also, because measures of depression and PA have traditionally been developed in samples of low social risk, they may be less well calibrated to capture the variation in depression or PA seen in high–social risk children.81,82 These results should be interpreted with caution, however, because few studies have examined the association of PA with depression in high–social risk samples. Given the increased prevalence of both depression and obesity in populations of high social risk, however, additional research examining potential targets for prevention among this vulnerable group of children is needed.
As the first study to conduct a meta-analytic review of the potential protective association of childhood PA with depression, this study has many strengths, including the analysis of a large number of studies to increase the precision of effect size estimates, subanalysis of cross-sectional versus longitudinal associations, and examination of PA frequency and intensity as potentially contributing effect modifiers. However, our findings must be interpreted within the context of the limitations of this study. The measurement of PA in the majority of studies relied on self-report measures of frequency, intensity, and type of activity, which were not correlated with objective measures of activity (eg, accelerometry). This also reflects a limitation of the PA literature more broadly, in that the use of standardized instruments that have demonstrated reliability and validity was not consistent across studies. The current meta-analysis demonstrated that studies with validated measures of PA had stronger effect sizes than those that used nonvalidated measures. Thus, future PA research should focus on the methodology for PA measurement in children and adolescents to increase confidence in the study results. In addition, the majority of the literature relies on the self-report of depressive symptoms, with few studies able to confirm a diagnosis of depression, leading to imprecise estimates of the magnitude of the association of PA with clinical depression. Finally, we included only studies that were published in English, and this inclusion criterion may limit the generalizability of our findings to predominantly English-speaking countries.
Conclusions
This systematic review and meta-analysis finds that increased PA in childhood and adolescence is associated with decreased depressive symptoms. Substantive moderators of this association include (1) study design, with the strongest association found in cross-sectional studies; (2) type of PA, with a combination of PA frequency and intensity resulting in the greatest effect on depressive symptoms; and (3) depression measure, with a stronger protective effect of increased PA for depressive symptoms than for a clinical diagnosis of MDD. Taken together, this study suggests that PA in childhood and adolescence is associated with improved concurrent symptoms of depression, particularly when undertaken regularly and with vigor, and has weak but significant effects on future depressive symptoms. Future research is needed to advance the knowledge of PA measurement, elucidate the mechanism of association between PA and depression, and examine the longitudinal relationships between PA, depression, and health outcomes to determine the critical periods in which preventative efforts may be most effective.
Acknowledgment
We thank Ms Qi Fang (University of Toronto) for assistance in the literature search.
Footnotes
- Accepted January 6, 2017.
- Address correspondence to Daphne J. Korczak, MD, MSc, Department of Psychiatry, The Hospital for Sick Children, 555 University Ave, Toronto, ON M5G1X8, Canada. E-mail: daphne.korczak@sickkids.ca
FINANCIAL DISCLOSURE: The authors have indicated they have no financial relationships relevant to this article to disclose.
FUNDING: Research support was provided to Dr Madigan by the Alberta Children’s Hospital Foundation and the Canada Research Chairs program.
POTENTIAL CONFLICT OF INTEREST: The authors have indicated they have no potential conflicts of interest to disclose.
- Copyright © 2017 by the American Academy of Pediatrics