OBJECTIVE. This article responds to evidence gaps regarding language impairment identified by the US Preventive Services Task Force in 2006. We examine the contributions of putative child, family, and environmental risk factors to language outcomes at 24 months of age.
METHODS. A community-ascertained sample of 1720 infants who were recruited at 8 months of age were followed at ages 12 and 24 months in a prospective, longitudinal study in metropolitan Melbourne, Australia. Outcomes at 24 months were parent-reported infant communication (Communication and Symbolic Behavior Scales and MacArthur-Bates Communicative Development Inventories vocabulary production score). Putative risk factors were gender, preterm birth, birth weight, multiple birth, birth order, socioeconomic status, maternal mental health, maternal vocabulary and education, maternal age at birth of child, non–English-speaking background, and family history of speech-language difficulties. Linear regression models were fitted to total standardized Communication and Symbolic Behavior Scales and Communicative Development Inventories vocabulary production scores; a logistic regression model was fitted to late-talking status at 24 months.
RESULTS. The regression models accounted for 4.3% and 7.0% of the variation in the 24-month Communication and Symbolic Behavior Scales and Communicative Development Inventories scores, respectively. Male gender and family history were strongly associated with poorer outcomes on both instruments. Lower Communication and Symbolic Behavior Scales scores were also associated with lower maternal vocabulary and older maternal age. Lower vocabulary production scores were associated with birth order and non–English-speaking background. When the 12-month Communication and Symbolic Behavior Scales Total score was added as a covariate in the linear regression of 24-month Communication and Symbolic Behavior Scales Total score, it was by far the strongest predictor.
CONCLUSIONS. These early risk factors explained no more than 7% of the variation in language at 24 months. They seem unlikely to be helpful in screening for early language delay.
Preschool children with expressive and/or receptive language impairment are at high risk for subsequent difficulties with language and language-related tasks in their later school careers that, for many, persist into adulthood.1,2 Much earlier identification would be ideal to increase the likelihood of altering outcomes by the preschool years.
Unfortunately, the very limited understanding of the natural history of language delay from infancy makes early identification difficult. Wide ranges in the prevalence of language delay are reported, with high rates of resolution during the early years. At 8 to 10 months of age, ∼30% of infants have been reported to have early delay in communication skills, which persists in 50% to 80% of these children at 2 years of age.3,4 Expressive language delay, or “late talking,” reported to affect ∼15% of 2-year-olds,5 also resolves in 40% to 60% of children by 3 years and 70% by 4 years of age.6 It is not possible to identify reliably which trajectory (recovery or persistence) individual or groups of infants and toddlers who are at risk for early delay might follow.
The design of effective preventive or treatment programs that target the right children (ie, those who will go on to have lasting language impairment) must be based on an understanding of the natural history of language delay and of the early features that most accurately identify these children. Although the early markers of possible later language impairment seem to have reasonable sensitivity, their specificity is uniformly low.4
In 2006, Nelson et al7 published a systematic evidence review for the US Preventive Services Task Force (USPSTF) on screening for speech and language delay in preschool children. The review sought to appraise the strengths and weaknesses of current evidence regarding the effectiveness of screening and interventions for speech and language delay. Four of the 8 key questions addressed in the review concerned screening for early speech and language delay.7
No studies that directly addressed whether screening for speech and language delay results in improved speech, language, or other outcomes were identified; however, 2 sets of risk factors that might improve the accuracy of screening were identified. The first set (factors consistently reported in the literature to identify children at risk) included family history of speech and language delay, male gender, parent educational levels, and perinatal factors. The second set (less consistently reported) included childhood illnesses, later birth order, family size, older parents or younger mother at birth, lower socioeconomic status (SES), and minority race. The task force concluded that the role of these risk factors in screening was unclear and that a list of risk factors had not been developed or tested for selective screening for speech and language delay.7
Only 4 studies8–11 of the 16 reviewed by the USPSTF7 considered risk factors in children who were ≤24 months of age. Of these 4, 1 focused solely on stuttering,9 and the remaining 3 differed markedly in the speech and language domains studied and in the derivation and composition of the samples. Surprisingly, none investigated the contribution of gender, SES, birth order, perinatal factors, or parental education. Family history was explored in 2 studies, with an association with language delay found in 1 study11 but not the other.8 There are inherent problems in interpreting published data on risk factors.4,7 These include the variety of study designs, the heterogeneity of the populations studied, variable inclusion and exclusion criteria, and noncomparable outcomes, which makes interpretation and comparison extremely difficult. For example, studies vary in whether they examine the risk for delay for vocabulary,12 speech,13 or language14–16 or are still broader and include stuttering9 or delays in learning.12,14–16 Not surprisingly, 1 of the main recommendations from the task force review was the need for prospective research to identify and quantify the predictive strength of risk factors in screening for speech and language delay. The study reported here addresses this recommendation.
This article focuses on quantifying the contributions made to language outcomes at 24 months of age by the early risk factors identified by the task force as likely to influence language development. It builds on a previous article17 from the same longitudinal study that found that although a range of child, family, and environmental factors explained a small amount of variation (<6%) in communication skills at 12 months, the strongest predictor (accounting for 37% of the variation) was communication development at 8 months of age. The authors discussed 2 possible explanations for the findings. First, early communication development may have a substantial biological component, given that so little of the variability was explained by the combination of factors explored.17 In support of this, recent neuroimaging studies18 have found decreased white matter volumes in the motor and language areas of children with developmental language disorder compared with typically developing control subjects. Others have hypothesized that a common infrastructure that is available to children equips them to acquire language during early childhood.19 Second, they speculated that as language acquisition stabilized across the first few years of life, it might become possible to elucidate a combination of early communication skills and biological and environmental factors that more reliably predict later language difficulties.17
Sampling and Participants
The longitudinal Early Language in Victoria Study (ELVS) was established in 2002. Sampling methods have been reported elsewhere.17 Briefly, a community sample of infants who were aged 7.5 to 10.0 months were recruited from 6 of 31 local government areas (LGAs) in metropolitan Melbourne (population 3.8 million) in the state of Victoria, Australia. These LGAs were spread geographically across the spectrum of disadvantage–advantage, having been sampled by stratifying the 31 LGAs into 3 tiers according to the Australian census-based Socio-Economic Indexes for Areas (SEIFA) Index for Relative Socio-Economic Disadvantage (representing attributes such as low income, low educational attainment, and high unemployment),20 and then selecting 2 noncontiguous LGAs from each tier.
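The sampling frame described above can be sketched in a few lines of Python. This is an illustration only, not the procedure the ELVS team ran: the LGA names, SEIFA scores, and adjacency map are invented, the tiers are formed by simple rank order, and "noncontiguous" is assumed to mean that the 2 LGAs chosen within a tier do not share a border.

```python
import random


def select_lgas(seifa_by_lga, adjacent, n_tiers=3, per_tier=2, seed=0):
    """Rank LGAs by SEIFA score, cut them into tiers, and draw
    noncontiguous LGAs from each tier (hypothetical helper)."""
    rng = random.Random(seed)
    # Lower SEIFA score = greater disadvantage, so tier 0 is most disadvantaged.
    ranked = sorted(seifa_by_lga, key=seifa_by_lga.get)
    size, extra = divmod(len(ranked), n_tiers)
    tiers, start = [], 0
    for t in range(n_tiers):
        end = start + size + (1 if t < extra else 0)
        tiers.append(ranked[start:end])
        start = end

    chosen = []
    for tier in tiers:
        candidates = tier[:]
        rng.shuffle(candidates)
        picked = []
        for lga in candidates:
            # Keep an LGA only if it borders none already picked in this tier.
            borders_picked = any(
                lga in adjacent.get(p, set()) or p in adjacent.get(lga, set())
                for p in picked
            )
            if not borders_picked:
                picked.append(lga)
            if len(picked) == per_tier:
                break
        chosen.append(picked)
    return tiers, chosen


# Invented data: 31 LGAs with distinct scores; each LGA borders the next one.
scores = {f"LGA{i:02d}": 900 + 7 * ((i * 11) % 31) for i in range(1, 32)}
adjacent = {f"LGA{i:02d}": {f"LGA{i + 1:02d}"} for i in range(1, 31)}

tiers, chosen = select_lgas(scores, adjacent)
```

With 31 LGAs and 3 tiers, the tiers cannot be of equal size; this sketch gives the extra LGA to the most disadvantaged tier, which is an arbitrary choice.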
Between September 2003 and April 2004, potential study participants were recruited in 1 of 3 ways: via the Maternal and Child Health nurses; via universally available hearing screening sessions, offered at ages 7 to 9 months; or as a result of publicity. Infants were excluded when they had developmental delay (eg, Down syndrome), cerebral palsy, or other serious intellectual or physical disability or when their parents did not speak and understand English. Participation was maximized by ensuring that questionnaires were written at no more than a year 6 reading level. Data were collected on a broad range of child, family, and environmental factors at 8, 12, and 24 months of age (data collection from 3 through 7 years is ongoing from 2005 to 2010). This article draws on communication and language data collected at ages 12 and 24 months and demographic, family, and environmental information collected at ages 8 and 12 months. Figure 1 describes the flow of children from recruitment through 24 months of age, with those still participating at 24 months composing the sample on which these analyses were based.
The ELVS was approved by the ethics committees of the Royal Children's Hospital (Melbourne) and La Trobe University, and all parents provided written, informed consent.
Parents completed the Communication and Symbolic Behavior Scales (CSBS) Infant-Toddler Checklist21 at both 12 and 24 months. Standardized total scores (normative mean: 100; SD: 15) and 3 composite scores (normative means: 10; SD: 3) for the domains of social, speech, and symbolic skills were calculated according to the manual. The composite domains broadly relate to infants' prelinguistic, linguistic, and cognitive abilities, respectively, each of which has been demonstrated to relate to later expressive language development.21 Parents also completed the Words and Sentences version of the MacArthur-Bates Communicative Development Inventory (CDI) for infants at 24 months.22 To accommodate differences between American and Australian usage, we received permission (from the authors) to substitute 24 vocabulary items on the Words and Sentences inventory (eg, “footpath” for “sidewalk”). Only the expressive vocabulary production scores were used in our analyses. Raw (quantitative) scores were calculated for the CDI. As is usual practice with the CDI, children who were at <10th centile for vocabulary production were identified as late talkers.22
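To make the two score scales and the late-talker rule concrete, the following Python sketch expresses a standardized score as a distance from the normative mean in SD units and applies the <10th centile cutoff. The function names are hypothetical, and the sketch assumes that the child's centile has already been looked up in the CDI norms; it does not reproduce the scoring tables in either manual.

```python
def sd_units(score, normative_mean, normative_sd):
    """Distance of a standardized score from the normative mean, in SD units."""
    return (score - normative_mean) / normative_sd


def is_late_talker(centile):
    """True when the CDI vocabulary production centile is below the 10th."""
    return centile < 10


# A CSBS total score of 85 and a composite score of 7 are both 1 SD below
# their normative means (100 and 10, respectively).
total_z = sd_units(85, normative_mean=100, normative_sd=15)
composite_z = sd_units(7, normative_mean=10, normative_sd=3)
```

A child exactly at the 10th centile is not flagged by this rule, because the criterion in the text is strictly below the 10th centile.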
Putative Risk Factors
All putative risk factors identified by the USPSTF were considered, with the exception of child illnesses (not collected by the ELVS) and family size (given its high correlation with birth order during infancy), as shown in Table 1. Data for the 12 putative risk factors in the ELVS were drawn from both the 8- and 12-month questionnaires. Our indicator of minority status was whether the main language spoken in the home to the child was English or not English, and the indicator of SES was the continuous SEIFA Index of Relative Disadvantage20 categorized for analysis using quintiles based on SEIFA values for the Victorian population in 1996 with lower scores representing greater disadvantage. In addition to those factors considered by the task force, we studied the contributions of maternal mental health23 (because maternal depression influences mother–child interaction) and maternal vocabulary (as a proxy for maternal cognition, given the strong heritability of intelligence). Maternal mental health was measured with the Nonspecific Psychological Distress Scale.24 Scores were divided into likely mental health problem (a score of ≥4 of a possible 24) versus no mental health problem (≤3). Maternal vocabulary was measured with the written 44-item multiple-choice modified version of the Mill Hill Vocabulary Scale25 with each correct answer tallied to provide a raw quantitative score out of the possible 44.
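The two categorizations described above can be sketched as follows. The distress threshold (≥4 of a possible 24) is taken from the text; the quintile boundaries are invented placeholders, because the 1996 Victorian SEIFA cut points are not given in this article, and assigning a score that equals a boundary to the higher quintile is likewise an assumption.

```python
from bisect import bisect_right


def distress_category(score):
    """Dichotomize a Nonspecific Psychological Distress Scale score (0 to 24)."""
    if not 0 <= score <= 24:
        raise ValueError("score must be between 0 and 24")
    if score >= 4:
        return "likely mental health problem"
    return "no mental health problem"


def seifa_quintile(score, cut_points):
    """Return 1 (most disadvantaged) to 5 (least disadvantaged).

    cut_points holds the 4 ascending boundaries between quintiles.
    """
    return bisect_right(cut_points, score) + 1


# Placeholder boundaries, NOT the real 1996 Victorian quintile cut points.
HYPOTHETICAL_CUT_POINTS = [950, 1000, 1040, 1080]

quintile_of_sample_mean = seifa_quintile(1037.6, HYPOTHETICAL_CUT_POINTS)
```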
Scores on the outcome variables of interest for children who were born preterm (defined as <36 weeks' gestation) were age-corrected before analysis. Random-effects linear regression models26 were fitted to the total CSBS score, each of the 3 composite CSBS scales, and the CDI vocabulary score at 24 months, using the 12 putative risk factors simultaneously as covariates. The method allows for potential correlation between the responses of twins. Additional models were then fitted for each of the CSBS outcomes, identical to the initial models except that the corresponding 12-month CSBS score was included as an additional covariate to quantify the extent to which its predictive strength is dominant over the risk factors. An extended model that included the 12-month CSBS as a predictor was also fitted to the CDI vocabulary score. A logistic regression model was fitted to identify which of the risk factors was associated with late-talking status at 24 months. Information sandwich estimates of SE27 were calculated for this analysis to allow for correlation between twins. Again, an additional logistic regression was then fitted with the 12-month CSBS total score additionally included as a covariate. Analyses of the 24-month CSBS outcomes were restricted to cohort members who had their 24-month assessment between the ages of 23.5 and 25 months, and those of CDI vocabulary production and late-talking status were restricted to those who completed 24-month assessments between 23.5 and 25.5 months. Analyses in which the 12-month CSBS scores were also included as predictors were further restricted to those who had their 12-month assessment between the ages of 11.5 and 13.5 months. Analyses were implemented by using Stata 9.2.28 R2 values and residual checks for quantitative outcomes were based on ordinary linear regression because the coefficients were essentially the same as in the random-effects regression models that allowed for correlation between responses from twins.
The squared Pearson correlation measure of R2 was calculated for the logistic regression analysis.29 Partial R2 values for individual risk factors are not shown, but their relative predictive strength may be assessed by ranking the P values in order of size. Unstandardized coefficients are reported for the linear regression analyses.
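One common reading of the squared Pearson correlation measure is the squared correlation between the observed binary outcome and the fitted probability from the logistic model. The Python sketch below assumes that reading; the outcomes and fitted probabilities are invented, and the function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def logistic_r_squared(outcomes, fitted_probs):
    """Squared correlation between 0/1 outcomes and fitted probabilities."""
    return pearson_r(outcomes, fitted_probs) ** 2


# Invented example: 1 = late talker, 0 = not a late talker.
outcomes = [1, 0, 0, 1, 0, 0, 0, 1]
fitted_probs = [0.35, 0.15, 0.20, 0.30, 0.10, 0.25, 0.15, 0.20]

r_squared = logistic_r_squared(outcomes, fitted_probs)
```

The measure equals 1 only when the fitted probabilities separate the two groups perfectly, and it is undefined when every fitted probability is identical.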
Participant characteristics are shown in Table 2. There were 21 twin pairs in the study and 1 member of another twin set, making a total of 43 nonsingletons in the study. The mean (SD) Index for Disadvantage score was 1037.6 (59.7), slightly higher than that for all metropolitan Melbourne (1020.6 [66.4]); although the spread of values was similar, >80% were in the 3rd, 4th, and 5th (ie, less disadvantaged) quintiles.
Table 3 summarizes the CSBS standardized scores and CDI raw vocabulary scores at 24 months. Table 4 shows the results from the regression analyses for the 24-month CSBS total and CDI vocabulary production outcomes. Female gender and higher maternal vocabulary were associated with higher CSBS scores at 24 months of age, whereas family history of speech and language difficulties and older maternal age were associated with lower CSBS scores. Graphic investigation using locally weighted scatterplots30 suggested that the negative linear association between maternal age and the CSBS was mainly for mothers who were ≥30 years of age. For younger mothers, there was no marked relationship. The model fitted to the CSBS total score accounted for just 4.3% of the variation. Factors that were associated with higher CDI vocabulary production scores at 24 months included female gender, birth order (being fifth born), and English-speaking background, whereas family history of speech and language difficulties predicted lower CDI scores. The model explained 7.0% of the variation in CDI vocabulary production at 24 months.
A total of 19.7% (333 of 1691) of children were classified as late talkers. Risk factors associated with late-talking status in the logistic regression analysis (Table 5) included non–English-speaking background, family history of speech and language difficulties, and low maternal education (≤12 years). The variation explained by the model was 4%.
Table 6 displays the linear regression models for each of the 3 CSBS composites (social, speech, and symbolic). The 12 putative risk factors accounted for 2.5% of the variation in the social composite score, 6.4% on the speech composite score, and 7.0% on the symbolic composite score. Higher maternal vocabulary score was the only factor associated with higher outcome scores on all 3 composites (see Table 6). Two factors, being female and mother's education level (≥13 years), were strongly associated with higher scores on the speech and symbolic composites, whereas having a family history of speech and language difficulties was associated with lower scores on both composites.
When the 12-month CSBS total score was included in the regression model of the 24-month CSBS total score, it was the strongest predictor (regression coefficient = 0.52; 95% confidence interval [CI]: 0.47 to 0.58; partial R2 = 19.6%). Similarly, when the models for each of the 24-month CSBS composite scores were adjusted for the relevant 12-month composite score, the amounts of variation explained increased (social score: regression coefficient = 0.60, 95% CI: 0.53 to 0.67, partial R2 = 15.3%; speech score: regression coefficient = 0.49, 95% CI: 0.42 to 0.56, partial R2 = 9.5%; symbolic score: regression coefficient = 0.48, 95% CI: 0.40 to 0.55, partial R2 = 9.1%). When the 12-month CSBS total score was included as a predictor in the linear regression of the 24-month CDI score, the corresponding regression coefficient was 4.9 (95% CI: 4.3 to 5.4) and the partial R2 was 14.2%. Finally, when the 12-month CSBS total score was included in the logistic regression model of late-talking status, it was the strongest predictor (odds ratio: 0.95; 95% CI: 0.94 to 0.96; partial R2 = 5.3%).
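Because these coefficients are small in absolute terms, it can help to rescale them to a 15-point difference in the 12-month CSBS total score (1 normative SD). The Python sketch below does this under the assumption that the reported coefficients and odds ratio are expressed per 1-point increase in the 12-month score; the inputs are the rounded values reported above, so the results are approximate.

```python
def odds_ratio_for_difference(or_per_point, points=15):
    """Odds ratio implied by a difference of `points` in the predictor."""
    return or_per_point ** points


def outcome_difference(coef_per_point, points=15):
    """Difference in the outcome implied by a difference of `points`."""
    return coef_per_point * points


# Late-talking status: odds ratio 0.95 (95% CI: 0.94 to 0.96) per point.
or_15 = odds_ratio_for_difference(0.95)          # about 0.46
or_15_low = odds_ratio_for_difference(0.94)
or_15_high = odds_ratio_for_difference(0.96)

# 24-month CSBS total score: coefficient 0.52 per point.
csbs_15 = outcome_difference(0.52)               # about 7.8 points

# 24-month CDI vocabulary production: coefficient 4.9 per point.
cdi_15 = outcome_difference(4.9)                 # about 73.5 words
```

On these figures, a 12-month score that is 15 points higher corresponds to odds of late-talking status that are roughly halved. Because the odds ratio is raised to the 15th power, rounding in the reported value of 0.95 has a noticeable effect on this estimate.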
When the 12 early risk factors that are widely postulated to predict language outcomes in preschool children were studied concurrently, none was a strong predictor of communication and vocabulary skills in 24-month-old children. The variation explained (4.3% and 7.0% for the CSBS and CDI scores, respectively) in the linear regression models was shared by the 12 putative risk factors, and the variation explained by any 1 risk factor was small. In contrast, communication skills that already were achieved at 12 months of age explained one fifth of the variation in 24-month outcomes. Thus, communication score at 12 months was a much better predictor of outcome at 24 months than the 12 putative risk factors collectively. Little variation was explained by the risk factors collectively when considering the outcome of late-talking status at 24 months.
These findings are in accordance with our previous report17 in 12-month-old infants. Here we present additional evidence that in a community sample followed through the first 2 years of life, there seems to be a strong biological trajectory for communication skill development and vocabulary production that seems relatively unaffected by a range of child, family, and environmental variables.
Our findings are in contrast to much of the published literature and accepted views on the subject. It seems that assumptions about risk factors for early language delay may have been based largely on the studies that involved older children. In the USPSTF article,7,31 few studies reviewed considered risk factors in younger children (eg, <24 months of age). We suggest that at these older ages, the evidence remains inconsistent, even for the 2 most-studied potential risk factors. Of the 9 studies that investigated family history of speech and language difficulties, a significant association was reported in 7 (although not all addressed the same speech and language domains); of the 7 that investigated parental education, 5 found an association. It is difficult to interpret the data on parental education because some studies measured only maternal education, whereas others considered that of both parents. In 1 study,32 a significant association was found for maternal but not paternal education.
Strengths of this study include its prospective, longitudinal design; its community-ascertained sample; and the young age (8 months) at which participants were recruited. The findings are likely to be generalizable, given the community nature of the sample and that the prevalence of late talkers (19.7%) and spread of CSBS scores at 24 months of age were broadly similar to those in other studies.33 Furthermore, this is the first study to measure concurrently the broad range of variables that were recently identified as requiring additional study by a major systematic review.
Relative weaknesses include that although we used validated measures that are widely considered reliable, the language outcomes of interest relied solely on parent report. Face-to-face assessments will be conducted at 4 years of age to determine the impact that these factors have on later language development measured more objectively. It is possible that as children acquire more spoken language, risk factors that seemed to contribute little to development at 12 and 24 months may prove to be important predictors of language development by 4 years of age. This could partly reflect simple measurement issues: by 4 years, language and communication development can be formally tested using measures that are psychometrically more reliable and stable. Furthermore, although children seem to be equipped with the basic infrastructure and primed to acquire language, activation and acceleration rates may differ during the early years and may also be disrupted.19 On the basis of current evidence, we expect that many but not all of our late-talking 2-year-olds (19.7%) will recover and have language skills within normal limits at 4 years of age.5,6 Finally, by 4 years of age, definitions of language impairment are much more specific and should improve the strength of predictive associations.
We were not able to study the impact of childhood illnesses, although we did include perinatal factors (a major component of significant early illnesses) and note that the task force review did not suggest that this was a strong predictor. It is possible that other important risk factors that were not studied here exist; however, no others were obvious either in our own literature review at the inception of this study in 2001 or in the much more recent USPSTF review.7
This comprehensive study indicates that none of the putative risk factors for early language delay that were identified through a major systematic review could be used to predict language outcomes accurately in children at 24 months of age. Although they may more accurately predict language impairment in older children, they seem unlikely to be helpful in screening for early language delay. Two recommendations flow from these findings. First, we believe that language promotion activities in infants who are younger than 24 months should be universal or, if targeted, based on the level of communication skills displayed. Second, additional research should be directed toward defining more tightly the specific components of infant communicative development that most strongly predict language outcomes in the toddler and preschool years.
This study was supported by Australian National Health and Medical Research Council project grant 237106 and small grants obtained from the Murdoch Childrens Research Institute and the Faculty of Health Sciences, La Trobe University. Ethical approval was obtained from the Royal Children's Hospital Melbourne (23018) and La Trobe University (03–32) human ethics committees.
We sincerely acknowledge the contribution of the Victorian Maternal and Child Health nurses who assisted with recruitment of the sample, and we thank all of the participating parents.
- Accepted May 9, 2007.
- Address correspondence to Sheena Reilly, PhD, Speech Pathology Department, Royal Children's Hospital, Flemington Road, Parkville, Victoria 3086, Australia.
Drs Reilly, Bavin, and Prior initiated the project; Drs Reilly, Wake, and Eadie and Ms Barrett managed the project, including data collection and analysis; Dr Ukoumunne provided statistical advice and conducted the analyses; Dr Reilly wrote the article, and all authors contributed to planning, reviewing, and editing the manuscript; and Dr Reilly had full access to all of the data in the study, takes responsibility for the integrity of the data and the accuracy of the data analysis, and is the guarantor.
The authors have indicated they have no financial relationships relevant to this article to disclose.
- Nelson HD, Nygren P, Walker M, Panoscha R. Screening for speech and language delay in preschool children: systematic evidence review for the US Preventive Services Task Force. Pediatrics. 2006;117(2). Available at: www.pediatrics.org/cgi/content/full/117/2/e298
- Cantwell DP, Baker L. Psychiatric and learning disorders in children with speech and language disorders: a descriptive analysis. Adv Learn Behav Disabil. 1985;1:4
- Rice ML. Growth models of developmental language disorders: developmental language disorders. In: Rice ML, Warren SF, eds. Developmental Language Disorders: From Phenotypes to Etiologies. Mahwah, NJ: Lawrence Erlbaum; 2004:207–240
- Australian Bureau of Statistics. Socio-Economic Indexes for Areas. Canberra, Australia: Australian Bureau of Statistics; 2001
- Wetherby A, Prizant B. Communication and Symbolic Behaviour Scales. Baltimore, MD: Paul H. Brookes; 2002
- Fenson L, Dale PS, Reznick JS. The MacArthur Communicative Development Inventories: User's Guide and Technical Manual. San Diego, CA: Singular Publishing Group; 1993
- Kessler R, Mroczek D. Final Version of our Non-Specific Psychological Distress Scale [memorandum]. Ann Arbor, MI: Institute for Social Research; 1994
- Raven JC. Mill Hill Vocabulary Scale. Oxford, United Kingdom: JC Raven Ltd; 1997
- Goldstein H. Multilevel Statistical Models. London, United Kingdom: Arnold; 1995
- Stata Statistical Software [computer program]. Release 9.2. College Station, TX: Stata Corp; 2005
- Hosmer DW, Lemeshow S. Applied Logistic Regression. New York, NY: Wiley; 2000
- Nelson HD, Nygren P, Walker M, Panoscha R. Screening for Speech and Language Delay in Preschool Children: Systematic Evidence Review. Number 41. Rockville, MD: Agency for Healthcare Research and Quality, US Department of Health and Human Services; 2006
- Copyright © 2007 by the American Academy of Pediatrics