pediatrics
August 2004, VOLUME 114 / ISSUE 2

A Modified Screening Tool for Autism (Checklist for Autism in Toddlers [CHAT-23]) for Chinese Children

  1. Virginia Wong, FRCP, FHKAM, FHKC(Paed), FRCPCH,
  2. Lai-Hing Stella Hui, MMedSc,
  3. Wing-Cheong Lee, MBBS,
  4. Lok-Sum Joy Leung, MBBS,
  5. Po-Ki Polly Ho, MBBS,
  6. Wai-Ling Christine Lau, MBBS,
  7. Cheuk-Wing Fung, MBBS, MRCP, and
  8. Brian Chung, MBBS(Hons), MRCPHK
  1. From the Department of Paediatrics and Adolescent Medicine, University of Hong Kong, Hong Kong

Abstract

Background. There is a recent trend of a worldwide increase in the incidence of autistic spectrum disorder. Early identification and intervention have proved to be beneficial. The original version of the Checklist for Autism in Toddlers (CHAT) was a simple screening tool for identification of autistic children at 18 months of age in the United Kingdom. Children with an absence of joint attention (including protodeclarative pointing and gaze monitoring) and pretend play at 18 months were at high risk of autism. Section A of the CHAT was a self-administered questionnaire for parents, with 9 yes/no questions addressing the following areas of child development: rough and tumble play, social interest, motor development, social play, pretend play, protoimperative pointing (pointing to ask for something), protodeclarative pointing, functional play, and showing. Section B of the CHAT consisted of 5 items, which were recorded with observation of the children by general practitioners or health visitors. The 5 items addressed the child’s eye contact, ability to follow a point (gaze monitoring), pretend (pretend play), produce a point (protodeclarative pointing), and make a tower of blocks. A 6-year follow-up study of >16 000 children screened with the CHAT at 18 months in the United Kingdom showed a sensitivity of only 0.40 and a specificity of 0.98, with a positive predictive value (PPV) of 0.26. Rescreening using the same instrument at 19 months for those who failed the 18-month screening yielded a higher PPV of 0.75. Therefore, children were likely to have autism if they failed the CHAT at 18 months and failed again at 19 months. It was estimated that consistent failure in 3 key questions (ie, protodeclarative pointing, gaze monitoring, and pretend play) at 18 months indicated an 83.3% risk of having autism. 
Because of the poor sensitivity of the original CHAT for autism, a Modified Checklist for Autism in Toddlers (M-CHAT), consisting of 23 questions, with 9 questions from the original CHAT and an additional 14 questions addressing core symptoms present among young autistic children, was designed in the United States. The original observational part (ie, section B) was omitted. The M-CHAT was designed as a simple, self-administered, parental questionnaire for use during regular pediatric visits. The more questions children failed, the higher their risk of having autism. Two criteria were used to measure the sensitivity and specificity of M-CHAT. Criterion 1 used any 3 of the 23 questions, and criterion 2 used 2 of the 6 best questions that could be used to discriminate autism from other groups. The sensitivity and specificity for criterion 1 were 0.97 and 0.95 and those for criterion 2 were 0.95 and 0.99, respectively. M-CHAT had a better sensitivity than the original CHAT, because children up to 24 months of age were screened, with the aim of identifying those who might regress between 18 and 24 months. The 6 best questions of the M-CHAT addressed areas of social relatedness (interest in other children and imitation), joint attention (protodeclarative pointing and gaze monitoring), bringing objects to show parents, and responses to calling. Joint attention was addressed in the original CHAT, whereas the other areas were addressed only in the M-CHAT. To date, there has been no study of the application of either the original CHAT or the M-CHAT for Chinese populations.

Objectives. CHAT-23 is a new checklist translated into Chinese, combining the M-CHAT (23 questions) with graded scores and section B (observational section) of the CHAT. We aimed to determine whether CHAT-23 could discriminate autism at mental ages of 18 to 24 months for Chinese children and to determine the best combination of questions to identify autism.

Methods. A cross-sectional cohort study was performed with 212 children with mental ages of 18 to 24 months. The children were categorized into 2 groups, ie, group 1 (N = 87) (autistic disorder: N = 54; pervasive developmental disorder: N = 33) and group 2 (N = 125) (nonautistic). The checklist included self-administered questionnaires with 23 questions (part A) and direct observations of 5 items by trained investigators (part B). We performed discriminant function analysis to determine the key questions that could best discriminate autism from nonautism. The sensitivity and specificity of CHAT-23 were calculated.

Results. We found that 7 key questions, addressing areas of joint attention, pretend play, social relatedness, and social referencing, were identified as discriminative for autism. For part A, failing any 2 of 7 key questions, ie, question 13 (does your child imitate you? [eg, you make a face; will your child imitate it?]), question 5 (does your child ever pretend, for example, to talk on the phone or take care of dolls, or pretend other things?), question 7 (does your child ever use his/her index finger to point, to indicate interest in something?), question 23 (does your child look at your face to check your reaction when faced with something unfamiliar?), question 9 (does your child ever bring objects over to you [parent] to show you something?), question 15 (if you point at a toy across the room, does your child look at it?), and question 2 (does your child take an interest in other children?), yielded sensitivity of 0.931 and specificity of 0.768. Failing any 6 of all 23 questions produced sensitivity of 0.839 and specificity of 0.848. For part B, failing any 2 of 4 items produced sensitivity of 0.736, specificity of 0.912, and PPV of 0.853. The 4 observational items were as follows: item B1: during the appointment, has the child made eye contact with you? item B2: does the child look across to see what you are pointing at? item B3: does the child pretend to pour out tea, drink it, etc?; item B4: does the child point with his/her index finger at the light?

Conclusion. We found that integrating the screening questions of the M-CHAT (from the United States) and observational section B of the original CHAT (from the United Kingdom) yielded high sensitivity and specificity in discriminating autism at 18 to 24 months of age for our Chinese cohort. This new screening instrument (CHAT-23) is simple to administer. We found that a 2-stage screening program for autism can offer a cost-effective method for early detection of autism at 18 to 24 months. For CHAT-23, use of both the parental questionnaire and direct observation and use of the criterion of failing any 2 of 7 key questions yielded the highest sensitivity but a relatively lower specificity, whereas use of part B yielded the highest specificity but a lower sensitivity. We recommend identifying the possible positive cases with part A (parental questionnaire) and then proceeding to part B (observation) with trained assessors. The proposed algorithm for screening for autism is as follows. 1) The parents or chief caretakers complete a 23-item questionnaire when their children are 18 to 24 months of age. 2) The parents mail, fax, or hand this 23-item questionnaire to the local child health agency. 3) Clerical staff members check for and score failure, with the criteria of failing any 2 of 7 key questions or failing any 6 of 23 questions; if either criterion is met, then the staff members highlight the medical records of the suspicious cases. 4) Trained child health care professionals observe the children who failed any 2 of 7 key questions or any 6 of 23 questions. These identified patients are observed for 5 minutes for part B of the CHAT-23. 5) Any child who fails any 2 of 4 items requires direct referral to a comprehensive autism evaluation team, for early diagnostic evaluation and early intervention. The high sensitivity and specificity of the criteria observed in our study suggested that CHAT-23 might be used to differentiate children with autism. 
Additional international collaboration with the use of the CHAT, M-CHAT, and CHAT-23 could provide more prospective epidemiologic data, to establish whether there is a genuine increase in the worldwide incidence of autism.

  • CHAT
  • M-CHAT
  • CHAT-23
  • autistic spectrum disorder
  • autism
  • pervasive developmental disorder
  • children
  • Chinese
  • screening
  • sensitivity
  • specificity

There has been a recent trend of a worldwide increase in the incidence of autistic spectrum disorders or autism.1–6 Early identification and intervention have proven to be beneficial.7–9

The original version of the Checklist for Autism in Toddlers (CHAT) was a simple screening tool for the identification of autistic children at 18 months, in the United Kingdom.10 Baron-Cohen et al11,12 hypothesized that children with an absence of joint attention (including protodeclarative pointing and gaze monitoring) and pretend play at 18 months were at high risk of autism.

Section A of the CHAT is a self-administered questionnaire for parents. It consists of 9 yes/no questions addressing the following areas of child development: rough and tumble play, social interest, motor development, social play, pretend play, protoimperative pointing (pointing to ask for something), protodeclarative pointing, functional play, and showing. Section B of the CHAT consists of 5 items recorded through observation of the child by a general practitioner or health visitor. The 5 items address the child’s eye contact and ability to follow a point (gaze monitoring), pretend (pretend play), produce a point (protodeclarative pointing), and make a tower of blocks.

A 6-year follow-up study in the United Kingdom of >16 000 children screened with the CHAT at 18 months showed a sensitivity of only 0.40 and a specificity of 0.98, with a positive predictive value (PPV) of 0.26.13 Rescreening using the same instrument at 19 months for those who failed the 18-month screening gave a higher PPV of 0.75. Therefore, the authors proposed that children were likely to have autism if they failed the CHAT at 18 months and failed again at 19 months.14 It was estimated that consistent failure in 3 key questions (ie, protodeclarative pointing, gaze monitoring, and pretend play) at 18 months was associated with an 83.3% risk of having autism.11

Because of the poor sensitivity of the original CHAT for autism, Robins et al15 attempted to modify the original CHAT to a form that was questionnaire-based only, ie, the Modified Checklist for Autism in Toddlers (M-CHAT). M-CHAT consisted of 23 questions, with 9 questions from the original CHAT and an additional 14 questions that addressed core symptoms present among young autistic children. The original observational section (ie, section B) was omitted.

The M-CHAT was designed as a simple, self-administered, parental questionnaire for use during regular pediatric visits. It did not rely on physicians’ observations.15 The more questions children failed, the higher their risk of having autism. Subjects identified in the screening were assessed with the Vineland Adaptive Behavior Scales, the Bayley Scales of Infant Development, the Communication and Symbolic Behavior Scale, the Childhood Autism Rating Scale, and a semistructured interview based on Diagnostic and Statistical Manual of Mental Disorders, 4th ed, criteria for autistic disorder.

The authors used 2 criteria to measure the sensitivity and specificity of the M-CHAT.15 Criterion 1 used any 3 of the 23 questions, and criterion 2 used 2 of the 6 best questions that could be used to discriminate the group with autism from other groups. The sensitivity and specificity of criterion 1 were 0.97 and 0.95 and those of criterion 2 were 0.95 and 0.99, respectively. Therefore, M-CHAT had better sensitivity than the original CHAT, because children up to 24 months of age were screened, with the aim of identifying those who might regress between 18 and 24 months of age.

The 6 best questions of the M-CHAT addressed social relatedness (interest in other children and imitation), joint attention (protodeclarative pointing and gaze monitoring), bringing objects to show parents, and responses to calling. Joint attention was addressed also in the original CHAT, whereas the other areas were addressed only in the M-CHAT.

To date, there has been no study of the application of either the original CHAT or the M-CHAT in Chinese populations. The original CHAT or M-CHAT could not be used in Hong Kong, for the following reasons. Firstly, the original CHAT and M-CHAT were both written in English. Secondly, because of cultural differences, parents or children in Hong Kong might not respond in the same ways. Thirdly, studies showed that it was important to use both parental information and direct observation of the child during screening for autism among children.7–9

This research is the first study in a Chinese population to examine the possibility of devising a new screening instrument by combining the observational part of the original CHAT and the M-CHAT to detect autism among Chinese children. The objectives of this study were to examine the effectiveness of a modified version, integrating the CHAT and the M-CHAT, in discriminating autism cases from nonautism cases and to determine the best combination of questions to identify autism in a Chinese population.

METHODS

Study Design

A cross-sectional study was performed. The sample included 276 children, 13 to 86 months of age, with normal development, developmental delay, or autism spectrum disorder. The subjects were recruited from 5 normal nurseries, 2 integrated child care centers, 7 early education and training centers, and 6 special child care centers under the Social Welfare Department and Education Department, and the neurology clinics of the Duchess of Kent Children’s Hospital in Hong Kong.

Group 1 (Autism)

One hundred twenty-eight children were recruited. All children in group 1 had been diagnosed with autism (N = 54) or pervasive developmental disorders (N = 33) by developmental pediatricians or clinical psychologists. The final diagnoses of autistic disorder or pervasive developmental disorders were made on the basis of clinical interviews and observations, Diagnostic and Statistical Manual of Mental Disorders, 4th ed,16 and the Autism Diagnostic Interview-Revised.17 The developmental age in the first evaluation was assessed with the Griffiths’ Mental Developmental Scale (GMDS)18 or the Symbolic Play Test (SPT)19 (if the children were uncooperative in GMDS assessments). In the present study, we recruited autistic children who were <78 months of age, because we found that it was difficult to recruit autistic children diagnosed as early as 18 to 24 months.

Group 2 (Nonautism)

This group consisted of both normal children and children with developmental delays. Sixty-eight normal children were recruited. They all passed developmental screening tests in the maternal and child health clinics of Hong Kong and had not been suspected by any pediatricians of having developmental delays or autism at the time of study entry.

Eighty children with developmental delays were recruited. They had been diagnosed as having global delays or language delays by pediatricians, using the GMDS, at the first assessment in child assessment centers.

Calculation of Mental Age

In this study, we used the SPT19 to assess the mental ages of all children during field trips to the special schools, for both groups. This test was administered because, in our past experience assessing autistic children, the children were more cooperative in performing the 4 play items within their short attention spans (<15 minutes); the GMDS required a longer time to complete (∼30–60 minutes), depending on the chronologic age and tolerance of the child. In our pilot study, we administered part B of the CHAT during the same field trip as the assessment of mental age, and we found that children in both groups could tolerate performing the SPT and part B of the CHAT within the same session better than using the GMDS.

To study subjects of the same age as for the original CHAT (18 months) and the M-CHAT (18–24 months), we included only children with mental ages of 18 to 24 months (SPT raw scores: 9–14) in our final data analysis. Because of the difficulty of recruiting autistic children <24 months of age and the limited number of subjects in group 1, we did not calculate the sample size. We attempted to recruit more patients with autism into our study, so that the discriminating power of the new instrument would be better. The final case (autism)/control (nonautism) ratio was 1:1.5 in this study, and the power was reasonable. Because the power was expected to level off when the case/control ratio was 1:2, we did not include more control subjects.

Inclusion Criteria

Only Chinese children with mental ages (assessed with the SPT) of 18 to 24 months were included in the final analysis.

Exclusion Criteria

Children with mental ages of <18 months (SPT raw score: <9) or >24 months (SPT raw score: >14) were excluded. Children with active medical conditions such as epilepsy and those receiving any anticonvulsant were excluded.
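As an illustration only, the inclusion and exclusion rules above can be written as a small eligibility check. This Python sketch is not part of the study; the `Child` record and its field names are hypothetical, and epilepsy and anticonvulsant use stand in for the broader category of active medical conditions.

```python
from dataclasses import dataclass

# SPT raw scores of 9 to 14 correspond to mental ages of 18 to 24 months (from the text).
SPT_MIN_RAW_SCORE = 9
SPT_MAX_RAW_SCORE = 14


@dataclass
class Child:
    """Hypothetical record; the field names are assumptions for this sketch."""
    spt_raw_score: int
    has_epilepsy: bool = False
    on_anticonvulsant: bool = False


def is_eligible(child: Child) -> bool:
    """Return True if the child would enter the final analysis."""
    in_mental_age_range = SPT_MIN_RAW_SCORE <= child.spt_raw_score <= SPT_MAX_RAW_SCORE
    excluded_medically = child.has_epilepsy or child.on_anticonvulsant
    return in_mental_age_range and not excluded_medically
```

For example, `is_eligible(Child(spt_raw_score=8))` is false because the raw score falls below the 18-month threshold.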

CHAT-23

To avoid bias among parents with autistic children who would look at the questionnaire, we changed the title of the questionnaire to “Behavioral and Communication Checklist for Children.” Therefore, although CHAT-23 was administered to parents after the diagnosis of autism had been made, the parents were blinded to the objective of this questionnaire.

CHAT-23 combined 23 questions of the M-CHAT (parental questionnaire) and section B (observational checklist) of the original CHAT. CHAT-23 was a Chinese translation of the M-CHAT and section B of the original CHAT. Both sections were translated into traditional Chinese for the convenience of primary health care workers in the Chinese-speaking community, whether Mandarin or Cantonese, although Chinese script writing is in the traditional Mandarin style. The translations of CHAT and M-CHAT were performed with written authorization from the original authors.

We also administered CHAT-23 to nonautistic children before the field trip, to validate the 23 questions in part A and the 5 items in part B. Before final endorsement of this project, back-translation of CHAT-23 was performed, for caretakers and children who could understand written and spoken English.

CHAT-23 Part A

We also performed a pilot study with the translated Chinese CHAT-23 before data collection. We found that parents often could not give definite answers to simple yes/no questions; they reported back to us that their children’s symptoms were “occasional.” Therefore, we added a grading score to the responses, instead of a simple yes/no, for 22 questions of the M-CHAT (all except question 16, ie, “Does your child walk?”). The graded responses were changed to never, seldom, usually, or often, with semiquantification as 0%, <25%, 25% to 50%, and >50%, respectively.

For the final version of CHAT-23 part A, a grading score was added to the responses, instead of yes/no, except for question 16. The responses were changed to never, seldom, usually, or often. The scoring system for part B of CHAT-23 was also modified for easy scoring. Assessors were required to follow a set of standardized procedures in part B of CHAT-23, for more objective and reliable scoring.

CHAT-23 Part B

Section B of the original CHAT formed part B of CHAT-23. This was a 5-item observational checklist assessed by trained assessors. Four items were eye contact (B1), gaze monitoring (B2), pretend play (B3), and protodeclarative pointing (B4). The fifth item was an assessment of the mental ability of the children, in which the children were asked to build with blocks. We used the same grading system as in the scoring scales. The graded responses were changed to never, seldom, usually, or often, with semiquantification as 0%, <25%, 25% to 50%, and >50%, respectively.

Administration of CHAT-23 required ∼10 minutes, on average. We obtained information from parents (by asking them to mark the boxes for the 23 questions, <5 minutes) and by direct observation (5 minutes for part B).

Failing Criteria in CHAT-23

Part A (23 Questions)

We defined failing any of the 23 questions in part A of CHAT-23 as follows: answers of never or seldom were considered failing, whereas answers of usually or often were considered passing, except for questions 11, 18, 20, and 22, for which answers of usually or often represented failures and answers of never or seldom represented passes. For question 16, no was considered failing and yes was considered passing.
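The part A failing rule can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the scoring procedure used in the study: the function names are hypothetical, and answers are assumed to be recorded as the lowercase English words never, seldom, usually, often, yes, or no.

```python
# Questions for which "usually" or "often" counts as a failure (from the text).
REVERSE_SCORED = {11, 18, 20, 22}
# Question 16 ("Does your child walk?") kept its yes/no format.
YES_NO_QUESTION = 16
GRADED_ANSWERS = {"never", "seldom", "usually", "often"}


def fails_question(number: int, answer: str) -> bool:
    """Return True if the answer counts as a failure for this question."""
    answer = answer.strip().lower()
    if number == YES_NO_QUESTION:
        if answer not in {"yes", "no"}:
            raise ValueError("question 16 expects 'yes' or 'no'")
        return answer == "no"
    if answer not in GRADED_ANSWERS:
        raise ValueError(f"unexpected answer {answer!r}")
    low_frequency = answer in {"never", "seldom"}
    # For reverse-scored questions a high frequency is the failing answer.
    return (not low_frequency) if number in REVERSE_SCORED else low_frequency


def failed_questions(responses: dict) -> set:
    """Map {question number: answer} to the set of failed question numbers."""
    return {q for q, a in responses.items() if fails_question(q, a)}
```

With these definitions, a response of seldom fails question 13, whereas a response of seldom passes question 11.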

Part B (5 Items)

Item 1

We defined failing as follows: for eye contact, never or seldom was considered failing, whereas usually or often was considered passing.

Item 2

For gaze monitoring, yes was considered passing and no was considered failing.

Item 3

For pretend play, passing was considered only when the child could play without imitation; imitation after demonstration by the interviewer or no pretend play was considered failing.

Item 4

For protodeclarative pointing, a pass was warranted only when the child could point and look at the object. All other responses (look only, point only, or never) were regarded as failing.
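The 4 scored items of part B can be encoded in the same way. The sketch below is illustrative only; the observation labels (for example `spontaneous` and `point_and_look`) are hypothetical names chosen here to represent the categories described above, not codes taken from the CHAT-23 form.

```python
# Observations an assessor may record for each item (labels are assumptions).
ALLOWED = {
    1: {"never", "seldom", "usually", "often"},                  # eye contact
    2: {"yes", "no"},                                            # gaze monitoring
    3: {"spontaneous", "imitation", "none"},                     # pretend play
    4: {"point_and_look", "look_only", "point_only", "never"},   # protodeclarative pointing
}

# Observations that count as a pass; everything else allowed is a failure.
PASSING = {
    1: {"usually", "often"},
    2: {"yes"},
    3: {"spontaneous"},      # play without imitation of the interviewer
    4: {"point_and_look"},   # the child must both point and look
}


def fails_part_b_item(item: int, observation: str) -> bool:
    """Return True if the recorded observation is a failure for this item."""
    if item not in ALLOWED:
        raise ValueError("only items 1 to 4 are scored")
    if observation not in ALLOWED[item]:
        raise ValueError(f"unexpected observation {observation!r} for item {item}")
    return observation not in PASSING[item]


def count_part_b_failures(observations: dict) -> int:
    """Count failed items in a {item number: observation} mapping."""
    return sum(fails_part_b_item(i, o) for i, o in observations.items())
```

Item B5 (block building) is deliberately left out of the sketch, because the text describes it as an assessment of mental ability rather than a scored screening item.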

Inter-rater Reliability

Part B of CHAT-23 was administered by 12 interviewers (11 medical students attending a special study module in the Department of Pediatrics and Adolescent Medicine of the Faculty of Medicine, University of Hong Kong, in June 2002, and the second author). Before data collection, the interviewers were tested for inter-rater reliability. Normal and autistic children were videotaped, with parental consent. All interviewers assessed the cases, and the inter-rater reliability was determined by using weighted κ values. The inter-rater reliability for scoring CHAT-23 (part B) was 0.95. Interviewers were blinded with respect to the diagnoses for the subjects at the time of administration of CHAT-23, although a greater chance of meeting children with autism or delayed development in special nurseries might be expected.
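For readers unfamiliar with weighted κ, the following Python sketch computes it for one pair of raters on an ordinal scale. It is an illustration under assumptions: linear disagreement weights are used, although the text does not state which weighting scheme was applied or how the values for 12 interviewers were combined, and the ratings in the usage note are invented.

```python
def weighted_kappa(rater_a, rater_b, categories):
    """Linear-weighted Cohen's kappa for two raters on ordered categories."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of cases")
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Observed joint proportions of (rater A category, rater B category).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n

    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    disagreement_observed = 0.0
    disagreement_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)  # 0 on the diagonal, 1 at the extremes
            disagreement_observed += weight * observed[i][j]
            disagreement_expected += weight * row[i] * col[j]

    if disagreement_expected == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - disagreement_observed / disagreement_expected
```

As a usage example with invented ratings, two raters who agree on 3 of 4 cases and differ by one step on the fourth (`often` versus `usually`) obtain a weighted κ of 7/9, or approximately 0.78.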

Statistical Analyses

SPSS software (SPSS, Inc, Chicago, IL) was used for analysis. Two-tailed z tests were used to compare 2 proportions, and χ2 tests were used to determine the association between 2 categorical variables.
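A two-tailed z test for 2 proportions can be sketched as follows. This is a generic pooled-variance version written for illustration; it is an assumption that it matches the SPSS procedure used in the study. In the usage note, the count of 61 of 87 is back-calculated from the 70.1% failure rate reported for question 13 in group 1, and the group 2 count is hypothetical.

```python
import math


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int):
    """Return (z, two-tailed P value) for H0: the two proportions are equal."""
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    # Two-tailed P value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# 61/87 is derived from the reported 70.1%; 10/125 is a hypothetical comparison value.
z, p_value = two_proportion_z_test(61, 87, 10, 125)
```

With these inputs the difference is significant at P < .001, which is the threshold quoted for the item analysis.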

Discriminant function analysis20 was used to identify the key questions that could be used to differentiate between autistic and nonautistic children. The sensitivity, specificity, and PPV of CHAT-23 were calculated. Receiver operating characteristic (ROC) curves21 were used to calculate the number of key questions that could best differentiate between the autism and nonautism groups.

Informed Consent and Ethics Committee Approval

The study was approved by the institutional review board of the Faculty of Medicine, University of Hong Kong. Parental consent was obtained before interviews.

RESULTS

Subjects

The final analysis included 212 children. Sixty-four children were excluded, ie, 58 children with mental ages (assessed with the SPT) of <18 months or >24 months and 6 with epilepsy. The distribution of these 64 excluded patients was as follows: 10 from the normal group, 13 from the delayed development group, and 41 from the autism group.

Group 1 (autism) consisted of 87 children (77 boys and 10 girls), with a mean age of 51.3 ± 12.1 months (range: 26–86 months). Group 2 (nonautism) consisted of 125 children (79 boys and 46 girls) with delayed development (N = 67) or normal development (N = 58). The mean age for group 2 was 29.1 ± 7.8 months (range: 18–52 months). The mean age for the group with developmental delays was 33.5 ± 7.8 months (range: 16–52 months). The mean age for the normal group was 23.9 ± 3.9 months (range: 16–33 months). We included the 2 children with chronologic ages of 16 months because their mental ages were determined (with the SPT) to be 18 months.

Item Analysis of All 23 Questions

Two-tailed z tests showed that, except for questions 1 and 16, all questions were able to distinguish between group 1 and group 2 (P < .001) (Table 1).

TABLE 1.

Item Analysis of the Usefulness of the 23 Questions in Differentiating Between Autism and Nonautism Groups, Using 2-Tailed z Tests

Identification of Key Questions

Discriminant function analysis was performed for part A of CHAT-23, to determine the order of questions that was useful in differentiating between the autism and nonautism groups. Table 2 presents the standardized canonical discriminant function coefficients and the percentages of subjects who failed the 23 questions. Seven key questions were identified with these criteria, as follows (in descending order): question 13 (does your child imitate you? [eg, you make a face; will your child imitate it?]), question 5 (does your child ever pretend, for example, to talk on the phone or take care of dolls, or pretend other things?), question 7 (does your child ever use his/her index finger to point, to indicate interest in something?), question 23 (does your child look at your face to check your reaction when faced with something unfamiliar?), question 9 (does your child ever bring objects over to you [parent] to show you something?), question 15 (if you point at a toy across the room, does your child look at it?), and question 2 (does your child take an interest in other children?).

TABLE 2.

Standardized Canonical Discriminant Function Coefficients of 23 Questions to Test for the Ranking of Key Questions for Differentiating the Groups

Determination of Cutoff Criteria for Differentiation of Autism Versus Nonautism

Criterion 1, Using the 7 Key Questions

The 7 key questions identified were as follows (in descending order): question 13 (does your child imitate you? [eg, you make a face; will your child imitate it?]), question 5 (does your child ever pretend, for example, to talk on the phone or take care of dolls, or pretend other things?), question 7 (does your child ever use his/her index finger to point, to indicate interest in something?), question 23 (does your child look at your face to check your reaction when faced with something unfamiliar?), question 9 (does your child ever bring objects over to you [parent] to show you something?), question 15 (if you point at a toy across the room, does your child look at it?), and question 2 (does your child take an interest in other children?) (Tables 3 and 4).

TABLE 3.

Sensitivity and Specificity of Using Any Combination of the 7 Key Questions in Part A (Questions 13, 5, 7, 23, 9, 15, and 2) for Discrimination

TABLE 4.

Sensitivity and Specificity of Using Any Combination of 7 Key Questions in Part A for Differentiating the Groups

The 7 questions were then subjected to ROC analysis, to determine the optimal cutoff point. The ranking of the questions was considered (ie, the question with the lowest weighting in the list was deleted each time). Sensitivity, specificity, and PPV were calculated (Table 3).

The sensitivities obtained with this analysis were low. When question 13 was used as the discriminating question, only 70.1% of autistic children failed the question. If questions 13 and 5 were used, the sensitivity decreased to 0.460, which was not acceptable. Therefore, this criterion could not be used to detect autism.

We then used another method to obtain higher sensitivity. We discarded the ranking of the questions and counted how many of the 7 key questions each child failed, in any combination. For example, 28 of the 87 children in group 1 failed at least 6 of the 7 questions (ie, failed 6 or 7 questions), which gave a sensitivity of 0.322 (Table 4). The sensitivity, specificity, and PPV of failing any 2 or 3 of 7 key questions were reasonable (Table 4).

We then used a ROC curve to determine the best discriminating criteria for the 7 key questions. Figure 1 presents the ROC curve for selection of the optimal cutoff point. Using the ROC curve, we found that the area under the curve was 0.917 (95% confidence interval: 0.878–0.957). The optimal cutoff point for differentiating between autism and nonautism was failing any 2 questions of 7 key questions, with a sensitivity of 0.931 and a specificity of 0.768.

Fig 1.

ROC curve for selection of the optimal cutoff point with the 7 key questions. The ROC curve displays the relationship between the true-positive rate (sensitivity) and the false-positive rate (1 − specificity) for the different possible cutoff points for a screening or diagnostic tool.21 The best cutoff point lies closest to the top left corner of the ROC space, because this represents optimal sensitivity and specificity. The area under the ROC curve represents the probability of correctly distinguishing between the 2 groups (the arrow represents the best result). An area under the curve of >0.9 represents excellent differentiating power of the screening tool.
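The cutoff selection described in the caption can be sketched in Python. The example is illustrative only: the per-child counts of failed key questions are invented, the function names are hypothetical, and the "closest to the top left corner" rule is implemented as the smallest Euclidean distance to the point (false-positive rate 0, sensitivity 1), which is one common reading of that criterion.

```python
import math


def roc_points(case_scores, control_scores, max_score):
    """Sensitivity and specificity for each rule 'fails at least `cutoff` questions'."""
    points = []
    for cutoff in range(0, max_score + 2):
        sensitivity = sum(s >= cutoff for s in case_scores) / len(case_scores)
        specificity = sum(s < cutoff for s in control_scores) / len(control_scores)
        points.append((cutoff, sensitivity, specificity))
    return points


def best_cutoff(points):
    """Cutoff whose ROC point lies closest to the top left corner."""
    return min(points, key=lambda p: math.hypot(1 - p[1], 1 - p[2]))[0]


def area_under_curve(case_scores, control_scores):
    """AUC as the probability that a random case scores above a random control."""
    wins = sum((c > d) + 0.5 * (c == d) for c in case_scores for d in control_scores)
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical numbers of failed key questions (0 to 7) per child.
cases = [0, 1, 2, 3, 3, 4, 5, 6, 7, 2]
controls = [0, 0, 0, 1, 1, 0, 2, 1, 0, 3]

cutoff = best_cutoff(roc_points(cases, controls, max_score=7))  # 2 for this toy data
auc = area_under_curve(cases, controls)                         # 0.85 for this toy data
```

The toy data happen to give a cutoff of 2 failed questions, but this only illustrates the mechanics and says nothing about the study data.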

Criterion 2, Using All 23 Questions

Using the criterion for the 7 key questions, we found that the specificity was only 0.768. We then attempted to use all 23 questions in part A, to determine whether there was an optimal cutoff point that could be used as the differentiating criterion for greater specificity. The sensitivity, specificity, and PPV were calculated (Table 5). Because no subject failed all 23 questions, any 22 questions, or any 21 questions in the parental questionnaire, calculations were not performed for these criteria.

TABLE 5.

Sensitivity and Specificity of Using Any Combination of 23 Questions in Part A for Differentiating the Groups

A ROC curve was plotted to determine the optimal cutoff point that could distinguish best between group 1 and group 2 (Fig 2). The sensitivity, specificity, and PPV of failing 7, 6, or 5 of 23 questions were reasonably high. The area under the ROC curve was 0.905 (95% confidence interval: 0.864–0.947). The optimal cutoff point was failing any 6 of 23 questions in part A of CHAT-23.

Fig 2.

ROC curve for selection of the optimal cutoff point with the 23 questions.

Part B of CHAT-23

Part B of CHAT-23 was an observational checklist. Item B5 (block building) was a test of the mental ability of the subject and was not used in the statistical tests. Two-tailed z tests were used to test for the significance of differences in the percentages of the 2 groups failing on 4 items. We found that all 4 items (item B1: during the appointment, has the child made eye contact with you? item B2: does the child look across to see what you are pointing at? item B3: does the child pretend to pour out tea, drink it, etc? item B4: does the child point with his/her index finger at the light?) could distinguish between autism and nonautism groups (Tables 6 and 7).

TABLE 6.

Percentage of Subjects Who Failed Each of the 4 Items in Part B of CHAT-23

TABLE 7.

Selection of the Minimal Number of Items in CHAT-23 (Part B) for Sensitivity and Specificity

Tables 7 and 8 present the sensitivity and specificity of failing any of the 4 items in part B of CHAT-23. Failing any 2 of 4 items in part B produced a sensitivity of 0.736, specificity of 0.912, and PPV of 0.853.
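As a worked check of how these 3 figures relate, the Python sketch below recomputes them from a 2 × 2 classification. The group sizes (87 and 125) come from the text, but the cell counts are back-calculated here from the reported proportions and should be read as assumptions, not as data taken from Tables 7 and 8.

```python
def screening_metrics(true_pos: int, false_neg: int, true_neg: int, false_pos: int) -> dict:
    """Sensitivity, specificity, and PPV from the cells of a 2 x 2 table."""
    return {
        "sensitivity": true_pos / (true_pos + false_neg),
        "specificity": true_neg / (true_neg + false_pos),
        "ppv": true_pos / (true_pos + false_pos),
    }


# Assumed counts for "failing any 2 of 4 items" in part B:
# 64 of 87 children in group 1 flagged, 114 of 125 children in group 2 not flagged.
part_b = screening_metrics(true_pos=64, false_neg=23, true_neg=114, false_pos=11)
rounded = {name: round(value, 3) for name, value in part_b.items()}
# rounded == {"sensitivity": 0.736, "specificity": 0.912, "ppv": 0.853}
```

The PPV computed in this way depends on the proportion of children with autism in the sample (87 of 212 here), so it would be expected to differ in a population with a different prevalence.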

TABLE 8.

Summary of Comparison of Sensitivity, Specificity, and PPV of Using 3 Discriminating Criteria of CHAT-23

APPENDIX 1.

English Version of the CHAT-23

DISCUSSION

Screening Instrument and Algorithm

Our study was the first to determine whether a modified screening instrument (ie, CHAT-23) could be used to distinguish children with autism at 18 to 24 months of age, in the Chinese community. We found that, by combining the 23 questions in the M-CHAT with section B (observational part) of the original CHAT, we could screen for autism in a cohort study of children with mental ages of 18 to 24 months.22,23

This new screening instrument (CHAT-23) is simple to administer. The proposed algorithm for screening for autism is as follows. 1) The parents or chief caretakers complete a 23-item questionnaire when their children are 18 to 24 months of age. 2) The parents mail, fax, or hand this 23-item questionnaire to the local child health agency. 3) Clerical staff members check for and score failure, with the criteria of failing any 2 of 7 key questions or failing any 6 of 23 questions; if either criterion is met, then the staff members highlight the medical records of the suspicious cases. 4) Trained child health care professionals observe the children who failed any 2 of 7 key questions or any 6 of 23 questions. These identified patients are observed for 5 minutes for part B of the CHAT-23. 5) Any child who fails any 2 of 4 items requires direct referral to a comprehensive autism evaluation team, for early diagnostic evaluation and early intervention. With this algorithm, we hope that most autistic children can be identified by 24 months of age.
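The scoring steps of the proposed algorithm can be sketched in code as follows. This is a minimal illustration under stated assumptions, not a validated scoring program: the function names, the `observe_part_b` callback, and the returned labels are hypothetical, and the label for a child who fails part A but passes part B is a placeholder, because the algorithm above does not specify that outcome.

```python
# Question numbers of the 7 key questions in part A, as listed in the Discussion.
KEY_QUESTIONS = frozenset({2, 5, 7, 9, 13, 15, 23})
# Part B items used for scoring; item B5 (block building) is not scored.
SCORED_B_ITEMS = frozenset({"B1", "B2", "B3", "B4"})


def fails_part_a(failed_questions):
    """True if the child fails any 2 of 7 key questions or any 6 of 23 questions."""
    failed = set(failed_questions)
    return len(failed & KEY_QUESTIONS) >= 2 or len(failed) >= 6


def fails_part_b(failed_items):
    """True if the child fails any 2 of the 4 scored observational items."""
    return len(set(failed_items) & SCORED_B_ITEMS) >= 2


def screen(failed_questions, observe_part_b):
    """Two-stage screen; the part B observation runs only after a part A failure.

    observe_part_b is a hypothetical callback that returns the set of failed
    part B items recorded by a trained observer.
    """
    if not fails_part_a(failed_questions):
        return "screen negative"
    if fails_part_b(observe_part_b()):
        return "refer"
    return "part A positive only"  # placeholder label; outcome not specified in the text
```

For example, `screen({7, 15}, lambda: {"B1", "B2"})` returns `"refer"`, because 2 key questions and 2 part B items are failed.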

We carefully identified the criteria for early identification of autism (Table 8). Criterion 1 was failing any 2 of the 7 key questions, and criterion 2 was failing any 6 of all 23 questions in part A of CHAT-23. The sensitivity of criterion 1 was greater than that of criterion 2, although the specificity and PPV of criterion 1 were not as great as those of criterion 2. A screening tool should be able to identify as many true positives as possible and to reduce the number of false negatives. Therefore, we propose that the use of 7 key questions would be simpler and more convenient than use of the full set of 23 questions, allowing the scorer to make a rapid determination of whether to proceed with observational part B. These children could be called back for a session and, with the ease of administration of section B of the original CHAT, approximately 30 to 40 children could be evaluated in a 3-hour session; this is definitely cost-effective for a successful screening program for autism.

Seven Key Questions

The 7 key questions identified were related to the following areas of normal child development: 1) social relatedness: question 13 (imitation) (does your child imitate you? [eg, you make a face; will your child imitate it?]) and question 2 (interest in other children) (does your child take an interest in other children?); 2) pretend play: question 5 (pretend play) (does your child ever pretend, for example, to talk on the phone or take care of dolls, or pretend other things?); 3) joint attention: question 7 (protodeclarative pointing) (does your child ever use his/her index finger to point, to indicate interest in something?), question 9 (bringing objects to show parents) (does your child ever bring objects over to you [parent] to show you something?), and question 15 (following a point) (if you point at a toy across the room, does your child look at it?); 4) social referencing: question 23 (checking others’ reaction) (does your child look at your face to check your reaction when faced with something unfamiliar?). These areas of development represent the core deficits among children with autism. Children >18 months of age should be able to pass these 7 key questions.

If the questionnaire showed that a child failed any 2 of the 7 key questions (questions 13, 5, 7, 23, 9, 15, and 2), then the child should be suspected of having autism and additional clinic observation of the 4 items in part B should be conducted. If the child failed any 2 of the 4 items in part B, then confirmatory assessments by developmental pediatricians, clinical psychologists, or speech therapists for autism should be performed.

The 7 key discriminating questions (questions 13, 5, 7, 23, 9, 15, and 2) that we identified were the same as the discriminating questions in the original CHAT and the M-CHAT, except for question 23 (Appendix 1). Question 23 in the M-CHAT addresses the area of social referencing (does your child look at your face to check your reaction when faced with something unfamiliar?). This question was not available in the original CHAT (with only the first 9 questions) and it was not analyzed in the M-CHAT because not all of the subjects answered this question. In our analysis, we found that social referencing was an important area in which most children with autism failed. We propose that question 23 is a very important question that should be retested if the M-CHAT tool is used by other researchers.

Part B of CHAT-23

Part B of CHAT-23 is an observational checklist with which trained assessors can evaluate the actual behavior of the child. There were 4 discriminating items (item B1: during the appointment, has the child made eye contact with you? item B2: does the child look across to see what you are pointing at? item B3: does the child pretend to pour out tea, drink it, etc? item B4: does the child point with his/her index finger at the light?), which tested 4 different areas of child development, ie, eye contact, gaze monitoring, pretend play, and protodeclarative pointing.

The percentages of failure on the 4 items were significantly higher in group 1 (autism) than group 2 (nonautism). Although, as seen in Table 6, only 26.4% of children in group 1 failed in the assessment of pretend play (does the child pretend to pour out tea, drink it, etc?), younger autistic children who did not receive any training in special nurseries also failed this question. Therefore, when this observational question is used as a screening tool for children ∼18 months of age in the population, it should be able to identify children with autism. The failure rate for item B3 (pretend play) was lower (26.4%) than that for item B1 (74.7%) or item B4 (80.5%). We found in our field trial that older autistic children had been well trained by the special child care staff to use the miniature tea set to pour a cup of tea; therefore, the failure rate for the important observational item B3 might have been underestimated, because younger autistic children who were not trained in this symbolic skill usually failed item B3.

The sensitivity and specificity of using the criterion of failing any 2 of 4 items were 0.736 and 0.912, respectively (Table 7). The PPV was also reasonably high. Therefore, failing any 2 items in part B (except item B5) would indicate possible autism. We should emphasize that this observational part of the CHAT should be performed by trained staff members who have experience with autistic children, to achieve a higher detection rate.

Cultural Differences

The medical screening system in the United Kingdom is different from that in the United States. In Hong Kong, we have adopted a United Kingdom-like medical system. However, in the United Kingdom, health visitors make home visits as part of the screening program for children. In Hong Kong, the majority (99%) of children undergo physical and developmental screening in maternal child health centers when they receive vaccinations, at 3, 6, 9, 12, 18, and 60 months.24 To screen for autism, it would be easier if the mothers were given this questionnaire before the 18-month follow-up visit; children who are identified by meeting either of the criteria described above should undergo direct observational assessment with part B of CHAT-23, in the second-stage screening procedure. We think this is a feasible and very cost-effective method, because trained nurses usually treat ∼50 children in a 3-hour morning or afternoon session locally.

Whether this approach is feasible in the United States depends on the local medical system and screening procedures. Although Robins et al15 proposed that 23 questions in the M-CHAT had high sensitivity, we found that parents usually were not reliable in completing checklist forms of questionnaires. Therefore, for any culture, and particularly given the general increase in the incidence of autism throughout the world, the most cost-effective way for those in developing countries to detect autism at earlier ages would be to use both part A and part B, as in the United Kingdom model. Integration of the United Kingdom (CHAT) and United States (M-CHAT) forms would be best for early detection with a 2-stage screening process.

Limitations

Our study design might have increased the chances of identifying autism, because 41% of our cohort were autistic children. The PPV was high in this study. However, the generalizability of the results to the general population is still unknown, because the PPV varies with the prevalence of disease. A lower prevalence in the general population would yield a lower PPV. A low PPV means that a larger proportion of positive screening results are false positives. More resources are used, because both true positives and false positives must be evaluated. However, retesting, at a different time, of children who screened positive with CHAT-23 can increase the PPV, with some false-positive results being eliminated. Specificity also increases, although the sensitivity is lower.
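The dependence of the PPV on prevalence can be made concrete with the standard relationship among sensitivity, specificity, and prevalence. The sketch below uses the part B values reported above (sensitivity of 0.736 and specificity of 0.912) and the 41% proportion of autistic children in this cohort; the 1% prevalence is a purely hypothetical figure chosen for illustration and is not an estimate for any population. The function name `ppv` is an assumption of this example.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from sensitivity, specificity, and prevalence."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


SENS_B, SPEC_B = 0.736, 0.912  # part B criterion (failing any 2 of 4 items)

cohort_ppv = ppv(SENS_B, SPEC_B, 0.41)        # proportion of autistic children in this cohort
hypothetical_ppv = ppv(SENS_B, SPEC_B, 0.01)  # hypothetical 1% prevalence, for illustration only

print(round(cohort_ppv, 3))        # 0.853, matching the PPV reported for part B
print(round(hypothetical_ppv, 3))  # 0.078
```

With the same sensitivity and specificity, the PPV falls sharply as prevalence decreases, which is why the PPV observed in this enriched cohort cannot be carried over directly to population screening.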

The age of the autistic subjects was higher than that of subjects in the nonautistic group. This was because autism was often diagnosed after 36 months of age in Hong Kong. There were not many autistic children with chronologic ages of ∼24 months. Therefore, we used a stringent criterion of mental age of 18 to 24 months as an inclusion criterion for children from both groups. This range represented the time at which most autistic children exhibited their autistic symptoms. It was difficult to recruit autistic children within the age range of 18 to 24 months. Our pilot autism database of 800 cases collected in the past 20 years showed that autism was often diagnosed at 3 years of age.25,26

CONCLUSIONS

We found that a 2-stage screening program for autism could offer a cost-effective method for early detection of autism at 18 to 24 months of age. For CHAT-23, which uses both a parental questionnaire and direct observation, the criterion of failing any 2 of 7 key questions yielded the highest sensitivity but a relatively lower specificity, whereas use of part B yielded the highest specificity but a lower sensitivity. We therefore recommend identifying possible positive cases with part A (parental questionnaire) and then proceeding to part B (observation) with trained assessors.

Our study suggested that the CHAT-23 is able to distinguish between children with and without autism, with mental ages of 18 to 24 months. Either part A or part B could be used to distinguish between children with autism and those without autism. The parental questionnaire would be the first choice for screening, because it is simple and easy to administer. Two criteria were identified. Children failing any 2 of 7 key discriminating questions or any 6 of 23 questions in the parental questionnaire should be suspected of having autism. Direct observation by trained primary health care workers should be used as part of the screening process. Failing any 2 of 4 items in part B of the CHAT-23 should suggest autism.

The high sensitivity and specificity of the criteria noted in our study suggested that CHAT-23 may be used to identify children with autism. Epidemiologic studies are underway in other parts of China, to validate this new screening tool for autism among Chinese children.

Acknowledgments

We thank the following groups: medical students at the University of Hong Kong: Trevor Yick-Cheung Au-Yeung, Elaine Yuen-Ling Au, Mandy Man-Yee Chu, Joseph Chun-Kit Chung, Ping-Keung Law, Maggie Kam-Ming Ma, Pui-Yi Siu, Winnie Wing-Man Sy, and Sunny Kin-Sun Tse; Department of Pediatrics and Adolescent Medicine, Information Technology Department (Wilfred Wong); normal nurseries: Cheerland Day Nursery and Day Centre, HKSPC Chan Kwan Biu Memorial Foundation Day Centre, Esther Lee Day Centre, HKSPC Portland Street Day Centre, SKH St Thomas Day Center, and St James Settlement Kathleen McDonall Child Care Centre; Heep Hong Society, Hong Kong (Nancy Tsang, Peter Au-Yeung, Monica Yau, and Rachel Leung); early education and training centers: Jessie and Thomas Tam Centre, Jockey Club Centre, Kwok Yip Lin Houn Centre, Leung King Centre, Pak Tin Centre, and Shun Lee Centre; special child care centers: Catherine Lo Centre, Mary Wong Centre, Shui Pin Wai Centre, Tin Ping Centre, Wan Chai Centre, and Wan Tsui Centre. We also thank the parents and children for their full support.

Footnotes

  • Received March 11, 2004.
  • Accepted March 12, 2004.
  • Address correspondence to Virginia Wong, FRCP, FHKAM, FHKC(Paed), FRCPCH, Division of Neurodevelopmental Paediatrics, Department of Paediatrics and Adolescent Medicine, University of Hong Kong, Hong Kong. E-mail: vcnwong{at}hkucc.hku.hk
  • This work was presented at the following meetings: Joint Congress of the 9th International Child Neurology Congress and the 7th Asian and Oceanian Congress of Child Neurology, Satellite Symposium on Autism/Neuromuscular Disorders (September 18–19, 2002, Hong Kong), and Child Neurology Symposium (January 17, 2004, Macau).

CHAT, Checklist for Autism in Toddlers; M-CHAT, Modified Checklist for Autism in Toddlers; GMDS, Griffiths’ Mental Developmental Scale; ROC, receiver operating characteristic; SPT, Symbolic Play Test; PPV, positive predictive value

REFERENCES