PEDIATRICS Vol. 119 No. 6 June 2007, pp. 1083-1088 (doi:10.1542/peds.2006-2330)
ARTICLE
Statistical Literacy for Readers of Pediatrics: A Moving Target
a Department of Pediatrics
b Department of Public Health Sciences, University of Virginia, Charlottesville, Virginia
| ABSTRACT |
|---|
OBJECTIVE. Pediatric residents are expected to study research design and statistical methods to enable them to critically appraise the pediatric literature and apply the findings to patient care. However, it is not clear how best to teach these skills or even which statistical concepts are most important. An earlier study demonstrated that the statistical complexity of articles published in Pediatrics increased from 1952 to 1982. The goals of our study were to assess whether this trend has continued and to determine the statistical measures and procedures most commonly encountered in Pediatrics.
METHODS. We reviewed the print research articles published in Pediatrics, volume 115, 2005, and recorded the statistical measures and procedures reported in each article to determine how many articles used statistics or statistical procedures and what statistical procedures were encountered most commonly.
RESULTS. The proportion of articles that used any inferential statistics increased from 48% in 1982 to 89% in 2005. The mean number of inferential procedures per article increased from 2.5 in 1982 to 3.9 in 2005. The most commonly encountered statistical procedures or measures were descriptive statistics, tests of proportions, measures of risk, logistic regression, t tests, nonparametric tests, analysis of variance, multiple linear regression, sample size and power calculation, and tests of correlation. However, a reader who is familiar with only these concepts can understand the analyses used in only 47% of articles.
CONCLUSIONS. Our results confirm a trend toward the use of new and increasingly complex statistical techniques in Pediatrics. Educational efforts might most profitably focus on the principles underlying statistical analysis rather than on specific statistical tests. Authors, reviewers, and journal editors have a greater responsibility for ensuring that statistical procedures are used appropriately, as it may be increasingly unrealistic to expect readers to fully understand the statistical analyses used in journal articles.
Key Words: medical education, statistics, publishing
Lifelong learning as a physician demands facility in the assessment and application of clinical evidence from the medical literature. Appraisal of an article's methodologic rigor depends on an understanding of the study design and analysis used by the authors. The importance of these skills in medical training is broadly recognized. The Accreditation Council for Graduate Medical Education and the American Board of Pediatrics mandate that, to attain competency in practice-based learning and improvement, residents are expected to "apply knowledge of study designs and statistical methods to the appraisal of clinical studies" and to "appraise and assimilate evidence from scientific studies related to their patients' health problems."1,2 Optimally, graduates of pediatric residency programs will have a strong enough working knowledge of statistics to be able to evaluate the most important analyses in most of the medical literature they read. Unfortunately, graduates of pediatric residency programs nationwide often report that they receive little to no formal training in epidemiology and biostatistics, and they give only "fair to poor" marks for their knowledge of research design and statistical analysis.3
It is not clear how best to teach critical appraisal skills during pediatric residency training. Study design and biostatistics are often taught in the context of a journal club or an evidence-based medicine conference,4 but the success of such efforts in imparting these concepts has not been well evaluated.5 Time is limited, especially with new resident duty-hour restrictions, and the needs of the learners may vary widely. Medical students frequently have poor skills in basic mathematics, and they often have difficulty interpreting medical data.6 Residents, researchers, and practicing physicians may perform no better.7,8
It is also not clear which statistical concepts are most necessary and useful for readers to become familiar with. There has been a well-documented trend toward the use of new and increasingly complex statistical techniques in published articles.9 Use of these more sophisticated techniques can potentially allow more thorough analysis of study data by, for example, enabling complex modeling with multiple comparisons or multiple variables. But such advances have made it more and more difficult for readers to understand the study analyses. An earlier study demonstrated that a reader of Pediatrics who understood descriptive statistics (for example, means and standard deviations) and 3 inferential statistical procedures (Student's t test, χ², and Pearson's r) could understand the statistical analysis in 97% of research articles published in 1952, but only 49% of articles in 1982.10 The goals of this study were to determine (1) whether this trend has continued and the proportion of articles that a reader can understand with only these few basic concepts has declined further, and (2) the statistical measures and procedures most commonly encountered in Pediatrics. These concepts could then potentially be used in planning a curriculum for pediatric residents and other readers wishing to improve their skills in critical appraisal of published research.
| METHODS |
|---|
We reviewed the 171 print articles published in the Articles, Special Articles, and Review Articles sections of Pediatrics, volume 115 (January to June), 2005. To allow comparison with the results of the previous study, a single volume of Pediatrics was reviewed, exclusive of articles published online only. It was expected that exclusion of the online-only articles would not bias our results, because articles published online only are subject to the same peer-review process and selection criteria as the print Pediatrics.11 Volume 115 was chosen as the most recent complete volume at the time of initiation of our review, without any foreknowledge of the statistical content of the articles in this volume. Two of us (Drs Hellems and Hayden) jointly read the articles and recorded the statistical measures and procedures used in each article. Statistical measures and procedures were usually reported in the text of the methods and/or results sections, but were not infrequently found only in the tables, figures, or elsewhere. A statistician (Dr Gurka) helped to classify procedures with which the reviewers were unfamiliar. The tabulated measures and procedures were then categorized to determine (1) how many articles used statistics or statistical procedures, (2) how many articles had a statistical analysis that could be evaluated by a reader with an understanding only of descriptive statistics and 3 inferential statistical procedures (Student's t test, χ², and Pearson's r), and (3) which statistical procedures were encountered most frequently. The 3 inferential procedures chosen for the second analysis do not necessarily represent the most important statistics but were selected to allow comparisons with the previous study.
| RESULTS |
|---|
A total of 171 articles were reviewed (Table 1). Only 1 article, a review published as a Special Article, used no statistics or statistical procedures. The proportion of articles that used only descriptive statistics declined from 23% in 1982 to 10% in 2005. Only 18% of articles in 2005 used no statistics or only descriptive statistics, Student's t test, χ², and/or Pearson's r, compared with 65% in 1982. If Review Articles and Special Articles are excluded from consideration, 9% of articles use only descriptive statistics, Student's t test, χ², and/or Pearson's r. The proportion of articles that used any inferential statistical procedure, with or without descriptive statistics, increased from 48% in 1982 to 89% in 2005. The mean number of inferential procedures per article increased from 2.5 in 1982 to 3.9 in 2005.
Table 2 lists the statistical measures or procedures encountered in ≥10% of the articles reviewed. A reader who understands all of these "top 10" topics can potentially understand the analyses used in only 47% of the 171 articles. Table 3 lists the procedures or measures that were encountered in 5% to 9% of articles. A reader who understands the concepts in Tables 2 and 3 can potentially understand the analyses used in only 70% of the 171 articles. Many other statistical techniques were encountered (Table 4).
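The coverage figures above rest on a simple tabulation: an article counts as understandable only if every procedure recorded for it is among those the reader knows. The following Python sketch illustrates that calculation under stated assumptions; the article identifiers, the procedure lists, and the function names `procedure_frequencies` and `coverage` are hypothetical and are not the study data or the authors' actual tooling.

```python
from collections import Counter


def procedure_frequencies(articles):
    """Fraction of articles in which each procedure appears at least once."""
    counts = Counter()
    for procedures in articles.values():
        counts.update(set(procedures))  # count each procedure once per article
    n = len(articles)
    return {name: c / n for name, c in counts.items()}


def coverage(articles, known):
    """Fraction of articles whose every recorded procedure is in `known`."""
    known = set(known)
    understood = sum(1 for procs in articles.values() if set(procs) <= known)
    return understood / len(articles)


# Hypothetical tabulation (NOT the study data): article id -> procedures.
articles = {
    "A1": ["descriptive statistics", "t test"],
    "A2": ["descriptive statistics", "logistic regression", "Cox regression"],
    "A3": ["descriptive statistics"],
    "A4": ["descriptive statistics", "t test", "logistic regression"],
}

freqs = procedure_frequencies(articles)
# Toy cutoff of 50% for this tiny example; the tables in the text use 10% and 5%.
common = {name for name, f in freqs.items() if f >= 0.5}
print(sorted(common))
print(coverage(articles, common))
```

A single unfamiliar procedure is enough to exclude an article, which is why coverage can remain low even when the reader knows every commonly used technique.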
Multivariable modeling techniques were encountered frequently. The most commonly used multivariable technique was logistic regression. Simple linear regression was not commonly used; in most instances, it was used to assess the impact of single independent variables before conducting the main multivariable analysis. When modeling was used, methods such as variable selection, imputation, model validation, and techniques for adjustment and standardization were often reported.
Nonparametric techniques were used in 24% of articles, most commonly the Mann-Whitney U test and the Kruskal-Wallis test. To add to potential confusion, multiple names were used for the Mann-Whitney U test, including Wilcoxon rank sum, Mann-Whitney U, Mann-Whitney rank sum, and Wilcoxon Mann-Whitney test for ordered categories.
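The different names refer to the same procedure because the Wilcoxon rank sum W and the Mann-Whitney U statistic differ only by a constant that depends on the sample size: U = W − n1(n1 + 1)/2, where n1 is the size of the first group. A minimal Python sketch, using hypothetical measurements and illustrative function names, checks this identity directly.

```python
def rank_with_ties(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def wilcoxon_rank_sum(x, y):
    """W: sum of the ranks of group x within the pooled sample."""
    ranks = rank_with_ties(list(x) + list(y))
    return sum(ranks[:len(x)])


def mann_whitney_u(x, y):
    """U: number of (x, y) pairs with x > y, counting ties as one half."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


# Hypothetical measurements for two groups (illustration only).
group_a = [3.1, 4.7, 5.0, 6.2]
group_b = [2.0, 3.1, 3.9]

w = wilcoxon_rank_sum(group_a, group_b)
u = mann_whitney_u(group_a, group_b)
n1 = len(group_a)
print(w, u, w - n1 * (n1 + 1) / 2)
```

Because the two statistics carry the same information, a report of either name describes the same comparison of two independent groups.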
Statistical methods were not always explained or even mentioned in the methods section of the articles, but were often buried in the text of the results section or listed only as footnotes to tables. In several instances, no statistical procedure was specified, but the presence of a P value indicated that a test had been performed. In most of these cases, it was possible to make an educated guess about what sort of procedure had been performed (eg, a test to compare proportions), but it was not possible to determine which specific test had been used (eg, χ² or Fisher's exact test). In these cases, only the more general classification was tabulated.
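The choice between these 2 tests of proportions can matter when cell counts are small. As an illustration only, the Python sketch below applies an uncorrected Pearson χ² test and a two-sided Fisher's exact test to the same hypothetical 2 × 2 table; the counts and function names are assumptions and do not come from any reviewed article.

```python
from math import comb, erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    return stat, erfc(sqrt(stat / 2))


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test: sum the probabilities of all tables
    with the same margins that are no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):  # hypergeometric probability of x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance so equally likely tables are included.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical table: 8 of 10 events in one group versus 3 of 10 in the other.
stat, p_chi = chi_square_2x2(8, 2, 3, 7)
p_fisher = fisher_exact_2x2(8, 2, 3, 7)
print(round(stat, 3), round(p_chi, 4), round(p_fisher, 4))
```

For this table the χ² P value falls below .05 whereas the Fisher's exact P value does not, so a reader who cannot tell which test produced a reported P value cannot fully judge the result.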
| DISCUSSION |
|---|
This analysis of articles recently published in Pediatrics documents that the use and complexity of statistical analysis have increased over the past 50 years. The 1983 study concluded, "To understand the statistical aspects of current articles requires familiarity with a broader range of sophisticated statistical techniques than was necessary just a few years ago."10 Our study documents that this trend has continued, and that the challenge of understanding the statistical analyses in published articles has only become greater. Virtually all articles now include some statistical measures or techniques, and most use inferential statistical procedures. The number of statistical measures and procedures used in articles has increased; a reader familiar only with the 10 most common statistical concepts will encounter unfamiliar procedures in 53% of articles in Pediatrics. The breadth of techniques found in this 1 volume of Pediatrics is remarkable. (Parenthetically, each of this study's authors, 1 of whom is a biostatistician, encountered tests with which he or she was unfamiliar, and an anonymous reviewer commented, "I have never heard of some of these and I teach this stuff!") In addition, the types of analysis are changing. For example, although many biostatistics courses teach linear regression, logistic regression is much more commonly encountered in Pediatrics. This increasing complexity and "moving target" complicate efforts to determine a workable statistics curriculum for residents and practicing pediatricians.
Reasons for this increase in statistical complexity may include the development of new study designs and statistical techniques, and also the broad availability of expanded computing power.12 Perhaps this increased complexity of statistical analysis should be expected given the increasing complexity of the world in general, and of scientific domains in particular. Taken in this larger context, the statistical complexity is perhaps better understood, but it nevertheless may remain troublesome and baffling to readers.
A reader may understand a research article at several levels. He or she may understand the statistical tests and procedures well enough to assess whether they were appropriate to the study and conducted correctly, or he or she may be able to interpret results reported as descriptive statistics, measures of effect size, or P values without understanding the statistical procedures used. The latter reader may still find an article to be valuable.
If, however, one assumes that a general reader should be able to understand the statistical procedures and measures in most published articles, there are a few possible courses of action. One option might be for journals to require that statistical methods be kept relatively simple and that any unusual or complex procedures be explained thoroughly. In this context, "unusual" could be defined mathematically, for example, as a test appearing in <5% of articles. Such a requirement might, however, "dumb down" the techniques used, result in suboptimal analysis of study data, and increase the length of methods sections that few would ever read, let alone comprehend.
The optimal way to report statistical methods no doubt depends on the article's anticipated audience. Unfortunately there may be many audiences (or a continuum of audiences) based on readers' levels of interest in the clinical topic and expertise in research design and analysis. For example, clinicians may have better understood the results of 1 reviewed study because it included helpful background information about the statistical model, as follows:
The Cox regression technique takes account of variable length of follow-up monitoring, including the possibility of "censoring" (no event when last observed but future events are not ruled out), and produces an estimate of the relative likelihood of the event during any small time interval ("hazard ratio"), as affected by specified risk factors. Like the conventional techniques of multiple linear and logistic regression, Cox regression can assess the independent effect of each risk factor while controlling simultaneously for other factors.13
This same information, however, may have been boring and superfluous for a reader with substantial statistical expertise. In contrast, statistically savvy readers may appreciate having substantial detail of a mathematical model, whereas most clinicians are unlikely to delve into a discussion of 6 different methods used to impute missing study data included in another reviewed study.14 Perhaps ideally, articles will include a brief overview of the statistical methods used, as well as significant detail (perhaps in an appendix) for statistical reviewers and any interested readers. In the instance of printed articles, additional information can be made available on request. For articles published electronically, readers who desire more information about the statistical technique or model could perhaps click on a link to access that material.
A second option would be to provide readers with more intensive training in statistical methods. Given current duty-hour restrictions for residents, however, finding more time to teach this material during residency will be difficult. Likewise, educational sessions on biostatistics at continuing medical education meetings are not likely to attract large audiences if they are competing with clinical updates or sessions on such practical issues as new vaccines or office management. Placing greater emphasis on teaching biostatistics to medical students is a possibility, but the practical value of this information may be less clear and, therefore, less interesting to students at this earlier stage of training.
A third option is to concede that many readers will never be motivated and/or able to understand the statistical analysis of most published articles. In past years when the variety of statistical techniques encountered was narrower than today, motivated physician readers could develop a rudimentary understanding of the techniques they were likely to encounter in published articles. Now that the range of techniques encountered has broadened so widely, the expectations may need to change. The purely statistical aspects of biomedical research are certainly not as important and as crucial to good science as is sound research design with attention to potential sources of bias, choice of appropriate controls, and types of outcomes chosen. Educational efforts focusing on principles of study design and potential biases might aid the clinician reader regardless of complexity of statistical analysis. A complementary approach is for clinicians to become "information masters" who efficiently use the medical literature, including secondary sources such as the Cochrane Database of Systematic Reviews, as well as assessments of the strength of research evidence, such as the strength of recommendation taxonomy.15,16 For most readers, understanding the "what" and "why" of the research is more important than understanding the "how" of the analysis.
Readers who do not understand the statistical measures and analysis used in an article have several options. Because ignorance often breeds mistrust, readers may tend to reject an unfamiliar analysis and discount an article's results, but this might well result in dismissing an important research finding. Consulting a statistician for assistance may be helpful, but this is impractical for most readers. Reading an expert review of the article may be helpful, if one has been written. Trusting a study's authors and the journal's peer-review process to assure that the statistical analysis is appropriate and correct is another possibility, but journal editors may not conduct statistical reviews of submitted manuscripts,17 and statistical errors have been detected commonly in published articles.18–20
Including a biostatistician among the authors of an article probably increases the possibility that an "unfamiliar" statistical test is used, but may well also increase the likelihood that the analysis is thoughtful and appropriate.21 Including a statistician on editorial boards and having articles refereed by a statistician may make it "safer" for statistically naïve readers to believe what they read.
This study has several limitations. First, only 1 volume of 1 journal was reviewed, and we excluded the electronic pages; thus, the findings may not be generalizable to other journals. For example, a review published in 2003 of 6 journals in 3 nonpediatric subspecialties revealed that a reader could understand 70% of articles with 3 basic concepts: descriptive statistics, χ²/Fisher's exact test, and Student's t test.22 Pediatrics was reviewed for this study to allow comparisons with the earlier article.10 The study results may still be broadly applicable because, as the official journal of the American Academy of Pediatrics, Pediatrics has a large circulation and high impact factor, and publishes many articles of interest to both clinicians and researchers. A second limitation is that some statistical procedures actually used in the reviewed articles may have been missed in our review. In that case, our findings can only underestimate the frequency and complexity of statistical procedures that a reader might encounter. Third, no attempt was made to assess the appropriateness or accuracy of the statistical measures and techniques used in each article. Finally, our classification of the statistical measures and procedures represents just 1 possible categorization. The concepts might be grouped in different ways.
| CONCLUSIONS |
|---|
This study demonstrates the increasing complexity of statistical analyses encountered in Pediatrics. Goals for education of pediatric residents and other readers may need to be reassessed to emphasize the understanding of principles of statistical inference rather than the statistical procedures themselves. Authors, reviewers, and journal editors have a greater responsibility for ensuring that statistical procedures are used appropriately, because it may be increasingly unrealistic to expect readers to fully understand the statistical analyses used in journal articles.
| ACKNOWLEDGMENTS |
|---|
We thank Dr Michael S. Kramer for thoughtful comments and suggestions.
| FOOTNOTES |
|---|
Accepted Jan 30, 2007.
Address correspondence to Martha A. Hellems, MD, MS, Department of Pediatrics, University of Virginia, Box 800386, Charlottesville, VA 22908-0386. E-mail: mab4c{at}virginia.edu
The authors have indicated they have no financial relationships relevant to this article to disclose.
| REFERENCES |
|---|
- Accreditation Council for Graduate Medical Education. ACGME Outcome Project. Available at: www.acgme.org/outcome/comp/compFull.asp#2. Accessed February 6, 2006
- American Board of Pediatrics. General Competencies. Available at: https://www.abp.org/ABPWebSite. Accessed February 6, 2006
- Cull WL, Yudkowsky BK, Schonfeld DJ, Berkowitz CD, Pan RJ. Research exposure during pediatric residency: influence on career expectations. J Pediatr. 2003;143:564–569
- Edwards KS, Woolf PK, Hetzler T. Pediatric residents as learners and teachers of evidence-based medicine. Acad Med. 2002;77:748
- Hatala R, Guyatt G. Evaluating the teaching of evidence-based medicine [published correction appears in JAMA. 2002;288:2268]. JAMA. 2002;288:1110–1112
- Sheridan SL, Pignone M. Numeracy and the medical student's ability to interpret data. Eff Clin Pract. 2002;5:35–40
- Berwick DM, Fineberg HV, Weinstein MC. When doctors meet numbers. Am J Med. 1981;71:991–998
- Friedman SB, Phillips S. What's the difference? Pediatric residents and their inaccurate concepts regarding statistics. Pediatrics. 1981;68:644–646
- Horton NJ, Switzer SS. Statistical methods in the journal. N Engl J Med. 2005;353:1977–1979
- Hayden GF. Biostatistical trends in pediatrics: implications for the future. Pediatrics. 1983;72:84–87
- Anderson K, Lucey JF. Pediatrics electronic pages: looking back and looking ahead. Pediatrics. 1998;102:124–128
- Altman DG. Statistics in medical journals: some recent trends. Stat Med. 2000;19:3275–3289
- Fligor BJ, Neault MW, Mullen CH, Feldman HA, Jones DT. Factors associated with sensorineural hearing loss among survivors of extracorporeal membrane oxygenation therapy. Pediatrics. 2005;115:1519–1528
- Hollis JF, Polen MR, Whitlock EP, et al. Teen reach: outcomes from a randomized, controlled trial of a tobacco reduction program for teens seen in primary medical care. Pediatrics. 2005;115:981–989
- Ebell MH, Siwek J, Weiss BD, et al. Strength of recommendation taxonomy (SORT): a patient-centered approach to grading evidence in the medical literature. J Am Board Fam Pract. 2004;17:59–67
- Slawson DC, Shaughnessy AF. Teaching evidence-based medicine: should we be teaching information management instead? Acad Med. 2005;80:685–689
- Goodman SN, Altman DG, George SL. Statistical reviewing policies of medical journals: Caveat lector? J Gen Intern Med. 1998;13:753–756
- Statistically significant. Nat Med. 2005;11:1
- Garcia-Berthou E, Alcaraz C. Incongruence between test statistics and P values in medical papers. BMC Med Res Methodol. 2004;4:13
- Scales CD Jr, Norris RD, Peterson BL, Preminger GM, Dahm P. Clinical research and statistical methods in the urology literature. J Urol. 2005;174:1374–1379
- Altman DG, Goodman SN, Schroter S. How statistical expertise is used in medical research. JAMA. 2002;287:2817–2820
- Reed JF 3rd, Salen P, Bagher P. Methodological and statistical techniques: what do residents really need to know about statistics? J Med Syst. 2003;27:233–238
PEDIATRICS (ISSN 1098-4275). ©2007 by the American Academy of Pediatrics