Win Tin1 eloquently articulated in his editorial “Oxygen Therapy: 50 Years of Uncertainty” that neonatal care providers do not understand how best to use oxygen in the most vulnerable premature infants despite >50 years of oxygen therapy in neonatal medicine.1 We do not understand optimal oxygenation management in extremely low gestational age neonates (<28 weeks’ gestation), because we do not know the safe and effective upper and lower limits of oxygen levels or saturation ranges in both the early and later neonatal courses.1–7 The most powerful tool in clinical research, the randomized, controlled trial, has not been used to resolve this uncertainty since the early clinical trials of the 1950s.8–15 No randomized, controlled trial has clarified the relation between retinopathy of prematurity (ROP) and blood oxygen (Pao2), transcutaneous oxygen (tco2), or oxygen saturation (Spo2) levels. Furthermore, the effects of “higher” versus “lower” oxygen levels or saturation ranges on ROP, growth, brain, lung, and other organ systems have not been studied with respect to gestational age, time of onset or duration of specified oxygen level or saturation range, or method of oxygen termination. Because of the lack of definitive evidence on which to base policy, neonatal care providers differ widely and have reached no consensus in their policies, practices, and strongly held beliefs regarding oxygen management in both the early and later neonatal courses of premature infants.2,16–20 Thus, the study of oxygen therapy in the neonatal population at highest risk for oxygen-related morbidities is an extremely important and urgent issue.
We strongly agree with Tin1,2,16 and others18–22 that an adequately powered, large, randomized, controlled trial must be conducted to resolve the uncertainty and determine the impact of different ranges of oxygen levels or saturations, initiated early in the neonatal course, on ROP and other important outcomes such as mortality, long-term neurodevelopmental outcome, bronchopulmonary dysplasia, and growth. One of the most compelling arguments for a randomized trial is that continued treatment of millions of premature infants in ignorance of what is safe and effective oxygenation is not an option. The objectives of this commentary are to advocate for a definitive clinical trial, summarize the background and rationale for the trial, and emphasize important methodological issues that must be considered in such a trial.
The unrestricted use of oxygen proceeded largely without question until clinical trials published in the 1950s established an association between unrestricted, prolonged oxygen exposure and retrolental fibroplasia (or RLF, as ROP was known initially).8–15 A meta-analysis of 3 early randomized trials compared the effect of restricted versus unrestricted oxygen administration on RLF. This analysis revealed a significant reduction, but not complete elimination, in the occurrence of any RLF (event rate ratio: 0.34; 95% confidence interval: 0.25, 0.46) and of severe RLF (event rate ratio: 0.38; 95% confidence interval: 0.17, 0.85) in the restricted oxygen group.23 Two trials found a statistically nonsignificant increase in the risk of mortality.10,11,23 In a separate meta-analysis of the effects of lower versus higher oxygen concentrations on multiple outcomes in preterm infants during 5 early trials (1951–1969), Askie and Henderson-Smart24 found that the restriction of oxygen reduced the incidence and severity of RLF without increasing mortality. They calculated that one would need to treat only 3 infants with restricted oxygen to prevent one infant from having an adverse outcome of death or RLF. The drastic curtailment of oxygen administration in the 1950s, subsequent to the clinical trials, was associated with a dramatic reduction in retinopathy but also with a concomitant increase in death and cerebral palsy.24–27
These events in the 1950s provide important lessons in medical history regarding ROP. In these early clinical trials, some premature infants developed retinopathy in the restricted oxygen group, and the majority of premature infants in the unrestricted, prolonged oxygen group never developed RLF. One lesson, even from 50 years ago, is that oxygen is an important, but not a sufficient, single cause of ROP. Events of the 1950s also illustrate in hindsight the importance of conducting adequately powered, large, masked, randomized studies with long-term outcomes.
Over the course of the 1970s and 1980s, the technical means to assess an infant’s oxygenation status, either intermittently or continuously, evolved. These included measurement of oxygen tension in arterial blood gases or by tco2 monitoring and estimation of hemoglobin oxygen saturation by pulse oximetry.23 One trial demonstrated no benefit of using intermittent arterial blood gases by umbilical arterial catheters in reducing ROP.13 Another study that evaluated continuous versus intermittent tco2 monitoring showed that continuous tco2 monitoring did not reduce ROP.28 A later analysis of the data from that study suggested that ROP occurred more often when tco2 was >80 mm Hg (10.7 kPa) in the first 4 weeks of life.29
Among 5 recent observational studies (2 published articles and 3 abstracts), 4 provide evidence of less severe ROP, and 3 provide evidence of less chronic lung disease in nurseries that had policies of lower Spo2 ranges compared with higher Spo2 ranges.16,18,19,30,31 The Spo2 ranges evaluated differed among the 5 studies. Two of the 5 cohort studies suggest that a lower versus higher Spo2 range (Spo2 ∼80 <90% vs >90%) early in the neonatal course can reduce the induction of severe ROP without increasing mortality or cerebral palsy.16,30 Sun18 analyzed data from the Vermont-Oxford Network on infants with birth weights of 500 to 1000 g to explore a possible association between choice of target Spo2 levels and rates of chronic lung disease, severe ROP, and ROP surgery. Sun found significantly less chronic lung disease, less stage 3 ROP, less need for ROP surgery, and slightly less mortality (although not statistically significantly different) among nurseries that maintained maximum Spo2 ≤95% vs >95%.18 A recent national survey of pulse oximetry before and after 2 weeks of life found significantly less retinal ablative surgery in neonatal intensive care units with policies of maximum Spo2 ≤98% vs >98% in the first 2 weeks of life. There was also less stage 3 ROP and less need for retinal ablative surgery in nurseries that had maximum Spo2 ≤92% vs >92% after the first 2 weeks of life.19 Only one observational study suggests that a lower Spo2 range is associated with increased ROP greater than stage 2, but no increase in surgically treated ROP.31 These cohort studies illustrate the ongoing uncertainty about oxygen therapy in premature infants and underscore the importance of conducting a randomized, controlled trial of different Spo2 ranges.
The findings of these cohort studies justify testing the hypothesis that a strategy of maintaining a functional Spo2 level in a “lower” versus “higher” range early in the course of extremely low gestational age neonates reduces the incidence of severe ROP without increasing important adverse neonatal outcomes. We plan to test this hypothesis through an international, multicenter, masked, clinical trial in which extremely low gestational age neonates (<28 weeks’ gestation) will be randomly assigned to 1 of 2 scientifically and clinically acceptable pulse oximetry saturation ranges such as 85% to 89% vs 91% to 95% (functional saturation). Acceptability of these ranges would be confirmed additionally through surveys. Randomly assigned intervention would occur shortly after birth and continue through the first several weeks. Tin and Wariyar2 expanded the background and clearly articulated the justification for such a trial in a separate recent publication.
METHODOLOGICAL IMPLICATIONS FOR A TRIAL OF OXYGEN THERAPY
Sufficiently Powered, Randomized Trial
This important research hypothesis can be tested only by using a sufficiently powered, randomized trial that ensures long-term follow-up. The randomized trial is widely accepted as the best way to minimize systematic bias. Too often, however, unreliable or incorrect answers are generated by randomized trials that have insufficient power to detect clinically important, small to moderate effects.32 Sufficient power to detect clinically important, small to moderate effects, in relatively uncommon outcomes such as severe ROP and death, beyond a reasonable doubt may require surprisingly large numbers. Two examples illustrate this. Oral aspirin therapy in myocardial infarction was not widely accepted until after the Second International Study of Infarct Survival in 1988, which enrolled >17 000 patients33 and confirmed a highly significant 23% reduction in mortality. This finding occurred 14 years after the first trial and after 6 trials showed statistically insignificant reductions (between 10% and 30%) in mortality.34 It took 20 years, 15 trials, and >3500 infants before it became accepted that antenatal steroids reduced respiratory distress syndrome and intraventricular hemorrhage by 50% and neonatal mortality by 40%.35,36 Medical research, and specifically neonatal research, needs to find ways of greatly increasing the size of randomized studies. Otherwise moderate but worthwhile benefits may be missed.37
Several hundred patients (15–25 centers) may be sufficient to demonstrate important differences in severe ROP. However, a much larger sample (and many more collaborators) will be needed to exclude smaller, important differences in outcomes such as mortality and disability to adequately address real concerns about the safety of lower oxygen tensions. For example, a 5% difference in an outcome of death or cerebral palsy is “small” but would have major implications for public health. Preliminary calculations suggest that the trial may require a sample size between 2000 and 4000 extremely low gestational age infants (born at <28 weeks’ gestation) to answer these important questions. Participation of centers that undertake long-term follow-up in >90% of their survivors will be necessary.
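To make the scale of such a requirement concrete, the following is a minimal sketch of the standard normal-approximation sample size formula for comparing two proportions. It is not the planning group's actual calculation: the 40% baseline rate of death or cerebral palsy, the two-sided significance level of 0.05, and the power values are all assumptions chosen only for illustration, and the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def total_sample_size(p_control, p_treatment, alpha=0.05, power=0.90):
    """Approximate total enrollment (two equal arms) needed to detect a
    difference between two event rates with a two-sided test.

    Uses the usual normal-approximation formula for two independent
    proportions. All default values are illustrative assumptions.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    n_per_arm = (z_alpha + z_beta) ** 2 * variance / (p_control - p_treatment) ** 2
    return 2 * ceil(n_per_arm)


if __name__ == "__main__":
    # Hypothetical scenario: death or cerebral palsy in 40% vs 45% of infants,
    # that is, the "small" 5% absolute difference discussed in the text.
    for power in (0.80, 0.90):
        print(f"power={power:.2f}: total N = {total_sample_size(0.40, 0.45, power=power)}")
```

Under these assumed inputs the totals fall in the low thousands, the same order of magnitude as the preliminary range quoted above. A different baseline rate or a smaller difference would change the answer substantially, which is why the inputs must be treated as placeholders.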
Thus, the most expedient, ethical, scientifically rigorous way to resolve the uncertainty of oxygen therapy in extremely low gestational age neonates is to conduct a large, multicenter, randomized, masked trial. International collaboration will certainly be needed to ensure timely recruitment of sufficient numbers of extremely premature infants. Furthermore, international collaboration will permit more robust generalizability of the results. The results of an international trial are also more likely to gain broad clinical acceptance, maximizing the benefit derived from what will inevitably be a major investment of research money. It is unlikely that funding agencies would repeatedly fund trials of the necessary magnitude. Therefore, if the trial is to be definitive, it must be rigorous and as complete as possible the first time.
The intervention will be different pulse oximetry targets such as 85% to 89% vs 91% to 95%. Masking of oximeters, as was done for the Australian Benefits of Oxygen Saturation Targeting trial,7 is essential to minimize co-intervention and contamination by bias of neonatal care providers. Masking of the pulse oximeters can be accomplished by offsetting the Spo2 readings by ±3% such that each study group (85–89% vs 91–95%) displays the same Spo2 range of 88% to 92%. Actual Spo2 values would appear for Spo2 <85% and >95%. Establishment and maintenance of equipoise throughout the intervention and assessments are imperative, because we do not yet know if potential clinically important reductions in retinopathy may offset increases in other potentially competing outcomes such as mortality or neurodevelopmental/neurosensory disability.
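The offset scheme described above can be sketched in a few lines. This is an illustration only, not the algorithm of any actual study oximeter: the group labels, the function name, and the hard switch to true values at the 85% and 95% boundaries are assumptions made for the example.

```python
# Illustrative sketch of masked oximeter display logic.
# Group labels, names, and the abrupt cut-offs are assumptions.

OFFSETS = {
    "lower": +3,   # true target 85-89% is displayed as 88-92%
    "higher": -3,  # true target 91-95% is displayed as 88-92%
}
UNMASKED_BELOW = 85  # true values under this are shown as-is
UNMASKED_ABOVE = 95  # true values over this are shown as-is


def displayed_spo2(true_spo2, group):
    """Return the SpO2 value shown to caregivers for a given study group."""
    if true_spo2 < UNMASKED_BELOW or true_spo2 > UNMASKED_ABOVE:
        return true_spo2  # outside the masked band: show the actual value
    return true_spo2 + OFFSETS[group]


if __name__ == "__main__":
    for group, target in (("lower", range(85, 90)), ("higher", range(91, 96))):
        shown = [displayed_spo2(s, group) for s in target]
        print(group, list(target), "->", shown)
```

In both groups, a caregiver who keeps the displayed value between 88% and 92% is unknowingly holding the infant in that group's true target range. The sketch ignores the jump in displayed values at the 85% and 95% boundaries, which a real device would have to handle.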
The trial will face at least 1 challenge in this regard. Some neonatal units regard Spo2 >90% as mandatory. Accepting uncertainty about this may be difficult. However, there are cohort data suggesting that lower levels of saturation can reduce retinopathy without increasing mortality or cerebral palsy.16,30 Creating an international climate of equipoise could be enhanced by surveys17–19 of potential study centers to identify local target ranges and establish current limits of collective uncertainty. The trial should compare target ranges for Spo2 within those limits of acceptable uncertainty.
It is essential, both ethically and scientifically, that the trial carefully select and define meaningful outcomes of neonatal intensive care related to oxygen deficit or toxicity. These outcomes include severe ROP, blindness, bronchopulmonary dysplasia, growth, death, and different types of major neurodevelopmental or neurosensory impairment beyond infancy.
Data Safety Monitoring Committee and Plan
It is also essential, both ethically and scientifically, to have an external monitoring committee to ensure that any major differences between the groups in outcomes such as death or severe ROP are detected during the recruitment phase. The Data Safety Monitoring Committee can reach appropriate decisions regarding study continuation or termination if its stopping rules are stringent and require evidence beyond reasonable doubt of net clinical benefit, net harm, or futility of finding a difference before it recommends trial termination.37 Evidence of net benefit or harm from one outcome should be considered in the context of other major outcomes. For example, it would be inappropriate to terminate recruitment because of a 3% reduction in severe ROP in the lower oxygen group before the trial had accumulated sufficient power to exclude a 6% increase in mortality or severe neurodevelopmental impairment in the same group. In this case, if the trial were terminated prematurely and lower oxygen became the clinical standard, for every infant whose sight was saved, 2 would die or survive with major disability.
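The arithmetic behind this example can be written out explicitly. The sketch below takes the 3% and 6% figures from the hypothetical scenario in the text as absolute differences in percentage points; the cohort size of 100 and the function name are assumptions used only to express the result as whole infants.

```python
def tradeoff_per_cohort(rop_reduction_pct, harm_increase_pct, cohort=100):
    """Compare infants spared severe ROP with additional infants who die or
    survive with major disability, for absolute differences given in
    percentage points. The cohort size is an illustrative assumption.
    """
    rop_prevented = rop_reduction_pct * cohort / 100
    extra_harmed = harm_increase_pct * cohort / 100
    harmed_per_sight_saved = extra_harmed / rop_prevented
    return rop_prevented, extra_harmed, harmed_per_sight_saved


if __name__ == "__main__":
    prevented, harmed, ratio = tradeoff_per_cohort(3, 6)
    print(f"Per 100 infants: {prevented:.0f} spared severe ROP, "
          f"{harmed:.0f} more deaths or major disabilities, "
          f"ratio {ratio:.0f} to 1")
```

The ratio depends only on the two absolute differences, so it is the same for any cohort size.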
Pragmatic Design and Data Collection
Successful conduct of a much larger-scale trial requires that the design of the trial be as simple and pragmatic as possible to optimize recruitment and maximize the quality of data. Collection of information only on variables related to the major outcomes of the trial should enable centers to participate enthusiastically without undue burden. Information on ROP, duration of oxygen therapy, survival, and neurodevelopmental, neurosensory, and growth status should be recorded prospectively for this trial. Several recent studies have demonstrated that large-scale recruitment38–42 and follow-up39,43 in prospective perinatal studies are feasible. The wisdom of collecting only the relevant, necessary data is reflected in the following comment by Peto and Baigent:32
Collecting less information may mean bigger numbers and hence better science: many trials still collect ten or a hundred times too much information per patient, often at the behest of study sponsors or their committees. Requirements for large amounts of defensive documentation imposed on trials by well intentioned guidelines … may, paradoxically, substantially reduce the reliability with which therapeutic questions are answered, if their indirect effect is to make randomized trials smaller or even to prevent them starting.
A trial acknowledging that we don’t understand how to provide optimum oxygenation requires extensive education and dialogue with all staff caring for eligible infants. Their insight and support will be crucial. Therefore, one critical element in preparing for this trial is to develop a comprehensive education package that explains the background and rationale of the study that can be used in many national settings.
The planning for such a trial is in progress. The proposed trial, Pulse Oximetry Saturation Trial for Prevention of ROP (POST ROP), will be adequately powered to reliably detect small to moderate, clinically important differences in severe ROP, chronic lung disease, mortality, adverse neurodevelopmental and neurosensory (visual/auditory) outcomes, and growth. The POST ROP Planning Study Group evolved from collective individual and group endeavors, meetings, and discussions of ophthalmologists and neonatologists over the past year. The POST ROP Planning Study Group welcomes contact from centers that may be interested in participating in a large trial of oxygen therapy. Without whole-hearted international collaboration, we face many more years of uncertainty about one of the most basic priorities of neonatal care: providing an appropriate concentration of oxygen for our patients.
We thank the following members of the Pulse Oximetry Saturation Trial for Prevention of Retinopathy of Prematurity Study Planning Group for review and critique of this commentary: Waldemar Carlo, John Flynn, William Good, Jeffrey Horbar, Alan Jobe, Earl Palmer, Betty Vohr, and David Wallace (United States); Lisa Askie, Anne Cust, Peter Davis, David Henderson-Smart, Jane Lloyd, Colin Morley, and John Simes (Australia); Edmund Hey and Win Tin (United Kingdom); Christian Poets (Germany); and Keith J. Barrington, Barbara Schmidt, and Jack Sinclair (Canada).
- Received January 13, 2003.
- Accepted May 28, 2003.
- Address correspondence to Cynthia H. Cole, MD, MPH, Division of Newborn Medicine, Floating Hospital for Children, Tufts-New England Medical Center, 750 Washington St, Boston, MA 02111. E-mail:
This editorial is dedicated to the memory of Dr. Douglas K. Richardson, whose life was a testament to the ideal of mutually supportive collaboration.
- Tin W. Oxygen therapy: 50 years of uncertainty. Pediatrics. 2002;110:615–616
- Lucey JF, Dangman B. A reexamination of the role of oxygen in retrolental fibroplasia. Pediatrics. 1984;73:82–96
- Chan-Ling T, Gock B, Stone J. Supplemental oxygen therapy: basis for noninvasive treatment of retinopathy of prematurity. Invest Ophthalmol. 1995;36:1215–1230
- The STOP-ROP Multicenter Study Group. Supplemental therapeutic oxygen for prethreshold retinopathy of prematurity (STOP-ROP), a randomized, controlled trial. I: Primary outcomes. Pediatrics. 2000;105:295–310
- Askie L, Henderson-Smart D, Irwig L, Simpson J. Oxygen-saturation targets and outcomes in extremely preterm infants. N Engl J Med. 2003;349:959–967
- Patz A, Hoeck LE, De La Cruz E. Studies on the effect of high oxygen administration in retrolental fibroplasia. I. Nursery observations. Am J Ophthalmol. 1952;36:1248–1253
- Kinsey VE, Arnold HJ, Kalina RE, et al. PaO2 levels and retrolental fibroplasia: a report of the cooperative study. Pediatrics. 1977;60:655–668
- Engle MA, Baker DH, Baras I, et al. Oxygen administration and retrolental fibroplasia. Am J Dis Child. 1955;89:399–413
- Engle MA, Levine SZ. Response of small premature infants to restriction of supplementary oxygen. Am J Dis Child. 1955;89:316–324
- Tin W, Milligan DW, Pennefather P, Hey E. Pulse oximetry, severe retinopathy, and outcome at one year in babies of less than 28 weeks gestation. Arch Dis Child Fetal Neonatal Ed. 2001;84:F106–F110
- Sun SC. Relation of target SpO2 levels and clinical outcome in ELBW infants on supplemental oxygen [abstract]. Pediatr Res. 2002;51:350A
- Anderson CG, Benitz WE, Madan A. Retinopathy of prematurity (ROP) and pulse oximetry: a national survey of recent practices [abstract]. Pediatr Res. 2002;51:367A
- Saugstad OD. Is oxygen more toxic than currently believed? Pediatrics. 2001;108:1203–1205
- Oxygen and retrolental fibroplasia: the questions persist. Pediatrics. 1977;60:753–754
- Duc G, Sinclair JC. Oxygen administration. In: Sinclair JC, Bracken MB, eds. Effective Care of the Newborn Infant. Oxford, United Kingdom: Oxford University Press; 1992:178–198
- Askie LM, Henderson-Smart DJ. Restricted versus liberal oxygen exposure for preventing morbidity and mortality in preterm or low birth weight infants. Cochrane Database Syst Rev. 2000;(2):CD001077
- McDonald A. Neurological and ophthalmic disorders in children of very low birth weight. Br Med J. 1962;1:895–900
- Cross KW. Cost of preventing retrolental fibroplasia? Lancet. 1973;2(7835):954–956
- Chow LC, Wright KW, Sola S. Can changes in clinical practice decrease the incidence of severe retinopathy of prematurity in very low birth weight infants? Pediatrics. 2003;111:339–345
- Peto R, Baigent C. Trials: the next 50 years. Large scale randomised evidence of moderate benefits. BMJ. 1998;317:1170–1171
- ISIS-2 (Second International Study of Infarct Survival) Collaborative Group. Randomised trial of intravenous streptokinase, oral aspirin, both, or neither among 17,187 cases of suspected acute myocardial infarction: ISIS-2. Lancet. 1988;2(8607):349–360
- Elwood P. Cochrane and the benefits of aspirin. In: Maynard A, Chalmers I, eds. Non-random reflections on health services research: on the 25th anniversary of Archie Cochrane’s effectiveness and efficiency. London, United Kingdom: BMJ Publishing Group; 1997:107–121
- Crowley P. Prophylactic corticosteroids for preterm birth. Cochrane Database Syst Rev. 2003;3
- Tucker J, Parry G, Nicholson P, McCabe C, Tarnow-Mordi W; UK Neonatal Staffing Study Group. Patient volume, staffing, and workload in relation to risk-adjusted outcomes in a random stratified sample of UK neonatal intensive care units: a prospective evaluation. Lancet. 2002;359:99–107
- Horbar JD, Badger GJ, Carpenter JH, et al. Trends in mortality and morbidity for very low birth weight infants, 1991–1999. Pediatrics. 2002;110:143–151
- Copyright © 2003 by the American Academy of Pediatrics