Abstract
The public health community faces increasing demands to improve vaccine safety while simultaneously expanding the number of vaccines available to prevent infectious diseases. The passage of the US Food and Drug Administration (FDA) Amendments Act of 2007 formalized the concept of life-cycle management of the risks and benefits of vaccines, from early clinical development through many years of use in large numbers of people. Harnessing scientific and technologic advances is necessary to improve vaccine-safety evaluation. The Office of Biostatistics and Epidemiology in the Center for Biologics Evaluation and Research is working to improve the FDA's ability to monitor vaccine safety by improving statistical, epidemiologic, and risk-assessment methods, gaining access to new sources of data, and exploring the use of genomics data. In this article we describe the current approaches, new resources, and future directions that the FDA is taking to improve the evaluation of vaccine safety.
Vaccination is considered one of the most successful public health achievements of the 20th century and has prevented thousands of deaths and illnesses. Nevertheless, vaccines, like all pharmaceutical products, have some risk of adverse effects. These risks are generally small because of the extensive testing that vaccines undergo before licensure and stringent controls on vaccine manufacture. The high expectations for safety are reinforced by the fact that vaccines designed to prevent diseases are given to healthy people, especially children. Scientific efforts to ensure the safety of vaccination, from clinical trials of new vaccines to epidemiologic studies of rare adverse events (AEs) and their genetic risk factors, are necessary to further improve the safety of vaccines and maintain the public's confidence in vaccination programs. The authors of another article in this supplemental issue of Pediatrics1 discuss the regulatory framework for vaccine licensure in the United States by the US Food and Drug Administration (FDA). In this article we provide an overview of the statistical, epidemiologic, and risk-assessment methods used by the FDA to evaluate vaccine safety throughout the life cycle. Future directions to improve vaccine-safety evaluation are also outlined.
STATISTICAL METHODS FOR SAFETY EVALUATION IN CLINICAL TRIALS: PRELICENSURE ASSESSMENT
Single-Arm Studies
Early-phase clinical trials may consist of several uncontrolled studies aimed at evaluating a vaccine in relatively healthy populations or, sometimes, in certain special populations. Should unexpected, serious AEs occur, assessing a possible relationship to the vaccine may be difficult, because there is no randomized control group to serve as a standard for comparison (see "Randomized Clinical Trials" below). In these circumstances, however, it may be possible to compare the observed AE rate obtained by combining the studies, if combining is feasible (see "Meta-analysis" below), to a known background rate derived from external sources. This evaluation could potentially detect a signal to be monitored further in subsequent, larger studies. An obvious drawback to this approach is the potential for the biases against which randomization guards. Also, how well a background rate could serve as a comparator would depend heavily on how accurate and reliable that rate is and on its applicability to the study populations.
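To make the comparison concrete, the following is a minimal sketch (in Python) of such an observed-versus-expected check using an exact Poisson calculation; the counts, follow-up time, and background rate are hypothetical, and a real evaluation would also address the comparability and reliability of the external rate.

```python
# Sketch: compare the pooled observed AE count from uncontrolled studies with
# an expected count derived from an external background rate (values hypothetical).
from scipy.stats import poisson

observed_events = 4          # serious AEs seen across the combined single-arm studies
person_years = 12000         # total follow-up accumulated in those studies
background_rate = 1.5e-4     # external background rate, events per person-year

expected_events = background_rate * person_years

# One-sided exact Poisson probability of seeing at least this many events
# if the vaccinated population experienced only the background rate.
p_value = poisson.sf(observed_events - 1, expected_events)

print(f"expected {expected_events:.2f}, observed {observed_events}, P = {p_value:.3f}")
```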
Randomized Clinical Trials
The gold standard in study design for assessing intervention effects is the randomized clinical trial. Studies in which subjects are randomly assigned to receive an intervention or to serve as controls are more sensitive to small-to-moderate intervention effects than are nonrandomized designs, the latter of which can typically detect only large effects. Randomization promotes balance between the groups being compared with regard to both known and unknown prognostic factors, so that the only real difference between the groups is the intervention administered. Thus, when such designs are accompanied by additional features, such as double-blinding, randomized designs are far less subject to "noise" or bias than other designs. Consequently, randomized clinical trials permit direct assessment of causality. It is understandable that randomized clinical trials are used before licensure to evaluate the safety of vaccines, especially because that is the time when randomization is still ethical and, thus, feasible. After a vaccine has been licensed, and especially if it is recommended for use by the Centers for Disease Control and Prevention (CDC) Advisory Committee on Immunization Practices, it may be considered unethical to randomize against a placebo control and possibly even against an active control. However, although prelicensure clinical trials can be randomized, their size is generally more limited than that of postmarketing studies. Only if such trials are very large can there be a reasonable chance of observing less common AEs (see "Large Simple Trials" below).
Commonly Used Approach to Evaluating Safety
Phase III vaccine randomized clinical trials typically have been designed to estimate vaccine efficacy on the basis of clinical disease case definitions. Then, safety is assessed by using the same data set. An exception to this process is when efficacy is indirectly inferred from immune response data gathered from a few hundred participants. In this case, a few hundred subjects might be considered inadequate for evaluating safety, so a larger safety trial might be needed. In many instances, the smaller immunogenicity study could be nested within the much larger safety trial. Sample size is often calculated on the basis of the desired efficacy estimation, rather than safety, although the size of the total safety database needed for licensure is considered (the total safety database is the total number of subjects who received the vaccine, in both controlled and uncontrolled studies combined, before licensure). More consideration is now given to how much safety data are needed from randomized trials before licensing a vaccine.
The typical analysis compares AE rates between the vaccine and control groups. Often, a P value of <.05 is viewed as a statistical safety signal, which does not mean that there is necessarily a safety issue but rather that the AE needs further investigation by subject-matter experts. Note also that the figure .05, although commonly used, is quite arbitrary, and other values such as .025 or .10, for example, might be used as well if the sponsor and the FDA reach agreement. The value chosen is determined by the amount of certainty required to infer a causal relationship. Even relatively large trials will usually not have sufficient power to detect rare AEs.
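As a simple illustration of this kind of comparison, the sketch below tests a single hypothetical AE with Fisher's exact test and applies the conventional .05 threshold. It is not a prescription for any particular trial; in practice many AEs are examined, the threshold may be negotiated, and clinical judgment governs the interpretation.

```python
# Sketch: conventional two-group comparison of a specific AE between vaccine
# and control arms, flagging a "statistical signal" at the (arbitrary) .05
# threshold. All counts are hypothetical, for illustration only.
from scipy.stats import fisher_exact

vaccine_events, vaccine_n = 18, 4000
control_events, control_n = 7, 4000

table = [[vaccine_events, vaccine_n - vaccine_events],
         [control_events, control_n - control_events]]

odds_ratio, p_value = fisher_exact(table, alternative="two-sided")

if p_value < 0.05:
    print(f"statistical signal (P = {p_value:.3f}); refer to subject-matter experts")
else:
    print(f"no statistical signal (P = {p_value:.3f})")
```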
Large Simple Trials
The concept of large simple trials (LSTs) has existed at least since the 1980s,2,3 but these trials were used mainly to evaluate efficacy. With the more recent focus on safety evaluation throughout the product life cycle, LSTs are being considered as one of many approaches to assess safety. Before licensure, an LST might afford the last opportunity to evaluate safety in a randomized setting and with all the benefits that randomized designs provide. Applied to safety ascertainment, LSTs would collect data only on serious AEs, because efficacy, immunogenicity, and common AEs would have already been studied in earlier trials. Keeping data collection to a minimum would keep the trial simple.
There are several advantages to using LSTs to investigate the safety of a vaccine. First, as already mentioned, such studies would permit randomization and all the benefits thereof. Second, the simplicity of the data collection would promote rapid assessment of safety. Finally, the minimal data collection would greatly reduce the expense of conducting a large trial. This last advantage would be expected to increase manufacturers' willingness to undertake such large studies.
In submissions to the Office of Vaccines Research and Review (in the FDA's Center for Biologics Evaluation and Research [CBER]), 2 types of LSTs have been used to evaluate safety: the fixed-sample-size design and the group-sequential design. In the former, the intended final sample size is specified in advance of trial initiation. In the latter, the maximum sample size is prespecified with the understanding that the trial may stop earlier if a stopping boundary is crossed or if there is a high probability of its being crossed. Prespecified stopping boundaries for safety must be in place for sequential monitoring. A group-sequential design was used to conduct prelicensure surveillance for intussusception in Merck's RotaTeq rotavirus vaccine trial4; this vaccine was licensed in 2006.
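The sketch below is a deliberately simplified illustration of group-sequential safety monitoring in a 1:1 randomized LST: at each planned look, the split of cases between arms is tested against a 50:50 split under the null, with a conservative Bonferroni division of the overall alpha serving as the stopping boundary. Actual designs, such as the one used in the RotaTeq trial, use formally derived boundaries (eg, alpha-spending functions); all counts here are hypothetical.

```python
# Simplified sketch of group-sequential safety monitoring in a 1:1 randomized LST.
# At each planned look, cases observed in the vaccine arm are compared with a
# 50:50 split of all cases expected under the null hypothesis of no added risk.
# A Bonferroni split of alpha across looks serves as a conservative boundary;
# real designs typically use alpha-spending boundaries (eg, O'Brien-Fleming).
from scipy.stats import binomtest

alpha = 0.05
planned_looks = 4
per_look_alpha = alpha / planned_looks   # conservative per-look boundary

# Hypothetical interim data: (cases in vaccine arm, total cases) at each look
interim_counts = [(3, 4), (6, 8), (10, 13), (14, 18)]

for look, (vaccine_cases, total_cases) in enumerate(interim_counts, start=1):
    p = binomtest(vaccine_cases, total_cases, p=0.5, alternative="greater").pvalue
    print(f"look {look}: {vaccine_cases}/{total_cases} cases, P = {p:.4f}")
    if p < per_look_alpha:
        print("stopping boundary crossed; halt enrollment and review")
        break
```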
Types of “Hypotheses” Tested
The International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use E9 document, "Guidance on Statistical Principles for Clinical Trials,"5 refers to 2 general approaches to the statistical analysis of safety data: (1) what is often called the "conventional" approach and (2) the "noninferiority" approach. The conventional method is the "commonly used approach" described above. In this approach, a vaccine is assumed to be safe (the null hypothesis), and the goal is to discover whether it is not safe (the alternative hypothesis) by detecting signals. If we define the relative risk (RR) as the incidence rate in the vaccinated group relative to that in the control group, then a signal is detected if the lower limit of the 2-sided confidence interval for the RR excludes 1 (ie, the lower limit exceeds 1), which is analogous to observing a P value of <.05 if the confidence level is 95%. Note that if the sample size is too small, this approach will tend not to detect signals because of insufficient power. On the other hand, if the sample size is very large, the consequent high power may flag many signals, including differences too small to be clinically important and false-positive signals arising from the many AEs examined. These 2 features should be considered when the conventional approach is contemplated for safety evaluation.
In the noninferiority approach,6 the null and alternative hypotheses are the reverse of the conventional hypotheses: the vaccine is assumed to be unsafe (the null), and the goal is to find evidence that it is safe (the alternative) by ruling out signals. With this approach, a signal is ruled out if the upper limit of the confidence interval for the RR excludes the prespecified value d, where d is the increase in RR (caused by the vaccine) that is deemed unacceptable. In this case, if the sample size is too small, a noninferiority design will tend not to rule out signals, because the confidence intervals for the RR will tend to be too wide to exclude d. Again, insufficient power is the problem. Conversely, as the sample size increases, this approach becomes more likely to rule out false signals, in contrast to the conventional approach. This latter feature is an asset when a noninferiority design is used in a large-trial setting.
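The following sketch contrasts the two decision rules on the same hypothetical data, using a standard Wald confidence interval on the log RR; the counts and the choice of d are illustrative assumptions only.

```python
# Sketch contrasting the two decision rules on the same (hypothetical) data:
# the conventional rule flags a signal if the 95% CI for the RR lies above 1;
# the noninferiority rule rules out a signal if the CI stays below a
# prespecified unacceptable RR, d. Uses a standard Wald CI on log(RR).
import math
from scipy.stats import norm

vaccine_events, vaccine_n = 12, 6000
control_events, control_n = 9, 6000
d = 2.0                      # prespecified unacceptable increase in RR (assumed)

rr = (vaccine_events / vaccine_n) / (control_events / control_n)
se_log_rr = math.sqrt(1 / vaccine_events - 1 / vaccine_n +
                      1 / control_events - 1 / control_n)
z = norm.ppf(0.975)
lower = math.exp(math.log(rr) - z * se_log_rr)
upper = math.exp(math.log(rr) + z * se_log_rr)

print(f"RR = {rr:.2f}, 95% CI ({lower:.2f}, {upper:.2f})")
print("conventional rule: signal" if lower > 1 else "conventional rule: no signal")
print("noninferiority rule: signal ruled out" if upper < d
      else "noninferiority rule: signal not ruled out")
```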
A disadvantage of this approach, however, is that sponsors and the FDA must agree prospectively on the value of d (the increase in RR deemed unacceptable). However, it should be noted that in safety evaluation, much more so than for efficacy, such statistical criteria should be viewed as guideposts to aid safety assessment and not as rigid criteria to override medical and subject-matter expert judgments.
Meta-analysis
Meta-analysis is another tool for evaluating safety that may be useful in vaccine-licensure decisions. Meta-analysis is a statistical method for combining the results of multiple independent studies designed to answer the same question, synthesizing all of the information into 1 summary statistic.7–9 Individual studies will likely be too small to achieve precise estimates of the RR of less common or rare AEs. In addition, results from multiple studies may sometimes conflict and demonstrate very different associations. In such cases, meta-analysis may be able to increase statistical power, improve the precision of estimates, and provide an overall summary measure from conflicting results.
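As an illustration of the basic mechanics, the sketch below pools log RRs from 3 hypothetical trials with fixed-effect (inverse-variance) weighting; a real meta-analysis would also assess heterogeneity, consider random-effects models, and address the compatibility issues discussed below.

```python
# Minimal fixed-effect (inverse-variance) meta-analysis of log relative risks
# from several hypothetical trials. A real analysis would also examine
# heterogeneity and consider a random-effects model.
import math
from scipy.stats import norm

# (events_vaccine, n_vaccine, events_control, n_control) per trial - hypothetical
trials = [(5, 2000, 3, 2000), (8, 3500, 6, 3500), (2, 1500, 4, 1500)]

weights, weighted_logs = [], []
for a, n1, b, n2 in trials:
    log_rr = math.log((a / n1) / (b / n2))
    var = 1 / a - 1 / n1 + 1 / b - 1 / n2     # variance of log(RR)
    weights.append(1 / var)
    weighted_logs.append(log_rr / var)

pooled_log_rr = sum(weighted_logs) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))
z = norm.ppf(0.975)

print(f"pooled RR = {math.exp(pooled_log_rr):.2f}, "
      f"95% CI ({math.exp(pooled_log_rr - z * pooled_se):.2f}, "
      f"{math.exp(pooled_log_rr + z * pooled_se):.2f})")
```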
Although meta-analysis can contribute to the overall safety evaluation of a vaccine, this class of analyses is not without limitations. The quality of a meta-analysis depends, to a large extent, on the quality of the individual studies and the integrity of the process of combining them. Note that it will rarely be appropriate to simply pool all of the studies' data and analyze them as if they came from a single study. If the separate studies are poorly designed, or compliance is poor, then a meta-analysis of such studies cannot be a panacea to remedy those problems. Also, if the pooling process does not account, in statistically appropriate ways, for the fact that the data come from distinct trials, the results may be questionable. Even if the separate studies are of high quality, there may be a lack of compatibility among them. For example, if the enrolled populations, doses, case definitions, and intensity of surveillance for AEs are not comparable, then a meta-analysis of these high-quality studies may not be advisable. If a series of trials occurred over a long period of time, there may be time-trend differences such as changes in knowledge of the disease, AE definitions, and medical practice. There is also a potential for publication bias if studies with positive findings are more likely to be published, because the meta-analyst can combine only studies that are known. Finally, there is a distinction between post hoc and preplanned meta-analyses. The former are more likely to be fraught with the problems just mentioned, whereas a preplanned meta-analysis, as part of an overall product-development plan, permits certain studies to be designed so that they are as compatible as reasonably possible.
FDA/CBER TOOLBOX FOR POSTMARKETING VACCINE AE SURVEILLANCE
The Food and Drug Administration Amendments Act of 2007 mandates that the FDA develop an active surveillance program to monitor the safety of drugs and biologicals available for use in the general population.10 Aligned with the goals of this initiative, the FDA/CBER has developed a set of tools for biological safety surveillance. Three key components of the toolbox are signal detection (hypothesis generation), signal strengthening and verification, and signal confirmation (hypothesis testing).
Signal-Detection Tools
Passive Reporting Systems
The main example of a passive reporting system is the Vaccine Adverse Event Reporting System (VAERS).11 The VAERS is a nationwide system for passive surveillance of AEs after vaccination that was established in 1990 in response to the National Childhood Vaccine Injury Act of 1986 and is jointly managed by the CDC and the FDA. Vaccine manufacturers, health professionals, and the public report to it clinical events temporally associated with vaccination. Benefits of this system include the capacity to detect previously unrecognized AEs, to monitor known reactions to vaccines (including among subpopulations not previously exposed), to identify possible risk factors for further investigation (including rare, unexpected events), and to perform surveillance of specific vaccine lots. Although the VAERS allows near–real-time signal detection through the use of statistical data-mining techniques,12 it has significant limitations, including a lack of consistency in reporting, underreporting, uneven quality of the reports, the absence of appropriate denominator data, the absence of an unvaccinated control group, and limited capacity for surveillance of AEs with long onset intervals. Because of these constraints, it is usually not possible to assess whether a vaccine caused a reported AE.13
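One family of data-mining techniques applied to spontaneous reports is disproportionality analysis. The sketch below computes a proportional reporting ratio (PRR) from a hypothetical 2 × 2 table of report counts; VAERS data mining has relied on empirical Bayes methods (discussed under "Future Directions"), so the PRR here is only a simple illustrative stand-in, and any flag is a hypothesis for clinical review, not evidence of causation.

```python
# Sketch of a simple disproportionality screen (proportional reporting ratio)
# on a spontaneous-report 2x2 table. All report counts are hypothetical.
a = 40      # reports of the event of interest for the vaccine of interest
b = 1960    # reports of all other events for that vaccine
c = 120     # reports of the event of interest for all other vaccines
d = 17880   # reports of all other events for all other vaccines

prr = (a / (a + b)) / (c / (c + d))

# Pearson chi-square for the 2x2 table (no continuity correction).
n = a + b + c + d
chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# A common screening heuristic flags PRR >= 2 with chi-square >= 4 and >= 3 reports.
flagged = prr >= 2 and chi2 >= 4 and a >= 3
print(f"PRR = {prr:.2f}, chi-square = {chi2:.1f}, flagged = {flagged}")
```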
Analytical Near–Real-Time Active Surveillance Tools
The FDA/CBER and the CDC use large health care databases, in which the denominator population is known, for signal detection. These systems generate numbers of "observed" AEs and compare them with "expected" numbers. The comparator cohort used to generate expected counts of AEs can be historical or concurrent and can include prevaccination or postvaccination time windows. The selection of either historical or concurrent comparison groups depends on several factors, including the existence of historical or seasonal trends in disease incidence, the frequency of data updates, and the size and composition of the populations available for the study. An example of this approach is the CDC's Vaccine Safety Datalink rapid-cycle analysis, in which methods for sequential monitoring for signal detection are used.14,15 The FDA/CBER is using Medicare data to develop a rapid-assessment system for vaccine safety (eg, for monitoring Guillain-Barré syndrome) while accounting for 2 important factors: the delay in availability of medical claims (common to all surveillance systems that rely on medical claims) and the multiple testing of data over time (α-spending limits).16 The CBER has also initiated successful collaborations with other federal partners, including the Department of Defense, the Veterans Administration, and the Indian Health Service, to develop near–real-time surveillance and other analytical study systems. These active surveillance tools are not intended to provide conclusive evidence of an epidemiologic association or to prove causality, but because of their timeliness and well-defined population denominators, they permit near–real-time detection of potential associations between a drug or vaccine and an AE.17
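The sketch below illustrates, in simplified form, the observed-versus-expected logic of such near–real-time monitoring: cumulative observed counts in a vaccinated cohort are compared with counts expected from a historical background rate, with a fixed slice of the overall alpha spent at each look. Production systems use formal sequential methods (eg, maximized sequential probability ratio tests); all rates and counts here are hypothetical.

```python
# Simplified sketch of near-real-time observed-vs-expected monitoring against
# a historical background rate, spending a fixed slice of alpha at each look.
from scipy.stats import poisson

background_rate = 2.0e-5            # historical events per person-week (assumed)
alpha, planned_looks = 0.05, 10
per_look_alpha = alpha / planned_looks

cumulative_observed = 0
cumulative_person_weeks = 0.0

# (new events, new person-weeks of vaccinated follow-up) arriving each week - hypothetical
weekly_updates = [(0, 40000), (1, 55000), (0, 60000), (2, 62000), (1, 65000)]

for week, (events, person_weeks) in enumerate(weekly_updates, start=1):
    cumulative_observed += events
    cumulative_person_weeks += person_weeks
    expected = background_rate * cumulative_person_weeks
    p = poisson.sf(cumulative_observed - 1, expected)   # P(X >= observed)
    print(f"week {week}: observed {cumulative_observed}, "
          f"expected {expected:.2f}, P = {p:.3f}")
    if p < per_look_alpha:
        print("potential signal; trigger signal-verification steps")
        break
```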
Signal Verification and Data-Quality Assessment
Passive and active surveillance systems generate numerous hypotheses of potential AEs. Even before hypothesis-confirmation studies, signals of potential associations need further evaluation to determine whether they are real or spurious. Continuing collection and analysis of data from existing surveillance systems can provide useful information at a low marginal cost.17 Ideally, postsignal evaluation activities should be described thoroughly in a protocol before the study commences. Possible evaluation steps to verify the quality of a signal are described in another article in this supplemental issue of Pediatrics18 and elsewhere.
Signal Confirmation (Hypothesis Testing)
Once the existence of a signal has been verified, more definitive epidemiologic studies are implemented. Because conclusions of the FDA/CBER regarding the safety of medical products have high potential impact (eg, they affect physician and public confidence in a vaccine program and contribute to decisions about whether a product remains on the market), rigorous methods for hypothesis testing are used to ensure high data quality. Generally, these methods include medical record review to confirm the diagnosis, onset date, confounding factors, and exposure to the vaccine. These studies use logistic regression and other analytical methods to estimate RR and attributable risk.
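As a schematic example of the analytic step, the sketch below fits a logistic regression to simulated data to estimate a confounder-adjusted odds ratio for vaccination (which approximates the RR when the AE is rare); the data-generating values are assumptions made solely to keep the example runnable.

```python
# Sketch of a confirmatory analysis: logistic regression estimating the
# association between vaccination and a chart-confirmed AE while adjusting
# for a confounder. Data are simulated only to make the example runnable.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 20000
age_group = rng.integers(0, 2, n)                  # 0 = younger, 1 = older
vaccinated = rng.integers(0, 2, n)
log_odds = -6.0 + 0.4 * vaccinated + 0.8 * age_group   # assumed true model
outcome = rng.binomial(1, 1 / (1 + np.exp(-log_odds)))

X = sm.add_constant(np.column_stack([vaccinated, age_group]))
fit = sm.Logit(outcome, X).fit(disp=False)

odds_ratio = np.exp(fit.params[1])
ci_low, ci_high = np.exp(fit.conf_int()[1])
print(f"adjusted OR for vaccination = {odds_ratio:.2f} "
      f"(95% CI {ci_low:.2f}-{ci_high:.2f})")
```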
Although the most common AEs have usually been identified during the clinical trials, serious AEs are generally uncommon and require large study populations for rigorous hypothesis testing. For this purpose, the FDA/CBER and the CDC use networks of databases from managed care organizations, including federal partners (the Department of Defense, the Veterans Administration, and the Indian Health Service), private partners such as the Vaccine Safety Datalink, and, via research contracts, other outside institutions. The FDA/CBER also participates in multinational investigation efforts such as the World Health Organization's pilot international collaborative analytical study of the risk of Guillain-Barré syndrome after H1N1 influenza vaccination. Such collaborative studies are expected to become important contributors to the armamentarium for global vaccine-safety monitoring.
In addition to these efforts to improve the ability of the FDA/CBER to directly monitor vaccine safety after licensure, sponsors are requested to submit a pharmacovigilance plan at the time of license application in accordance with the 2005 International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use E2E guidance.19 Typically, sponsors will commit to conducting large observational safety-monitoring studies, and in some cases the Food and Drug Administration Amendments Act gives the FDA authority to require studies to evaluate serious safety concerns.
SAFETY ASSESSMENTS FOR VACCINE COMPONENTS
The public health determination of the balance between a vaccine's acceptable risks and its benefits is a consensus decision among many scientific disciplines. These decisions are based on data relating to the safety and efficacy of the vaccine and on medical judgment. The advent of quantitative risk-assessment methodologies offers new tools that may give public health decision-makers additional help in assessing the risks and benefits associated with vaccine components.
A well-developed risk-assessment framework was provided by the National Research Council of the US National Academy of Sciences in 1983.20 In this framework, a risk assessment is divided into 4 parts: hazard identification; exposure assessment; dose-response assessment; and risk characterization.
For a specific vaccine component, hazard identification includes the identification of any adverse effects potentially associated with the component. Next, the exposure assessment involves determining the amount and frequency of exposure to the vaccine component. For example, if the component is in multiple pediatric vaccines, then an exposure assessment would take into account both the number of vaccinations and the amount of the component in each vaccination.
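A minimal sketch of this exposure calculation follows; the vaccines, dose counts, and per-dose amounts are invented for illustration and do not describe any actual product.

```python
# Sketch of the exposure-assessment step: total exposure to a hypothetical
# vaccine component summed across the doses in which it appears.
schedule = [
    # (vaccine, doses in series, micrograms of component per dose) - all hypothetical
    ("vaccine_A", 3, 25.0),
    ("vaccine_B", 4, 12.5),
    ("vaccine_C", 2, 0.0),   # formulation without the component
]

total_micrograms = sum(doses * amount for _, doses, amount in schedule)
doses_with_component = sum(doses for _, doses, amount in schedule if amount > 0)

print(f"cumulative exposure: {total_micrograms:.1f} micrograms "
      f"across {doses_with_component} doses containing the component")
```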
Dose-response assessment characterizes the relationship between the amount of the vaccine component and AEs and often involves extrapolation from animal toxicity studies to humans. This part of the risk-assessment framework is the least certain for vaccine components, especially for pediatric vaccines. First, there have been relatively few animal studies in which the toxicity of specific vaccine components has been determined in infant animals. In addition, the primary concern is usually potential adverse neurologic effects, but toxicity in animal studies usually occurs in organs other than the nervous system, which limits the relevance of the data for extrapolation to humans. Furthermore, the dose of the vaccine component a person receives is usually several orders of magnitude lower than the dose at which toxicity is observed. Last, whether effects observed in animal models at any dose apply to humans, and how to quantify that relationship and its uncertainty, are other important issues.
The final element of the framework is risk characterization, in which the data from the exposure assessment and the dose-response assessment are integrated in a manner that helps risk managers understand the risk associated with a vaccine component. When the data are uncertain, this uncertainty is carried through to the risk characterization. Probabilistic methods are used in exposure and dose-response models to provide risk characterizations that reflect the state of knowledge about a risk as closely as possible.
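The sketch below shows one way such probabilistic methods can be implemented: uncertainty in exposure and in a low-dose dose-response slope is propagated by Monte Carlo simulation to a distribution of predicted risk. The distributions and parameter values are assumptions chosen only to illustrate the mechanics, not drawn from any real assessment.

```python
# Sketch of a probabilistic risk characterization: uncertainty in exposure and
# in the dose-response slope is propagated by Monte Carlo simulation to a
# distribution of predicted risk. All distributions and parameters are assumed.
import numpy as np

rng = np.random.default_rng(1)
n_sims = 100_000

# Uncertain inputs (hypothetical): per-schedule exposure and a linear low-dose slope
exposure_ug = rng.lognormal(mean=np.log(50.0), sigma=0.2, size=n_sims)
slope_per_ug = rng.lognormal(mean=np.log(1e-7), sigma=0.5, size=n_sims)

risk = exposure_ug * slope_per_ug          # predicted risk per vaccinee

print(f"median risk {np.median(risk):.2e}, "
      f"95th percentile {np.percentile(risk, 95):.2e}")
```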
FUTURE DIRECTIONS
Developing an integrated approach to evaluating and monitoring the safety of vaccines throughout their life cycle is the focus of current initiatives in the FDA/CBER. Although each phase of the life cycle has unique challenges, improvements can be made by focusing on 3 broad areas: improving access to and standardization of data across the life cycle; evaluating new statistical and quantitative risk/benefit assessment methods; and increasing the use of genomics information and computational biology to improve vaccine safety.
The FDA/CBER is contributing to the development of data standards and is building an infrastructure for a repository of clinical trial data. These efforts will potentially allow analysts to look across many studies to identify similar events that might not trigger concern as single events in small studies. The Sentinel Initiative10 envisions an expanded network of data systems for postmarketing safety monitoring of medical products including vaccines. This development could potentially involve linkage to vaccine registries and other sources of electronic health information not traditionally used in vaccine-safety systems. Improved access to such data could even be of value for assessing early-phase risks by providing improved background rates of rare AEs in comparable populations. Standardizing both AEs of interest and their definitions is also important.21
Developing and evaluating new statistical methods for evaluating safety throughout the life cycle is also underway. One novel approach being considered involves using statistical techniques to formalize a pattern-recognition approach often now used informally by medical experts to identify AEs.22 One advantage of this approach is that it could apply across all types of data, including clinical trials, AE reports in passive surveillance systems such as the Vaccine Adverse Event Reporting System, and electronic claims and medical record systems for large populations, including data obtained from text-mining and natural-language processing of electronic medical records.
Bayesian methods and adaptive designs are also under consideration. Bayesian methods provide a coherent statistical framework for learning from evidence as it accumulates. Traditional (frequentist) statistical methods formally use previous information only in the design of a clinical trial; in the data-analysis stage, previous information is considered only informally, as a complement to, but not part of, the analysis. In contrast, the Bayesian approach uses a consistent, mathematically formal method, Bayes' theorem, for combining previous information with current information on a quantity of interest, which can be done throughout both the design and analysis stages of a trial.23 Bayesian analyses use available patient-outcome information, including biomarkers that accumulating data indicate might be related to clinical outcome. They also allow for the use of historical information and for synthesizing the results of relevant trials.24 Bayesian adaptive designs can be applied to safety information that accrues during a trial, effectively assessing the evidence as it accumulates. As with all multiple-comparison problems, handling AEs in clinical trials and surveillance data sets is a challenging statistical problem. Recently developed empirical Bayes methods can account for multiplicity in the evaluation of "signals" obtained by data-mining from spontaneous-reporting AE databases.25,26 These methods provide an informative way to summarize safety-surveillance data, including a quantitative measure of the strength of the reporting association between vaccines and events.
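As a small illustration of Bayesian updating for a safety quantity, the sketch below combines a Beta prior on an AE rate with accumulating trial data to obtain a posterior distribution and the tail probability of exceeding a rate of concern; the prior, the data, and the 0.5% threshold are hypothetical.

```python
# Sketch of Bayesian updating for an AE rate: a Beta prior (informed, for
# example, by earlier trials) is combined with accumulating trial data to give
# a posterior for the event probability. Prior and data values are hypothetical.
from scipy.stats import beta

prior_a, prior_b = 2, 998          # prior roughly centered on a 0.2% rate
events, exposed = 9, 3000          # accumulating data from the current trial

posterior = beta(prior_a + events, prior_b + exposed - events)

print(f"posterior mean rate = {posterior.mean():.4f}")
print(f"95% credible interval = ({posterior.ppf(0.025):.4f}, {posterior.ppf(0.975):.4f})")
print(f"P(rate > 0.5%) = {posterior.sf(0.005):.3f}")
```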
There is, at present, great interest in the possibility of applying adaptive clinical trials throughout the various stages of clinical development to improve the efficiency of clinical development.27 Because of their complexity, adaptive-design development programs require more (and earlier) planning and documentation. The primary statistical concern in an adequate and well-controlled study is to preserve control of the overall, study-wide type I error rate across all hypotheses tested, given the multiplicity introduced by the multiple adaptation options. An additional concern is avoiding inflation of the type II error rate for the important hypotheses of the study. Other potential issues are described in the International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use E9 guidance document.5
There are 2 major safety considerations in adaptive-design trials. The first concerns the safety of subjects in adaptive-design dose-escalation studies early in vaccine development. The second relates to the conduct of studies that involve a major expansion in the number of treatment-exposed subjects. Although it is important to monitor for serious adverse effects in any large clinical study, an adaptive-design study that is initiated when there is only limited previous patient-safety experience carries greater uncertainty regarding the potential vaccine-associated risks and, thus, may need more frequent and/or extensive assessment of patient safety parameters during the study. Increasing the safety-data monitoring may not resolve this concern fully, and it may be necessary to take other steps, such as enrolling limited numbers of patients until sufficient safety data have accumulated and been examined to support expanding the study to larger numbers of patients enrolled more rapidly.
Although some methods, discussed earlier in this article and elsewhere in this supplement, are currently in use for near–real-time surveillance in large population-based data systems, the best approaches to monitoring and mining these data sources to improve vaccine safety are not known, and additional research is needed.
Quantitative risk/benefit analysis is also on the horizon as a tool for improving the assessment of vaccine safety. Traditional approaches such as decision analysis28 and new approaches based on linked infectious-disease transmission and game-theory models exploring the interplay of disease risk and vaccine safety and effectiveness in vaccination decision-making29 hold promise.
The FDA/CBER has launched the Genomics Evaluation Team for Safety to enhance the use of genomic and other "omic" data, along with a systems biology approach, for improving the safety of biological products, including vaccines, through education, policy development, and research. An example of the type of research the FDA/CBER hopes to encourage is the recently published study that found no association of HLA type with the development of arthritis after Lyme vaccine.30 The FDA/CBER is currently conducting a collaborative study to evaluate whether specific genetic factors might place children aged 12 to 23 months at increased risk for idiopathic thrombocytopenic purpura after measles-mumps-rubella vaccine.31
Enhanced computational science resources will also allow exploration of multiple aspects of the safety of biologicals through simulation of the entire vaccine life cycle. These expanding resources of computational power and powerful software systems will usher in a new era of safety review, allowing the FDA/CBER to integrate the vast amount of information relating to biologicals and to evaluate them more effectively throughout their life cycle of use.
CONCLUSIONS
Building on a strong foundation and the synergy of multidisciplinary interaction, the FDA/CBER is taking advantage of new scientific and technologic advances to improve vaccine safety throughout the product life cycle. Ultimately, such measures are expected to improve the safety of vaccination and further promote public health.
ACKNOWLEDGMENTS
Financial support of this work was provided by the FDA.
We thank Estelle Russek-Cohen, Rick Wilson, and Tsai-Lien Lin for helpful comments and Barbara Krasnicka for contributing references.
Footnotes
- Accepted November 29, 2010.
- Address correspondence to Robert Ball, MD, MPH, ScM, Office of Biostatistics and Epidemiology, Center for Biologics Evaluation and Research, FDA, HFM-210, 1401 Rockville Pike, Rockville, MD 20852. E-mail: robert.ball@fda.hhs.gov
FINANCIAL DISCLOSURE: The authors have indicated they have no financial relationships relevant to this article to disclose.
Abbreviations: AE = adverse event; FDA = Food and Drug Administration; CDC = Centers for Disease Control and Prevention; LST = large simple trial; CBER = Center for Biologics Evaluation and Research; RR = relative risk
REFERENCES
- Copyright © 2011 by the American Academy of Pediatrics