a Departments of Obstetrics and Gynaecology, Cambridge University, Cambridge, United Kingdom
b Medical Research Council Biostatistics Unit, Institute of Public Health, Cambridge, United Kingdom
| ABSTRACT |
|---|
METHODS. A population-based retrospective cohort study was conducted of data from the linked Scottish Morbidity Record, Stillbirth and Infant Death Enquiry and General Registrar's Office database of births and deaths, encompassing births in Scotland between 1992 and 2001. All women who had a singleton live birth between 24 and 43 weeks' gestation and for whom data were available (n = 505011), divided into model development and validation samples, were studied. The main outcome measure was death of the infant in the first year of life as a result of SIDS.
RESULTS. The risk for SIDS was modeled in the development sample using logistic regression with the following predictors: maternal age, parity, marital status, smoking, and the birth weight and the gender of the infant. When the model was evaluated in the validation sample, the area under the receiver operating characteristic curve was 0.84 and the incidence of SIDS was 0.7 per 10000 (95% confidence interval: 0.3–1.4) among 126253 women in the lower 50% of predicted risk and 29.7 per 10000 (95% confidence interval: 23.4–37.2) among the 25250 women in the top 10% of predicted risk. A logistic-regression model then was developed for the whole population, and the output was converted into adjusted likelihood ratios. These are tabulated and provide a simple method for assessing the risk for SIDS associated with any combination of obstetric characteristics.
CONCLUSIONS. A model that uses maternal characteristics and outcome at birth is predictive of the risk for SIDS. This model is presented in a simple form that allows calculation of the individual risk for SIDS.
Key Words: pregnancy outcome, sudden infant death, risk
Abbreviations: SIDS, sudden infant death syndrome; SMR2, Scottish Morbidity Record; GRO, General Registrar's Office; OR, odds ratio; ROC, receiver operating characteristic; CI, confidence interval
The factors that determine the risk for sudden infant death syndrome (SIDS) have been the focus of studies for many years.1 Identification of modifiable environmental exposures led to the "Back to Sleep" campaign and a dramatic fall in the incidence of SIDS. Despite this, SIDS remains the most common cause of death in infancy.2,3 After an apparent SIDS death, there should be an analysis of all of the factors that may have contributed to the event. The procedures for this have been reviewed recently4,5 and include detailed investigation of the scene of death and a thorough autopsy. Previous risks for SIDS are also taken into account in this process, including an assessment of whether there were any obstetric risk factors for SIDS. Many studies have addressed both prenatal and postnatal predictors of the risk for SIDS.1 However, these analyses are presented in a manner that does not allow easy and accurate assessment of the absolute risk associated with a given combination of characteristics. Our aim was to (1) develop a valid model that relates the risk for SIDS accurately to obstetric characteristics and (2) present it in a format that is simple to understand and use.
| METHODS |
|---|
Data Sources and Patient Selection
The SMR2 collects information on clinical and demographic characteristics and outcomes for all women who are discharged from Scottish maternity hospitals. The register is subjected to regular quality assurance checks and has been >99% complete since the late 1970s.6 The Scottish Stillbirth and Infant Death Enquiry is a national register that routinely classifies all perinatal deaths in Scotland.3 The GRO maintains computerized birth and death registration records. We used a probability-based matching approach7 based on maternal identifiers to link the SMR2, the Scottish Stillbirth and Infant Death Enquiry, and the GRO database of birth certificates. The birth certificate contained offspring identifiers, which were then used to link the pregnancy and perinatal death data to the death certificate register to identify deaths in infancy.
Definitions
SIDS was defined as death of an infant in whom the principal cause on the GRO death certificate was coded as 798.0 using the International Classification of Diseases, Ninth Revision or R95 using International Classification of Diseases, 10th Revision. During the period studied, a diagnosis of SIDS could be written on a death certificate in Scotland only after thorough investigation of the circumstances of the death. The minimum requirements are described by the Crown Office,8 and an autopsy was mandatory. In practice, the investigation of these deaths was frequently much more involved.9 A previous detailed study of deaths attributed to SIDS on Scottish death certificates between 1992 and 1995 found that standard diagnostic criteria were fulfilled in all cases.10 Maternal age was defined as the age of the mother at the time of delivery. Smoking and marital status were defined as the status of the woman at the time of first attendance for prenatal care. Parity was defined as the total number of previous births, excluding abortions. Gestational age at birth was defined as completed weeks of gestation at the time of delivery. Gestational age has been confirmed by ultrasound scan in the first half of pregnancy in >95% of pregnancies in the United Kingdom from the early 1990s.11
Statistical Analysis
Univariate comparisons were performed using the Mann-Whitney U test, the χ² test, and the χ² test for trend, as appropriate. The P values for all hypothesis tests were 2-sided. Crude and adjusted odds ratios (ORs) were obtained by using logistic-regression analyses.12 Parity, maternal age, and the infant's birth weight all were treated continuously in logistic-regression models. Treating these variables in this manner avoids loss of information as a result of categorization. We excluded cases with extremes of birth weight (<500 or >5000 g) to avoid overly influential effects of outliers. This improves the reliability of modeling of birth weight for the vast majority of the population. Because very small numbers of cases had values outside this range, estimates of probability for these extreme cases would be potentially unstable. Nonlinearity in the log odds scale was tested and modeled using fractional polynomials. Regression techniques used robust standard errors to allow for dependence of different births to the same mother using a maternal identifier. The statistical significance of interaction terms was assessed using the Wald test, and significance was assumed at P < .01. Observations with missing values were excluded. The population was randomly assigned to a model development and a model validation sample. Goodness of fit was assessed using the Hosmer-Lemeshow test based on deciles of probability. The discrimination of the model was assessed by the area under the receiver operating characteristic (ROC) curve. The final logistic-regression model fitted to the entire cohort was converted into adjusted likelihood ratios using a modification of our recently described method13 (see Appendix for details). All statistical analyses were performed using the Stata 8.2 software package (Stata Corp, College Station, TX).
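As an illustration of the discrimination measure, the area under the ROC curve can be computed from predicted risks and observed outcomes with the rank-sum (Mann-Whitney) identity. The sketch below is not the Stata routine used in the study; the function name `roc_auc` and its arguments are hypothetical, and plain Python is assumed.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    scores: predicted risks; labels: 1 for a case, 0 for a non-case.
    Tied scores receive the average of the ranks they span.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    n_cases = sum(labels)
    n_noncases = len(labels) - n_cases
    if n_cases == 0 or n_noncases == 0:
        raise ValueError("need at least one case and one non-case")

    rank_sum_cases = sum(r for r, y in zip(ranks, labels) if y == 1)
    # Probability that a random case is ranked above a random non-case.
    return (rank_sum_cases - n_cases * (n_cases + 1) / 2) / (n_cases * n_noncases)
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect separation of cases from non-cases.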
| RESULTS |
|---|
| DISCUSSION |
|---|
We developed a logistic-regression model and related the risk for SIDS to marital and smoking status, maternal age and parity, and the birth weight and the gender of the infant. Birth weight is a reflection of both fetal growth and gestational age at birth. We previously demonstrated log linear relationships between the risk for SIDS and both week of gestation of birth and birth weight percentile.14 In the present study, we used birth weight. Performance of the model was virtually identical to a model using week of gestation at birth and birth weight percentile (data not shown). Birth weight has the advantage of being less dependent on definition than gestational age and is more likely to be known than the exact birth weight percentile. We assessed the calibration and discrimination of the model in the validation sample. This demonstrated that the model fit the out-of-sample data well and had good discriminative power. A previous study had performed out-of-sample validation of 4 risk scoring systems for SIDS and found sensitivities of 41%, 53%, 62%, and 71% when the top 20% of predicted risk were classified as high risk; the best performing model included 17 predictors.15 In our own study using a model that had just 6 predictors, 72% of cases in the validation sample were in the top 20% of predicted risk.
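The "top 20% of predicted risk" comparison can be made concrete with a short sketch. It assumes a list of predicted risks and a parallel list of outcomes (1 for a case, 0 otherwise); the function name and the rule for breaking ties (original order) are illustrative choices, not taken from the study.

```python
def sensitivity_at_top_fraction(scores, labels, fraction=0.2):
    """Share of all cases whose predicted risk lies in the top `fraction` of the sample."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    n_cases = sum(labels)
    if n_cases == 0:
        raise ValueError("need at least one case")

    n_top = int(len(scores) * fraction)  # size of the high-risk group, rounded down
    # Sort from highest to lowest predicted risk; ties keep their original order.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    cases_in_top = sum(label for _, label in ranked[:n_top])
    return cases_in_top / n_cases
```

Under this definition, a returned value of 0.72 would mean that 72% of cases fell within the chosen high-risk group.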
A number of studies have shown that women with a previous SIDS event have an approximately fivefold risk for recurrence compared with the general population.16–19 In the United Kingdom, these women are offered a structured scheme for the care of the next infant, which involves symptom diaries, apnea monitors, scales, and weekly home visits by the family health visitor.20 Logically, the 2.4% of women with a summary likelihood ratio of ≥5 on the basis of the model might be offered a similar intervention, although this would require additional evaluation of efficacy and economic justification. However, application of this model assumes that the relationships between the variables studied and the risk for SIDS are similar in other populations.
The absolute risk for an outcome associated with a given combination of characteristics can be estimated from a logistic-regression equation using the constant and the coefficients. The constant reflects the baseline risk, and the sum of the coefficients reflects the modification of the baseline risk associated with the given combination of characteristics. However, typically, medical publications do not report the constant; therefore, this calculation cannot be performed. Moreover, even if provided with the constant and the coefficients, only a tiny minority of doctors would have the knowledge to perform this calculation. We sought to simplify estimation of the absolute risk from a logistic-regression model by expressing the output as adjusted likelihood ratios rather than as ORs. In fact, a likelihood ratio is merely a special type of OR. Taking the example of expressing the risk for a given outcome among male individuals, the OR associated with being male is the odds of the disease in male individuals divided by the odds of the disease among female individuals. The likelihood ratio associated with being male is the odds of the disease in male individuals divided by the odds of the disease in the whole population. Therefore, the OR expresses the risk relative to another category of the given characteristic (eg, male relative to female), whereas the likelihood ratio expresses the risk relative to the whole population. Using the example of gender, 2 likelihood ratios are generated: 1 expresses the risk for male individuals, and 1 expresses the risk for female individuals.
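The distinction between the two measures can be illustrated with a crude (unadjusted) calculation from a 2-by-2 table. This is only a sketch of the definitions given above: the likelihood ratios reported in the article are adjusted for the other predictors, and the function and argument names here are hypothetical.

```python
def odds(events, non_events):
    """Odds of the outcome: events divided by non-events."""
    return events / non_events


def odds_ratio_and_likelihood_ratios(cases_a, noncases_a, cases_b, noncases_b):
    """Crude OR and likelihood ratios for a two-category characteristic (A vs B)."""
    odds_a = odds(cases_a, noncases_a)
    odds_b = odds(cases_b, noncases_b)
    # Odds in the whole population, pooling both categories.
    odds_all = odds(cases_a + cases_b, noncases_a + noncases_b)
    return {
        "or_a_vs_b": odds_a / odds_b,  # risk in A relative to category B
        "lr_a": odds_a / odds_all,     # risk in A relative to the whole population
        "lr_b": odds_b / odds_all,     # risk in B relative to the whole population
    }
```

With made-up counts of 30 cases and 10000 non-cases in category A and 10 cases and 10000 non-cases in category B, the OR for A versus B is 3, whereas the likelihood ratios are 1.5 for A and 0.5 for B, because each is measured against the pooled odds of 40 to 20000.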
Estimating the absolute risk of a given event associated with any combination of characteristics is relatively simple using adjusted likelihood ratios (see Fig 2). The prior risk of disease is the odds in the population. The risk associated with any combination of variables is calculated by multiplying the prior risk by the appropriate likelihood ratios (Table 3). Therefore, estimating the absolute risk requires relatively little statistical expertise. Because the output of the model is in the form of an individual estimated probability, our approach avoids the loss of information involved in dichotomizing infants as "high" or "low" risk on the basis of an arbitrary threshold on an abstract numerical scale. Informing parents that their infant is at high risk of SIDS may cause unjustified anxiety, because the risk may be small in absolute terms. The likelihood ratio-based approach has the key advantage that the output of the model is expressed in terms of the absolute risk associated with the given individual's combination of characteristics.
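The calculation described above can be sketched in a few lines. The function name is hypothetical, and any likelihood ratios passed to it in this illustration are made-up values rather than entries from Table 3.

```python
from math import prod


def absolute_risk(prior_probability, likelihood_ratios):
    """Convert a population risk and a set of likelihood ratios into an individual risk."""
    # Prior odds: the odds of the outcome in the whole population.
    prior_odds = prior_probability / (1 - prior_probability)
    # Multiply the prior odds by every applicable likelihood ratio.
    posterior_odds = prior_odds * prod(likelihood_ratios)
    # Convert the resulting odds back to a probability.
    return posterior_odds / (1 + posterior_odds)
```

For example, with an assumed population risk of 5 per 10000 and illustrative likelihood ratios of 2.0, 1.5, and 0.8, `absolute_risk(0.0005, [2.0, 1.5, 0.8])` gives an individual risk of about 12 per 10000. An empty list of likelihood ratios returns the population risk unchanged.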
Expressing logistic-regression models in the form of adjusted likelihood ratios has several other advantages. First, if a predictor variable is unknown, then it may simply be ignored: omitting a variable in a likelihood ratio-based model makes the plausible assumption that the individual has the background risk for the population in relation to the given characteristic. Second, the use of adjusted likelihood ratios removes the need to select a reference category. In contrast, in logistic-regression analysis, a category of risk has to be regarded as referent. By choosing an extreme category as referent, ORs for all of the other categories will tend to be farther from unity. Therefore, the OR for a given characteristic may reflect the deviation in risk from the rest of the population in the referent category as well as the category in question. In contrast, by expressing the output of logistic-regression models as likelihood ratios, the odds of disease associated with any given feature are expressed relative to the odds in the whole population. Finally, because the model uses the prior odds as the starting point, there is the potential for using the adjusted likelihood ratios in other populations in which the incidence of the disease is higher or lower and accounting for this by using the local incidence to estimate the prior odds. This should be done carefully, however, as it assumes that variation in the incidence between populations does not depend on variation in the prevalence of the risk factors included in the model.
Other multivariate methods can be used to generate adjusted likelihood ratios, such as distribution modeling, which is used widely in Down syndrome screening.21 However, these methods do not directly incorporate binary variables, such as gender. Moreover, logistic-regression modeling is much more widely used in the analysis of risk, and many model-building tools have been developed for this method. A previous attempt was made to express logistic regression in the form of likelihood ratios.22 However, the previous method of calculation does not agree with the multivariate logistic-regression output if the model contains categorical variables with >2 levels or if the transformation of a continuous variable changed between the univariate and the multivariate model.
| CONCLUSIONS |
|---|
| APPENDIX: ESTIMATING LIKELIHOOD RATIOS |
|---|
In the first stage, we fit the above model with the term $b_1x_1$ replaced with an unknown constant $d_1$ and with all other terms (including the constant) fixed at their previous estimated values. The estimated value of $d_1$ captures the risk before $x_1$ is known, so the likelihood ratio is $\exp(b_1x_1 - d_1)$. This is repeated for each term $b_2x_2, \ldots, b_nx_n$, and the constant is replaced by $a' = a + d_1 + d_2 + \cdots + d_n$.
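The first stage can be checked algebraically. The model equation itself is not reproduced in this text, so the form below is an assumption: the usual logistic-regression equation with constant $a$ and terms $b_jx_j$, matching the symbols used here. Under that assumption, moving each $d_j$ into the constant leaves the fitted log odds unchanged and expresses them as the new constant plus the log likelihood ratios:

```latex
\begin{align*}
\log\frac{p}{1-p}
  &= a + \sum_{j=1}^{n} b_j x_j \\
  &= \Bigl(a + \sum_{j=1}^{n} d_j\Bigr) + \sum_{j=1}^{n} \bigl(b_j x_j - d_j\bigr) \\
  &= a' + \sum_{j=1}^{n} \log \mathrm{LR}_j ,
  \qquad \mathrm{LR}_j = \exp(b_j x_j - d_j).
\end{align*}
```

Here $\mathrm{LR}_j$ is shorthand introduced for this illustration; it denotes the first-stage likelihood ratio for the $j$th term.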
Because of the nonlinearity of the log odds function, $a'$ may not exactly equal the overall log odds of the outcome, $a_0$, if the $x$ variables are correlated. In the second stage, we therefore compute a correction factor $c_j(a_0 - a')$, where $c_1 + \cdots + c_n = 1$, and we report likelihood ratios $\exp[b_jx_j - d_j + c_j(a_0 - a')]$. In this article, $c_j$ is calculated as $m_j/(m_1 + \cdots + m_n)$, where $m_j$ is the sample minimum or maximum (depending on whether $a_0 - a'$ is positive or negative) of $b_jx_j - d_j$: this procedure ensures that the range of adjusted likelihood ratios spans 1.
| ACKNOWLEDGMENTS |
|---|
| FOOTNOTES |
|---|
Address correspondence to Gordon C.S. Smith, MD, PhD, Department of Obstetrics and Gynaecology, Cambridge University, Rosie Maternity Hospital, Robinson Way, Cambridge CB2 2QQ, United Kingdom. E-mail: gcss2{at}cam.ac.uk
The authors have indicated they have no financial relationships relevant to this article to disclose.
| REFERENCES |
|---|