## ABSTRACT

*Objective.* The high prevalence of antibody to hepatitis A virus (HAV) in the US population suggests that the incidence of infection is much higher than reported, but the infection rate is difficult to measure directly because of anicteric infection and underreporting. We present a model that reconciles the reported incidence of hepatitis A with the observed prevalence of antibody to HAV and provides an estimate of the true incidence of HAV infection.

*Methods.* In the model, reported incidence of hepatitis A in the United States was adjusted to account first for anicteric infection and then for underreporting and declining incidence over time such that the prevalence predicted by the model approximated that observed in 2 nationwide surveys.

*Results.* The model showed incidence in the susceptible population declining by 4.5% per year. As incidence declined early in the 1900s, the average age at infection increased, leading to a paradoxical increase in the incidence of icteric infection followed by a slow decline. The model estimated approximately 270 000 (range: 190 000–360 000) infections annually from 1980 to 1999, 10.4 times the number of hepatitis A cases actually reported during this period. More than half of these infections occurred in children who were younger than 10 years, most of which would have been clinically unrecognizable as hepatitis.

*Conclusions.* These results suggest a large reservoir of infection in children and that interruption of transmission in children may substantially reduce incidence of hepatitis A overall.

Hepatitis A virus (HAV) infection is one of the most common viral infections worldwide.1 Acute infection in adults usually results in a clinical presentation, indistinguishable from the other viral hepatitides, characterized by the acute onset of malaise, anorexia, and abdominal discomfort followed after a few days by jaundice (icterus). Illness lasts from 2 weeks to several months but does not lead to chronic hepatitis. Fulminant, life-threatening hepatitis A is an uncommon complication of acute HAV infection. In young children, HAV infection is most often asymptomatic or characterized by nonspecific symptoms (eg, fever) that are indistinguishable from other viral infections.

In the United States, there were on average 26 000 cases of acute hepatitis A reported to public health agencies each year from 1980 to 1999.2,3 Highly effective vaccines against hepatitis A have been available in the United States since the mid-1990s.4 Currently, vaccination is recommended for people with certain risk factors and for children who live in communities where the reported annual incidence exceeds twice the average annual rate in the United States of approximately 10 per 100 000.4 There continues to be disagreement about whether routine hepatitis A immunization should be extended to children outside these areas.5,6 One benefit from routine childhood immunization may be the reduction in hepatitis A in adults.7

Dynamic modeling of disease transmission is useful for evaluating the potential impact of proposed immunization programs.8 However, application of these models requires knowledge of the incidence of HAV infection, which is unknown in the United States. Reported cases of hepatitis A are an unreliable indicator of the true incidence of infection because of anicteric infection and underreporting of clinical cases. Measuring incidence directly with serial serologic surveys would be impractical because of the required size of the cohort. As an alternative, we present a model to integrate acute disease surveillance data with data from 2 large cross-sectional surveys to estimate the current and past incidence of HAV infection.

## METHODS

### Catalytic Modeling

We estimated the incidence of HAV infection by means of “catalytic modeling,”9–12 a method of estimating the past incidence of infection from the current prevalence of antibody to the infectious agent. The underlying assumption of the model was that each person in the US population with antibody to HAV (anti-HAV) represented 1 HAV infection that had occurred in the past. Thus, because the third National Health and Nutrition Examination Survey (NHANES III, see below) estimated that 47 million US-born people were anti-HAV positive at the time of the survey, this suggests that 47 million infections had occurred in that population at some time in the past. We used modeling to infer when those infections had occurred and in what age groups.

### Modeling Approach

The overall approach was to make 3 adjustments to the reported incidence of hepatitis A such that the predicted prevalence of anti-HAV in the population approximated that observed in 2 nationwide prevalence surveys, the NHANES II and NHANES III (see below). We first divided reported incidence by the proportion of the population that was susceptible to HAV infection (ie, 1 − the prevalence of anti-HAV) to estimate reported incidence in susceptible individuals. Then, to account for anicteric infections, which we assumed not to have been recognized or reported, we divided this result by the age-specific probability that infection would result in jaundice. This probability was modeled separately (see “Probability of Jaundice Model” below). Finally, this result was multiplied by 2 factors, 1 to account for underreporting and a second to account for declining incidence over time. Both of these factors were estimated by regressing the anti-HAV prevalence predicted by the model against that observed in the 2 prevalence surveys. Two equations were used to model the decline in incidence over time, 1 that assumed an exponential decline over time and 1 that made no a priori assumptions about the dynamics of the decline. Because both equations gave similar results, only those from the simpler, exponential model are shown here.
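The chain of adjustments can be sketched for a single age group as follows. This is a minimal illustration rather than the code used in the study: the function name `adjusted_incidence` and all input values are hypothetical, and the two regression factors (underreporting and decline over time) are collapsed into a single `multiplier` argument for simplicity.

```python
def adjusted_incidence(reported_rate, prevalence, p_jaundice, multiplier):
    """Apply the three adjustments to a reported incidence rate.

    reported_rate : reported cases per 100 000 population per year
    prevalence    : proportion of the age group with anti-HAV (0 to 1)
    p_jaundice    : probability that an infection at this age is icteric
    multiplier    : combined factor for underreporting and decline over time
    """
    if not 0 <= prevalence < 1:
        raise ValueError("prevalence must be in [0, 1)")
    if not 0 < p_jaundice <= 1:
        raise ValueError("p_jaundice must be in (0, 1]")
    # Step 1: restrict the denominator to susceptible (anti-HAV-negative) people.
    rate_in_susceptibles = reported_rate / (1.0 - prevalence)
    # Step 2: inflate to include anicteric infections, assumed unreported.
    rate_including_anicteric = rate_in_susceptibles / p_jaundice
    # Step 3: scale by the factors estimated from the prevalence surveys.
    return rate_including_anicteric * multiplier


# Hypothetical inputs chosen only to make the arithmetic easy to follow.
example = adjusted_incidence(reported_rate=10.0, prevalence=0.2,
                             p_jaundice=0.5, multiplier=4.0)
print(round(example, 1))  # 100.0
```

The result is expressed per 100 000 susceptible people, which is why the first step divides by 1 − prevalence rather than multiplying by it.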

We tested the sensitivity of the model to the estimated uncertainty in the probability of jaundice, the estimated annual rate of decline in infection incidence, and the assumed rate of infection among children who were younger than 1 year. We also accounted for uncertainties attributable to sampling in the prevalence data by means of the balanced repeat replicate method.13 The mathematical details of the modeling and sensitivity analysis are given in the “Appendix.”

### Sources of Data

#### Probability of Jaundice Model

To model the age-specific probability of developing jaundice, we used data from acutely infected individuals in 7 independent studies—6 published14–19 and 1 unpublished (J.C. Victor, University of Michigan, unpublished data). Each of these studies of community-wide hepatitis A outbreaks included serologic surveys of symptomatic and asymptomatic children or adults. When raw data were available, 0.5 years was added to each age (2-year-olds, for example, were assumed to be, on average, 2.5 years old). When data were available only by age group, age was assumed to equal the midpoint of the age group (people in the 5- to 9-year-old age group, for example, were assumed to be, on average, 7.5 years old).

#### Force of Infection Model

The model estimated “force of infection,” which is approximately equivalent to incidence in the susceptible (ie, anti-HAV-negative) population,8 from various sources of raw data. The reported number of hepatitis A cases from 1980 to 1999 was obtained from the National Notifiable Diseases Surveillance System,2 which includes all cases of hepatitis A reported to state and local health departments in the United States. The case definition in use during this time period required a discrete onset of symptoms, jaundice or elevated transaminases, and a positive test for immunoglobulin M anti-HAV. Cases for which age or gender were missing (6% of all cases) were assumed to have the same distribution of age and gender as cases for which these data were not missing. Intercensal and postcensal residential population estimates were used in the calculation of rates (Population Estimates Program, Population Division, US Census Bureau).

The age-specific prevalence of anti-HAV, *P*_{v}(*a*), was estimated from 2 national prevalence surveys, NHANES II (1976–1980)20 and NHANES III (1988–1994).21 Both surveyed samples of the civilian, noninstitutionalized population to provide nationally representative prevalence estimates for a variety of medical conditions. In both surveys, serum samples from the participants were tested for anti-HAV using the same enzyme immunoassay (HAVAB; Abbott Laboratories, Abbott Park, IL). Two different prevalence estimates were used in the model. The age-specific prevalence of anti-HAV in the total population, *P*_{v,t}(*a*), was used to adjust reported incidence to estimate the reported incidence in the susceptible population. The age-specific prevalence of anti-HAV in the US-born population, *P*_{v,u}(*a*), was used in the model that estimated incidence of infection.

## RESULTS

### Probability of Jaundice Model

By fitting the model to all 7 studies, we estimated the age-specific probability of developing jaundice during acute HAV infection to be *P*_{j} = 0.852 · (1 − exp[−0.01244 · a^{1.903}]), where *a* is the age at the time of infection, in years. The model fit the data well in all age groups (Fig 1), and no trends were noted in the residual deviance (a measure of goodness of fit) with respect to age. The average probability of jaundice was 7.2% (95% confidence interval [CI]: 4.7%–10.9%) in 0- to 4-year-olds, 37.1% (95% CI: 30.7%–43.8%) in 5- to 9-year-olds, 70.7% (95% CI: 58.8%–79.4%) in 10- to 17-year-olds, and 85.2% (95% CI: 79.9%–89.2%) in adults aged 18 years and older.
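The fitted curve can be evaluated directly, as in the short sketch below. The coefficients are those reported above; the function name `prob_jaundice` and the example ages are illustrative choices. Values at a single age should not be expected to reproduce the age-group averages quoted in the text, which presumably also depend on how infections are distributed within each age group.

```python
import math

P_JMAX = 0.852   # fitted probability of jaundice in adults
R = 0.01244      # fitted rate parameter
S = 1.903        # fitted shape parameter


def prob_jaundice(age_years):
    """Probability that HAV infection at the given age is icteric."""
    if age_years < 0:
        raise ValueError("age must be non-negative")
    return P_JMAX * (1.0 - math.exp(-R * age_years ** S))


# Illustrative ages only; the curve rises from 0 toward P_JMAX with age.
for age in (2.5, 7.5, 14.0, 30.0):
    print(f"age {age:>4}: P_j = {prob_jaundice(age):.3f}")
```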

### Incidence of HAV Infection

The average reported age-specific incidence of hepatitis A from 1980 to 1999 was highest in children aged 5 to 9 years and adults aged 20 to 29 years (Fig 2). Incidence remained highest in these age groups after adjusting reported incidence for the prevalence of anti-HAV to estimate the incidence in the susceptible population (Fig 2). Adjustment for anicteric infections using the probability of jaundice model greatly increased the incidence in younger age groups (Fig 2).

The final model fit the observed prevalence data well (Fig 3) and showed force of infection declining by 4.5% per year. Early in the century, incidence of infection would have been particularly high in the youngest age groups, exceeding 10% per year in children who were younger than 5 years (data not shown). The declining force of infection would have led to increases in both the average age at infection, from 3.4 years in 1920 to 15.7 years in 2000, and the average age at icteric infection, from 11.5 to 28.3 years. The incidence of icteric hepatitis A would have increased in the early 1900s, peaked in the first half of the century, and then declined but at a slower rate than the decline in the force of infection (Fig 4).

From 1980 to 1999, the model estimated an average of 271 000 infections per year, or 10.4 times the number of hepatitis A cases reported during that time period (Table 1).

Because a majority of these occurred in children who were younger than 10 years, fewer than half (111 800) resulted in icteric infection. Thus, 1 in every 4.3 cases of icteric HAV infection was reported to health departments during this period (Table 1).
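As a consistency check on these figures, the two ratios can be converted back into an implied number of reported cases per year. The sketch below uses only numbers quoted in the text; the variable names are illustrative.

```python
infections_per_year = 271_000        # model estimate, 1980 to 1999
icteric_per_year = 111_800           # model estimate of icteric infections
infections_per_reported_case = 10.4  # ratio given in the text
icteric_per_reported_case = 4.3      # ratio given in the text

# Both ratios should imply roughly the same number of reported cases per year.
reported_from_total = infections_per_year / infections_per_reported_case
reported_from_icteric = icteric_per_year / icteric_per_reported_case
icteric_share = icteric_per_year / infections_per_year

print(round(reported_from_total))    # 26058
print(round(reported_from_icteric))  # 26000
print(round(icteric_share, 2))       # 0.41
```

Both routes give about 26 000 reported cases per year, in line with the surveillance average cited in the introduction, and roughly 41% of estimated infections were icteric.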

Incidence of infection declined with increasing age except for an increase from the 15- to 19-year age group to the 20- to 29-year age group (Table 1). In any given birth cohort, the decline in incidence with age would have been compounded by the 4.5% annual decline in force of infection, such that the majority of infections would have occurred before age 5. Thus, after childhood, the prevalence of anti-HAV in a birth cohort would increase relatively slowly, although at a given point in time most cases of clinical hepatitis A would occur in adults.

### Sensitivity Analysis

Varying the probability of jaundice from its lower to upper 95% confidence limits (CL) caused the estimated annual incidence of HAV infection to vary from 78.1 to 146.2 infections per 100 000 (Table 2).

In general, the higher the probability of jaundice, the lower the incidence in young children and the higher the incidence in adults. Because adults make up a substantially larger proportion of the population, higher probabilities of jaundice resulted in higher overall incidence. Varying the probability of jaundice affected the estimated incidence of icteric hepatitis A to a greater extent than the incidence of HAV infection (Table 2).

Varying the rate of decline in force of infection (δ) by 0.003 increased the deviance of each model by 17.9 to 21.7 and caused the estimated incidence of infection to vary from 95.3 to 123.9 per 100 000. Increases or decreases of the rate of decline outside of these boundaries resulted in substantial worsening of the fit of the model.

The 95% CL of the overall incidence of infection, estimated by balanced repeat replicate simulation, ranged from 78.1 to 146.2 per 100 000 (195 000–364 000 infections), and estimates of the incidence of icteric hepatitis A ranged from 20.7 to 80.2 per 100 000 (51 000–200 000 icteric infections; Table 2).

The model was relatively insensitive to changes in the assumed force of infection in infants who were younger than 1 year. Estimated overall incidence of HAV infection varied from 109.5 to 97.2 per 100 000 as force of infection in infants who were younger than 1 year was varied from 0 to 1.5 times the force of infection in 1-year-olds.

## DISCUSSION

We modeled the incidence of HAV infection in the United States from 1980 to 1999 by making several adjustments to the reported incidence of hepatitis A such that the expected prevalence of anti-HAV closely matched that observed in 2 large prevalence surveys. The estimated total number of infections was 7.4 to 13.9 times the number of hepatitis A cases reported during this period. Because most of these infections would have occurred in children, only a minority (2.0–7.6 times the reported number of cases) would have been accompanied by jaundice and thus clinically recognizable as hepatitis.

The model also predicts that incidence has decreased logarithmically over time and that the increasing prevalence of anti-HAV with age in the US-born population is attributable primarily to cohort effects rather than to high rates of infection in older Americans. Similar cohort effects have been noted in Germany22 and Australia.23 The steady 4.2% to 4.8% annual decline in HAV infection incidence is consistent with the observed logarithmic decline in infectious disease mortality rates in the United States throughout the 20th century.24 In particular, deaths from diarrheal diseases, whose modes of transmission are comparable to those of HAV, declined by approximately 6% per year from 1900 to 1980 (Centers for Disease Control and Prevention, unpublished data).

Although our estimates of incidence early in the century should be considered less reliable than our estimates of recent incidence, they probably reflect the general trends in hepatitis A in that time period. The predicted increase in incidence of clinical hepatitis A during a time of declining incidence of HAV infection is not without precedent. Israel, for example, observed an increase in hepatitis A cases coincident with improvements in sanitary conditions.25 In this respect, hepatitis A and polio are similar: as socioeconomic conditions improved, the average age of infection increased, leading to more clinically apparent infections. In the case of polio, the greater number of infections occurring after maternal antibody loss contributed to an increase in paralytic disease in the 1940s, culminating in the epidemics of the early 1950s.26 With hepatitis A, the increase probably took place earlier and was more gradual.

Our model has several limitations. It does not take into account heterogeneity in the population. It is well-established, for example, that the incidence of hepatitis A is higher in certain regions of the United States, particularly in the West and Southwest, and in people of lower socioeconomic status.4 Incidence also fluctuates from year to year, with nationwide epidemics occurring approximately every 10 to 15 years.4 Our model averages incidence across all groups and smoothes out the effect of these epidemics. The model also assumes that the shape of the age-specific force of infection curve has been constant over time. Although this may not have been true, violations of this assumption would have had little effect on the estimated incidence as long as force of infection was always much higher in young children than in adults.

Previous studies in the United States have suggested that transmission from children with clinically inapparent HAV infection may be responsible for a substantial proportion of infections in adults.14–18,27–29 If incidence in children is considerably higher than in adults, as predicted by our model, then children would make up a large proportion of the reservoir of infection at any given point in time. These children may be particularly efficient transmitters of infection because they may excrete virus for longer periods of time than adults30,31 and are not as careful in their hygiene.32 This may explain why the reported incidence of hepatitis A increases from adolescence to young adulthood, when contact with young children presumably increases. Hepatitis A may be analogous to influenza in that children may harbor much of the reservoir of infection while adults incur most of the burden of disease. Although not demonstrated conclusively, there are many data to suggest that immunizing children against influenza may result in substantial reductions in influenza morbidity33 and mortality34 among adults.

In the United States, routine immunization against hepatitis A has been recommended since 1996 for children who live in certain communities.4,35 Because force of infection declines dramatically with age, mathematical modeling would predict that routine immunization should have particularly strong dynamic effects (“herd immunity”)8 and could substantially lower incidence in nonimmunized children and adults. The effect of herd immunity may be reflected in the unprecedented decline in hepatitis A rates observed during a 6-year-long demonstration project of routine childhood hepatitis A immunization in a county in California.36 Similarly, coincident with the implementation of routine hepatitis A vaccination in selected US states, the number of hepatitis A cases fell from 30 021 in 19972 to fewer than 12 000 in 2001,3 the lowest incidence since the Centers for Disease Control and Prevention began collecting these data in the 1950s. Because of the cyclical nature of hepatitis A incidence, ongoing surveillance is needed to confirm the impact of childhood hepatitis A immunization on overall incidence.

## APPENDIX: MATHEMATICAL DETAILS OF THE MODELING

### Probability of Jaundice Model

The probability of jaundice in a person of age *a* years, *P*_{j}(*a*), was modeled in 2 steps. First, the probability of jaundice in adults, *P*_{jmax}, was estimated on the basis of data collected from people who were 18 years and older. Exact 95% CL of this estimate were calculated by standard methods. Data from people who were aged 0 to 17 years were then used to model the age-specific probability of jaundice in children using a function adapted from Edmunds et al37: *P*_{j}(*a*) = *P*_{jmax} · (1 − exp[−*r* · *a*^{s}]). The values for *r* and *s* were determined by fitting the model to the data with maximum likelihood methods.

The upper CL of this model was estimated by setting *P*_{jmax} to its exact upper 95% CL, reestimating the values of *r* and *s* using data from people who were aged 0 to 17 years and taking the upper 95% CL of this model. Similarly, the lower CL was then estimated by setting *P*_{jmax} to its lower limit, repeating the regression, and taking the lower 95% limit.

### Force of Infection Model

Force of infection, λ (*a*,*y*), was assumed to vary by age, *a*, and year, *y*. Consequently, the prevalence of anti-HAV in US-born people, *P*_{v,u}(*A,Y*), was dependent on *A*, the age of the person at the time of the survey, and *Y*, the year in which the survey was performed. Prevalence and incidence can be related as follows10,38:
\[P_{v,u}(A,Y) = 1 - \exp\left[-\int_{0}^{A} \lambda(a, y)\, da\right]\]

The integral on the right side of this equation, the “cumulative force of infection,” can be written as a function of *Y*, *A*, and *a* alone, because *y* can be expressed as a function of these: *y* = *Y* − *A* + *a*.
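The relationship between cumulative force of infection and prevalence can be illustrated numerically. The sketch below is an assumption-laden illustration, not the study's implementation: the function name `predicted_prevalence`, the midpoint-rule integration, and the `steps_per_year` argument are all choices made here, and the force of infection is supplied by the caller as any function of age and calendar year.

```python
import math


def predicted_prevalence(age_at_survey, survey_year, force_of_infection,
                         steps_per_year=12):
    """Predict anti-HAV prevalence at age A in survey year Y.

    force_of_infection(a, y) returns the annual force of infection at age a
    in calendar year y. The integral over age is approximated with the
    midpoint rule, following the birth cohort through time via y = Y - A + a.
    """
    n_steps = max(1, int(round(age_at_survey * steps_per_year)))
    width = age_at_survey / n_steps
    cumulative = 0.0
    for i in range(n_steps):
        a = (i + 0.5) * width                 # midpoint age of this slice
        y = survey_year - age_at_survey + a   # calendar year at that age
        cumulative += force_of_infection(a, y) * width
    return 1.0 - math.exp(-cumulative)


# Hypothetical check: a constant force of infection of 1% per year.
def constant_foi(a, y):
    return 0.01


print(round(predicted_prevalence(30, 1990, constant_foi), 4))  # 0.2592
```

With a constant force of infection the integral reduces to λ · *A*, so the result can be checked by hand against 1 − exp(−0.3).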

We modeled λ(*a*,*y*) as a product of 2 functions, λ(*a*,*y*) = *F*(*a*) · *G*(*y*), where *F*(*a*) is a function describing the age-specific force of infection curve and *G*(*y*) is a multiplier that accounts for underreporting and declining incidence over time. The age-specific force of infection curve, *F*(*a*), was calculated as *F*(*a*) = *I*(*a*)/{*P*_{j}(*a*) · [1 − *P*_{v,t}(*a*)]}, in which *I*(*a*) is the average annual reported incidence of hepatitis A from 1980 to 1999, *P*_{j}(*a*) is the modeled probability of developing jaundice, and *P*_{v,t}(*a*) is the prevalence of anti-HAV in the total US population. The time-dependent multiplier, *G*(*y*), was modeled as an exponential function, *G*(*y*) = β · exp[−δ · (1990 − *y*)], in which β is the underreporting factor, δ is the rate of decline in HAV infection incidence over time, and 1990 is the midpoint of the reporting period for the incidence data used in the model. Because β and δ could not be estimated simultaneously, δ was determined empirically, by varying δ over a large range of values while estimating β by maximum likelihood regression. The values for β and δ that gave the best fit of the model were chosen. We also applied an alternative function for *G*(*y*), in which changes in incidence were modeled as a polynomial of the year (*G*[*y*] = β_{0} + β_{1} · *y* + β_{2} · *y*^{2} + β_{3} · *y*^{3} + …). This alternative model gave very similar results to the exponential model. Therefore, only results of the exponential model are shown.
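The exponential multiplier can be sketched as below, coded exactly as the formula for *G*(*y*) is written. The parameter values are hypothetical, and the sign given to δ in the example is an assumption made here so that the multiplier is larger in earlier years; the text does not state the sign convention explicitly.

```python
import math

MIDPOINT_YEAR = 1990  # midpoint of the 1980 to 1999 reporting period


def time_multiplier(year, beta, delta):
    """G(y) = beta * exp[-delta * (1990 - y)], as written in the text."""
    return beta * math.exp(-delta * (MIDPOINT_YEAR - year))


def force_of_infection(f_age, year, beta, delta):
    """lambda(a, y) = F(a) * G(y), with F(a) supplied as a number."""
    return f_age * time_multiplier(year, beta, delta)


# Hypothetical values for illustration only.
beta, delta = 4.0, -0.045   # delta < 0 here so that G(y) is larger in the past

print(time_multiplier(1990, beta, delta))         # 4.0
print(time_multiplier(1970, beta, delta) > beta)  # True

# Proportional change in G from one year to the next under these values.
annual_decline = 1.0 - time_multiplier(1991, beta, delta) / time_multiplier(1990, beta, delta)
print(round(annual_decline, 3))                   # 0.044
```

Under an exponential form the year-to-year proportional change is constant, which is what allows the decline to be summarized as a single annual percentage.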

Infections in children who were younger than 1 year were treated differently than infections that occurred at other ages for 3 reasons. First, the presence of maternal antibodies makes infection less likely in this group and also creates the potential for misdiagnosis. Second, the reported incidence of hepatitis A in children who were younger than 1 year was higher than one would expect given the trend in older children. Third, the estimate from the probability of jaundice model was particularly unstable in this group. Therefore, we set the force of infection in this group to 0.5 times the force of infection in 1-year-olds and examined the effect of this assumption in the sensitivity analysis.

### Sensitivity Analysis

We tested the sensitivity of the incidence model to 3 of its inputs: the prevalence of anti-HAV, the input from the probability of jaundice model, and the assumed rate of decline (δ) in force of infection. First, 3 sets of incidence estimates were made using 3 different curves from the probability of jaundice model: the best fit regression line and the lower and upper CL. We then tested the sensitivity of each of these 3 estimates to δ by increasing or decreasing the value of δ by 0.003. To account for uncertainties as a result of sampling in the prevalence data, we used Fay’s variation of the balanced repeat replicate method13 to generate 84 pseudoreplicates of the prevalence data. For each pseudoreplicate, we estimated incidence as described above, estimating the parameters β and δ independently. The variances of the incidence estimates were then calculated and adjusted to account for Fay’s *K*.13
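The replicate-based variance calculation can be sketched as follows. The exact formula used in the study is not given here, so the code assumes the standard Fay-adjusted estimator, in which the sum of squared deviations of the replicate estimates from the full-sample estimate is divided by *R* · (1 − *K*)². The function name, the replicate values, and the value of *K* are hypothetical.

```python
def fay_brr_variance(full_sample_estimate, replicate_estimates, fay_k):
    """Variance of an estimate from Fay-adjusted balanced repeated replicates.

    Assumes the standard estimator:
        var = sum((theta_r - theta)^2) / (R * (1 - K)^2)
    """
    if not 0 <= fay_k < 1:
        raise ValueError("K must be in [0, 1)")
    n_reps = len(replicate_estimates)
    if n_reps == 0:
        raise ValueError("at least one replicate estimate is required")
    squared_dev = sum((r - full_sample_estimate) ** 2
                      for r in replicate_estimates)
    return squared_dev / (n_reps * (1.0 - fay_k) ** 2)


# Hypothetical incidence estimates (per 100 000) from four replicates;
# the study itself used 84 pseudoreplicates.
reps = [108.0, 112.0, 109.0, 111.0]
variance = fay_brr_variance(110.0, reps, fay_k=0.5)
print(variance)  # 10.0
```

Dividing by (1 − *K*)² compensates for the fact that the perturbed replicate weights move each replicate estimate less far from the full-sample estimate than a half-sample replicate would.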

To test the sensitivity of the model to assumptions about incidence in children who were younger than 1 year, we varied the force of infection from 0 to 1.5 times the force of infection in 1-year-olds.

## REFERENCES

- Copyright © 2002 by the American Academy of Pediatrics