August 2017, VOLUME 140 / ISSUE 2

Variation in Preoperative Testing and Antireflux Surgery in Infants

  1. Heather L. Short, MD (a),
  2. Nikolay P. Braykov, MS (b),
  3. James E. Bost, MS, PhD (b), and
  4. Mehul V. Raval, MD, MS (a)

  (a) Division of Pediatric Surgery, Department of Surgery, School of Medicine, Emory University, Atlanta, Georgia; and
  (b) Department of Outcomes and Quality Measurement, Children’s Healthcare of Atlanta, Atlanta, Georgia

  Dr Short conducted the conception and design, interpretation of the data, and drafting of the manuscript; Mr Braykov performed the acquisition of data, analysis of data, interpretation of the data, and drafting of the manuscript; Dr Bost performed the interpretation of the data and critically revised the manuscript; Dr Raval conducted the conception and design, acquisition of data, interpretation of data, and critical revision of the manuscript; and all authors approved the final version of the manuscript and are accountable for the accuracy and integrity of the work.


BACKGROUND: Despite the availability of objective tests, gastroesophageal reflux disease (GERD) diagnosis and management in infants remains controversial and highly variable. Our purpose was to characterize national variation in diagnostic testing and surgical utilization for infants with GERD.

METHODS: Using the Pediatric Health Information System, we identified infants <1 year old diagnosed with GERD between January 2011 and March 2015. Outcomes included progression to antireflux surgery (ARS) and use of relevant diagnostic testing. By using adjusted generalized linear mixed models, we compared facility-level ARS utilization.

RESULTS: Of 5 299 943 infants, 149 190 had GERD (2.8%), and 4518 (3.0%) of those patients underwent ARS. Although annual rates of GERD and ARS decreased, there was a wide range of GERD diagnoses (1.8%–6.2%) and utilization of ARS (0.2%–11.2%). Facilities varied in the use of laparoscopic versus open ARS (mean: 66%, range: 23%–97%). Variation in facility-level ARS rates persisted after adjustment. Overall 3.8% of patients underwent diagnostic testing, whereas 22.8% of ARS patients underwent diagnostic testing. The proportion of surgeries done laparoscopically was independently associated with ARS utilization (odds ratio: 1.57; 95% confidence interval: 1.21–2.02). Facility-level utilization of diagnostics (P > .1) and prevalence of GERD (P > .1) were not associated with utilization of ARS.

CONCLUSIONS: There is notable variation in the overall utilization of ARS and in the surgical and diagnostic approach in infants with GERD. Fewer than 4% of infants with GERD undergo diagnostic testing. This variation in care merits development of consensus guidelines and further research.

  • Abbreviations:
    ARS: antireflux surgery
    CHA: Children’s Hospital Association
    CI: confidence interval
    CTC: clinical transaction code
    GERD: gastroesophageal reflux disease
    ICD-9-CM: International Classification of Diseases, Ninth Revision, Clinical Modification
    OR: odds ratio
    PHIS: Pediatric Health Information System
  • What’s Known on This Subject:

    Antireflux surgery (ARS) is performed frequently in infants on the basis of the presence of subjective symptoms without confirmatory diagnostic testing before surgery. There is no high-quality evidence demonstrating positive outcomes in these patients after ARS.

  • What This Study Adds:

    This study quantifies the frequency at which diagnostic tests are used before ARS and characterizes the variability in surgical utilization for the treatment of gastroesophageal reflux disease. These data can be used to focus future research related to ARS outcomes.

    Gastroesophageal reflux disease (GERD) is the pathologic passage of gastric contents into the esophagus that results in symptomatology, and it affects ∼7% of infants in the first year of life.1,2 The first-line treatment of GERD in infants is medical management, including mechanical maneuvers, such as thickening of formula feeds and upright positioning during feeding, and pharmacotherapy, including H2-blocking agents or proton pump inhibitors.3 Antireflux surgery (ARS) may be required when symptoms persist despite maximal medical treatment. Currently, there is no established diagnostic algorithm for confirming GERD in children and no consensus on indications for surgical intervention. Additionally, there are no high-quality studies demonstrating the effectiveness of ARS as a treatment of observed symptoms of GERD in infants.4

    Although recurrent aspiration pneumonia is the most common indication for ARS in infants, a review of the literature revealed significant variation in the symptoms and indications, with other common indications, including apnea and/or bradycardia; acute, life-threatening events; bronchopulmonary dysplasia; severe emesis; failure to thrive; and the presence of esophagitis.5 However, many of these symptoms are not specific to GERD and can vary with age and comorbid medical conditions.6 Despite their availability, diagnostic tests are inconsistently used to confirm the presence of GERD, and ARS is often performed on the basis of the presence of these clinical symptoms alone. Without objective measures to guide the decision to perform surgical intervention, variation in care exists. Goldin et al7 found that there is “tremendous variation” between 36 freestanding children’s hospitals in the frequency with which ARS is performed and cited the lack of clear indications for surgery as the cause. We hypothesize that this variation persists in a larger and updated cohort and that the use of diagnostic tests is associated with ARS utilization at the hospital level. The purpose of this study is to characterize national variation in the use of diagnostic testing and utilization of ARS in infants with GERD.

    Methods

    Study Design and Data Source

    We conducted a multicenter, retrospective cohort study of infants <1 year old with a primary or nonprimary GERD diagnosis who visited US children’s hospitals between January 1, 2011, and March 30, 2015. Data were sourced from the Pediatric Health Information System (PHIS), a database of demographic, clinical, and resource utilization data from 49 freestanding tertiary care pediatric hospitals affiliated with the Children’s Hospital Association (CHA, Lenexa, KS). CHA hospitals are located in noncompeting markets in 27 states and account for 15% of pediatric hospitalizations in the United States.8 Billing information is mapped to a common set of clinical transaction codes (CTCs), including imaging studies, clinical services, laboratory tests, pharmacy supplies, and room charges. Data quality and reliability are assured through joint efforts between the CHA and participating hospitals.9 Records are deidentified, but encrypted identifiers allow for a longitudinal analysis of multiple visits for each patient. Utilization data based on CTCs are available by date of service. The hospitals in this study provided outpatient utilization data throughout the study period. The study was deemed exempt from review by the institutional review board at Children’s Healthcare of Atlanta.

    Study Population and Analysis Variables

    GERD diagnoses were identified by using the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes 530.11 (reflux esophagitis) and 530.81 (gastroesophageal reflux). If multiple visits with these codes were present, we considered the earliest encounter as the index visit. Our primary outcome of interest was receipt of ARS within 12 months of the index admission date. Open and laparoscopic ARS were identified by ICD-9-CM procedure codes 44.66 (Other procedures for creation of esophagogastric sphincteric competence) and 44.67 (Laparoscopic procedures for creation of esophagogastric sphincteric competence), respectively. ICD-9-CM codes and CTCs from all available visits linked to cohort patients, including those before the index visit, were used to determine utilization of relevant clinical, endoscopic, or imaging diagnostic testing before the date of ARS (if ARS was performed) or at any point during the study period (if ARS was not performed). Relevant clinical tests included esophageal pH probes and manometry, relevant endoscopic procedures included esophagogastroduodenoscopy (EGD), and relevant imaging studies included esophageal and/or pharyngeal fluoroscopy, ultrasound, radiograph, planar imaging, tomography, and other gastroesophageal motility studies (Supplemental Table 3).
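The cohort rules described above (GERD codes, earliest qualifying encounter as the index visit, ARS within 12 months) can be illustrated with a minimal sketch. This is not the authors' code: the record layout (`date`, `dx`, `px` keys), the function names, and the use of 365 days for "12 months" are assumptions made for illustration only; the ICD-9-CM codes are those named in the text.

```python
from datetime import date, timedelta

GERD_CODES = {"530.11", "530.81"}                       # ICD-9-CM diagnosis codes from the text
ARS_CODES = {"44.66": "open", "44.67": "laparoscopic"}  # ICD-9-CM procedure codes from the text


def index_visit(visits):
    """Return the earliest visit carrying a GERD diagnosis code, or None."""
    gerd = [v for v in visits if GERD_CODES & set(v["dx"])]
    return min(gerd, key=lambda v: v["date"]) if gerd else None


def ars_within_year(visits, window_days=365):
    """Return 'open' or 'laparoscopic' if ARS falls within the window after the index visit.

    window_days=365 is an assumed reading of "within 12 months".
    """
    idx = index_visit(visits)
    if idx is None:
        return None
    cutoff = idx["date"] + timedelta(days=window_days)
    for v in sorted(visits, key=lambda v: v["date"]):
        if idx["date"] <= v["date"] <= cutoff:
            for code in v.get("px", []):
                if code in ARS_CODES:
                    return ARS_CODES[code]
    return None


# Hypothetical patient with three encounters (not real data)
patient = [
    {"date": date(2012, 3, 1), "dx": ["530.81"], "px": []},
    {"date": date(2012, 1, 10), "dx": ["530.81"], "px": []},
    {"date": date(2012, 9, 5), "dx": ["530.81"], "px": ["44.67"]},
]
```

In this hypothetical record, the January encounter would serve as the index visit and the September laparoscopic procedure would count toward the primary outcome.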

    Demographic and index visit characteristics included age at time of admission (days), sex, race, payer type, admission through the emergency department, admission to any ICU, transfer from another facility, and year of admission. To measure underlying disease, we used database variables based on the pediatric complex chronic conditions classification system and custom PHIS-specific definitions that flagged the following groups of comorbidities based on ICD-9-CM codes: mechanical ventilation; infection; medical complication; cardiovascular, hematologic, or immunologic; malignancy; metabolic; neuromuscular; congenital or genetic defect; renal; respiratory; very low birth weight (<1500 g) (Supplemental Table 4).

    To capture facility-level differences in the burden of GERD and local differences in GERD management and diagnosis, we abstracted the following hospital-level indicators: census region, proportion of ARS performed laparoscopically, proportion of GERD patients who had any diagnostic test performed, and proportion of patients who had ever had a GERD diagnosis.

    Statistical Analysis

    Univariate analyses with standard descriptive statistics were performed for all variables of interest. A bivariable analysis compared characteristics of individuals who received ARS (laparoscopic or open) with those who did not. We also tested for the association between facility-level rates of ARS and the use of any diagnostic tests, as well as the prevalence of GERD in the infant populations at each facility. χ² tests were used to compare categorical variables, and 2-sample t tests or Wilcoxon rank-sum tests were used to compare the continuous variables of the 2 groups. Correlation between 2 continuous variables was measured by using Pearson’s product-moment correlation, or Spearman’s rank correlation (ρ) when distributions were skewed.

    To model the patient-level probability of receiving any type of ARS, we fit a generalized linear mixed model with a logit link, fixed effects for patient characteristics, and a random intercept for each hospital. The random effect component reflects the degree to which a hospital’s ARS utilization departs from what would be expected once differences in population demographics and morbidity are controlled for and captures the effect of any unmeasured characteristics that influence the outcome. To assess the variability in ARS utilization across hospitals, we compared between-facility variation in the raw (unadjusted) ARS utilization rates with the variation in conditionally standardized (adjusted) rates predicted by our model. We expected that if the differences across hospitals were attributable to case mix, then the estimated variance of the random effect term in the adjusted model would be 0. To test the significance of the random effect components in our final (adjusted) model, we used a likelihood ratio test to compare model fit against a fixed-effects-only logistic regression model.

    Fixed effects in the final (adjusted) model were chosen by using a backfitting procedure that started with a full model, including all considered comorbidity and demographic variables, and then assessed if model fit was improved when a term was removed. Patient age, sex, race, year of admission, government payer status, and census region were forced into all models, as was the hospital-level proportion of laparoscopic ARS and hospital-level utilization of relevant diagnostic tests. All analyses were performed by using R, version 3.1.3 (see Supplemental Information).
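For readers who prefer notation, the random-intercept logistic model described above can be written as follows. The symbols are assumptions introduced here for illustration and do not appear in the original: $i$ indexes patients, $j$ indexes hospitals, $Y_{ij}$ indicates receipt of ARS, $\mathbf{x}_{ij}$ collects the fixed-effect covariates, and $u_j$ is the hospital random intercept.

```latex
\operatorname{logit}\,\Pr(Y_{ij} = 1 \mid u_j)
  = \beta_0 + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_j,
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2)

% Likelihood ratio comparison against the fixed-effects-only model (H_0: \sigma_u^2 = 0)
\Lambda = -2\left(\ell_{\text{fixed}} - \ell_{\text{mixed}}\right)
```

Under this formulation, variation across hospitals that is explained entirely by case mix would correspond to $\sigma_u^2 = 0$, which is the condition the likelihood ratio comparison is meant to examine.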

    Results

    Cohort Characteristics

    Of 5 299 943 infants <1 year old at 45 hospitals in the study period, 149 190 had a GERD diagnosis (2.81%), and 4518 (3.03%) of those patients underwent ARS. In particular, 3235 (2.17%) underwent laparoscopic ARS, and 1283 patients (0.86%) underwent open ARS. Surgical utilization rates decreased over the study period, dropping from 3.24% in 2011 to 2.37% in 2015. A relevant diagnostic test was conducted for 3.82% of all patients with a diagnosis of GERD (1.35% had imaging, 0.34% had an EGD, and 2.31% had a monitoring test). In contrast, 22.8% of patients who required ARS for treatment had at least 1 diagnostic test before surgery. Demographic and clinical characteristics of the cohort are summarized in Table 1. The median age of the cohort was 70 days (interquartile range: 32–144). An ICU admission was noted for 92.78% of patients. Common comorbidities and/or complications flagged at the time of index admission included infection (29.5%), mechanical ventilation (11.59%), cardiovascular disease (9.89%), and respiratory disease (8.52%).
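The cohort percentages reported above follow directly from the stated counts. A short sketch of the arithmetic is shown here as a consistency check; the helper name `pct` is introduced only for illustration.

```python
def pct(numerator, denominator, digits=2):
    """Percentage rounded to the given number of decimal places."""
    return round(100 * numerator / denominator, digits)


# Counts as reported in the text
infants = 5_299_943
gerd = 149_190
ars = 4_518
laparoscopic = 3_235
open_ars = 1_283

gerd_rate = pct(gerd, infants)       # share of infants with a GERD diagnosis
ars_rate = pct(ars, gerd)            # share of GERD patients who underwent ARS
lap_rate = pct(laparoscopic, gerd)   # laparoscopic ARS among GERD patients
open_rate = pct(open_ars, gerd)      # open ARS among GERD patients
```

The laparoscopic and open counts sum to the total number of ARS procedures, and each rate uses the GERD cohort (not all infants) as its denominator, except for the GERD rate itself.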

    TABLE 1

    Demographic and Clinical Characteristics of the Cohort at the Time of the Index GERD Visit

    Facility-Level Variation in ARS Utilization and Diagnostic Testing

    There was large variation in the use of ARS and associated diagnostics across the 45 hospitals (Fig 1). The percentage of infants who had any ARS within a year of GERD diagnosis was 3.03%, ranging from 0.19% to 11.7% across hospitals (a 61.03-fold difference). Hospitals also varied in their use of laparoscopic versus open ARS, with laparoscopy utilization ranging from 23.2% to 97.9%. There was also a large variation in the prevalence of diagnostic use (3.82% overall, 0.0%–13.8% range). With few exceptions, monitoring techniques (including pH probes and manometry) were the preferred method (Fig 1). There was no significant association between the facility-level ARS rate and the use of diagnostics (P = .92) or GERD prevalence (P = .43) (Fig 2). However, there was a positive correlation between ARS use and the proportion of surgeries performed laparoscopically (P = .003). Because GERD prevalence and the use of diagnostic tests showed skewed distributions, they were categorized into quintiles on the basis of their rank before inclusion in the statistical model.
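The rank-based quintile categorization mentioned above could be carried out along the following lines. This is a minimal sketch, not the authors' procedure: the function name, the tie-breaking rule (original order), and the example values are assumptions.

```python
def rank_quintiles(values):
    """Assign each value a quintile (1-5) from its rank; ties keep their original order."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])  # indices from smallest to largest
    quintile = [0] * n
    for rank, i in enumerate(order):                   # rank 0 is the smallest value
        quintile[i] = rank * 5 // n + 1
    return quintile


# Hypothetical facility-level diagnostic testing rates (%), skewed to the right
rates = [0.0, 13.8, 3.8, 1.0, 2.0, 5.0, 0.5, 7.0, 9.0, 4.0]
groups = rank_quintiles(rates)
```

Because the grouping depends only on rank, a single extreme facility cannot dominate the scale, which is one common reason for preferring quintiles over raw values when distributions are skewed.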

    FIGURE 1

    A, Variation in unadjusted utilization of ARS and B, diagnostic tests across 45 PHIS hospitals. Dashed lines show pooled rates when all hospitals are combined.

    FIGURE 2

    Relationship between facility-level ARS utilization, diagnostic test use, and GERD prevalence. The size of the points corresponds to the number of infants discharged from each site. There was no independent or joint association between measures, as tested by ordinary least squares (P > .1 in all models).

    Patient-Level Probability of Receiving ARS Across Hospitals

    The use of any diagnostic test was the covariate most strongly associated with receipt of ARS at the patient level (odds ratio [OR]: 6.24; 95% confidence interval [CI]: 5.66–6.88). The presence of a mechanical ventilator or neuromuscular, congenital, respiratory, or cardiovascular conditions was also positively associated with surgery, as were transfer from another facility, ICU admission, and government insurance (Table 2). Neither the census region nor the facility-level quintile related to use of diagnostic tests was associated with the outcome. However, higher facility-level utilization of laparoscopy was associated with a higher probability of receiving ARS (OR: 1.57; 95% CI: 1.21–2.02).
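Odds ratios from logistic models are typically estimated on the log scale, so a reported 95% CI is usually symmetric around the point estimate after taking logarithms. The sketch below checks the two reported ORs for that property and recovers an approximate standard error. It assumes Wald-type intervals with z = 1.96, which the text does not state; the function name is illustrative.

```python
import math


def log_scale_summary(ci_low, ci_high, z=1.96):
    """Return (geometric midpoint of the CI, implied SE of the log odds ratio).

    Assumes a Wald-type interval: exp(log_or +/- z * se).
    """
    log_low, log_high = math.log(ci_low), math.log(ci_high)
    midpoint_or = math.exp((log_low + log_high) / 2)
    se = (log_high - log_low) / (2 * z)
    return midpoint_or, se


# Reported intervals: patient-level diagnostic testing, facility-level laparoscopy
mid_diag, se_diag = log_scale_summary(5.66, 6.88)
mid_lap, se_lap = log_scale_summary(1.21, 2.02)
```

Under that assumption, both geometric midpoints land within rounding error of the reported point estimates (6.24 and 1.57), and the wider interval for laparoscopy reflects that it is a facility-level covariate informed by only 45 hospitals.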

    TABLE 2

    Covariates in Mixed Effects Logistic Regression Model for Receipt of ARS

    Although the variation in ARS rates declined after adjustment for patient and facility-level factors, large differences across hospitals persisted (Fig 3). For an average patient without complications, and with the use of a diagnostic flag, there was a 33.5-fold difference between the highest (6.05%; 95% prediction interval: 3.82–9.17) and lowest predicted facility-level probability of undergoing ARS (0.181%; 95% prediction interval: 0.104–0.299).
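The fold difference above is a ratio of predicted probabilities, whereas the model itself works on the log-odds scale. The sketch below recomputes the ratio from the rounded values in the text and expresses the same gap in log odds; with covariates held fixed, that gap corresponds to the difference between the two hospitals' random intercepts. Function and variable names are illustrative.

```python
import math


def logit(p):
    """Log odds of a probability."""
    return math.log(p / (1 - p))


def inv_logit(x):
    """Probability corresponding to a log-odds value."""
    return 1 / (1 + math.exp(-x))


# Rounded predicted probabilities reported in the text
highest = 0.0605
lowest = 0.00181

fold_difference = highest / lowest              # ratio of predicted probabilities
log_odds_gap = logit(highest) - logit(lowest)   # gap between hospital intercepts
odds_ratio_between = math.exp(log_odds_gap)     # same gap expressed as an odds ratio
```

The ratio computed from these rounded inputs is slightly below the reported 33.5, which is the size of discrepancy one would expect if the published figure was calculated from unrounded predictions.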

    FIGURE 3

    Variation in adjusted and unadjusted ARS rates. Hollow and filled symbols show, respectively, actual and predicted probabilities of ARS from a mixed effects logistic regression model with random intercept for hospitals. Adjusted predictions are based on population-averaged values in which only the random effect (the hospital) varies, and fixed effects are constant.

    Discussion

    There are significant variations surrounding the diagnostic workup, surgical indications, operative approaches, and outcome measurements in infants with GERD. Our results revealed that <4% of patients with a diagnosis of GERD were subject to a confirmatory test and that only 22.8% of patients who underwent ARS received diagnostic testing before surgery. Among all GERD patients, pH monitoring was the most commonly employed test; however, there was significant variation in the frequency and types of diagnostic tests used by hospitals. Surgical utilization rates varied significantly across hospitals and were not associated with local GERD prevalence rates. Although we hypothesized that the use of diagnostic tests at the hospital level would be associated with higher surgical utilization rates, we found no such association. Rather, there appeared to be an association between laparoscopy utilization and the surgical treatment of GERD.

    We chose infants <1 year old as our cohort because of the unique diagnostic challenge this population poses to surgeons. First and foremost, pathologic GERD must be distinguished from normal regurgitation, which is present in ∼50% of infants at 6 months of age. This “normal reflux” commonly resolves spontaneously in the first year of life, never gives rise to GERD, and never requires treatment, medical or surgical.11 Diagnosis is further complicated by the nonverbal status of this patient population. Although diagnosis of GERD in adults relies heavily on symptomatology and adequate response to treatments, infants are unable to communicate these findings effectively. Finally, there are age-associated differences in the diagnostic workup and treatment of GERD, with older children (age >6 months) being less likely to undergo ARS in 1 study.12

    The decision to intervene surgically for a medical condition is often based on a definitive physical examination finding (eg, the presence of an inguinal hernia) or a positive diagnostic test result (eg, evidence of appendicitis on ultrasound). The same is not true for the treatment of GERD in children, in whom the indication for surgery is most commonly the failure of medical management.13 The criteria for failure of medical management rely solely on subjective symptoms such as pain; aspiration; acute, life-threatening events; apnea and/or bradycardia; and failure to thrive, none of which consistently result in a positive objective test result. In fact, the authors of 1 study demonstrated that only 52% of children with reported subjective reflux symptoms had a positive pH test result, which is considered the diagnostic gold standard for GERD.14 This is in contrast to the surgical treatment of GERD in adults, for whom the objective documentation of reflux preoperatively is mandatory.15,16 This apparent lack of a standard preoperative diagnostic algorithm implies variation in the indications for surgery and raises the question of whether surgery is being performed on the appropriate patients for the appropriate reasons, resulting in maximum benefits with minimal associated risk.

    Available diagnostic tests are multimodal and include esophageal pH monitoring, intraluminal impedance studies, manometry, endoscopy, and upper gastrointestinal series. The diagnosis of GERD is made when 1 or more of these tests reveal excessive frequency or duration of reflux events, esophagitis, or a clear association of symptoms with reflux events in the absence of alternative diagnoses.3 The authors of several studies have demonstrated the effectiveness of fundoplication in reducing esophageal pH as reported by extended (2- to 24-hour) pH probe monitoring when test results are compared preoperatively and postoperatively, although these studies are limited by incomplete follow-up and patient heterogeneity.17–19 In a systematic review, Jancelewicz et al4 determined that pH probe monitoring is the best-studied diagnostic, but the evidence is weak and the test only received a Grade C recommendation. The evidence supporting the use of upper gastrointestinal series, EGD, and impedance studies was insufficient and no recommendation could be made for or against the use of these tests.4 Further research in the form of level 1 trials evaluating the effectiveness of ARS in converting objective tests from a positive to a negative result, indicating the resolution of GERD after surgical intervention, is warranted. Monitoring with a pH probe may be the ideal target for these efforts, given the results of previous studies.

    Most of the existing literature on outcomes in the surgical treatment of GERD consists of retrospective case series in which the documentation of diagnostic testing and the details of previous medical management are lacking.20–22 Moreover, many of the studies that describe positive outcomes after ARS use subjective measures of outcome to assess their efficacy rather than objective tests to confirm the resolution of symptoms.21,23–25 Although the authors of these previous studies have shown positive outcomes after ARS, the authors of a recent retrospective review failed to demonstrate a decrease in the number of patients requiring hospitalization for pneumonia, respiratory distress and/or apnea, and failure to thrive after ARS was performed to treat those symptoms. More concerning is the finding that 20 of the 24 patients who required admission for aspiration pneumonia after ARS had no history of aspiration pneumonia before surgery. Similar findings were documented for other pneumonia, respiratory distress and/or apnea, and failure to thrive.26 These contrasting results make it difficult to come to a meaningful conclusion regarding postoperative outcomes in children undergoing ARS. The deficiency in high-quality evidence for or against the use of ARS in the treatment of GERD may account for some of the variation in care seen in this population.

    Variation in care is a driver for systemic inefficiency, and reducing variation may decrease complexity of care, encourage adherence to best practices, generate cost savings, and lead to improved outcomes for some procedures.27 To reduce variability in the use of ARS in children, there are several areas for potential improvement and opportunities for further research. First, the indications for surgery need to be clarified. A standardized diagnostic approach and trial period for medical management before surgical referral need to be established and adopted by gastroenterologists and surgeons. Additionally, high-quality evidence for the use of ARS over medical management, in the form of a randomized trial, needs to be generated, especially in infants for whom GERD often resolves spontaneously over time. Finally, the development of a decision analysis tool for the use of ARS in children would allow physicians to use the best available evidence with probabilities of outcomes to select the best treatment option.27

    There are several limitations to this study. First, the use of an administrative database limited our analysis to only those variables included in the database. Although we were able to determine if a patient had a diagnostic test, we did not have access to the results of those tests. Because of this, we could not determine if a positive result always resulted in surgical treatment. Additionally, although most infants would likely receive surgery at the same institution for the entire continuum of GERD care and ARS, it is possible that diagnostic tests or surgery were performed at other institutions and not captured in the database. Finally, the hospitals included in the PHIS are large, tertiary referral centers, and the data collected from these institutions may not be generalizable to all hospitals, especially smaller community hospitals.

    Conclusions

    In a cohort of nationally representative pediatric tertiary care centers, there is notable variation in the overall utilization of ARS and in the surgical and diagnostic approach to infants with GERD. Fewer than 4% of infants underwent diagnostic testing to confirm pathologic GERD, and only 22.8% of patients who underwent ARS had a diagnostic test before surgery, with significant variation among hospitals. We acknowledge that it is possible that any diagnostic testing performed at an outside facility may not have been captured in this inquiry. However, we anticipate that this situation would have been rare in this patient population because most diagnostic testing in infants occurs at a designated children’s hospital and, therefore, would be captured in the PHIS. This variation in care merits the development of consensus guidelines on the indications for surgical treatment of GERD in infants and further research determining the outcomes after ARS.


    • Accepted May 23, 2017.
    • Address correspondence to Mehul V. Raval, MD, MS, Surgery and Pediatrics, Division of Pediatric Surgery, Department of Surgery, Emory University School of Medicine, Children’s Healthcare of Atlanta, 1405 Clifton Rd NE, Atlanta, GA 30322. E-mail: mehulvraval{at}
    • FINANCIAL DISCLOSURE: Dr. Raval receives intramural funding and support from the Department of Surgery at Emory University School of Medicine and the Emory + Children’s Pediatric Research Center which includes Children’s Healthcare of Atlanta.

    • FUNDING: No external funding.

    • POTENTIAL CONFLICT OF INTEREST: The authors have indicated they have no potential conflicts of interest to disclose.