PEDIATRICS Vol. 112 No. 2 August 2003, pp. 415-416


COMMENTARY

Replicating Measurements

William A. Silverman, MD

Greenbrae, CA 94904-1947

Three men go hunting for bucks in the forest: an economist, a banker, and a statistician. The economist shoots his arrow first, and it lands 10 feet in front of the buck. The banker then fires his, and it lands 10 feet behind the buck. The statistician then leaps up and cries, joyfully, "We hit it!"1

Despite the most elaborate precautions taken to ensure accuracy, when an object or physical phenomenon in the real world is measured repeatedly, the results are rarely, if ever, completely identical. As Menand2 recently pointed out, when a team of astronomers set out to chart the position of a star, the individual observations almost always vary. "The same problem arises," he noted, "when a single astronomer makes multiple observations of the same star." Because lack of agreement in repeated measurements is ever-present, how do we decide which one in a set of replications is the "true value"? When there is no reason to suspect a systematic influence acting to produce the observed discrepancies, it is fair to assume that each of the individual replications is a fair estimate of the "true value."

Fifty years ago, W. J. Youdon, then a consultant to the US Bureau of Standards, examined the behavior of sets of 3 measurements.3 Because it is unlikely that the individual readings will be obtained simultaneously, he noted, they will probably be recorded in temporal order. Youdon then asked: "In general, over all sets of 3 measurements, how often is the third measurement intermediate between the first and the second?" He then provided a statistical formula to estimate how often this is expected to occur (when each observation in a set is truly independent of the others):

"If (n – 1) measurements are followed by an nth measurement, the chance that this measurement falls between the smallest and the largest of the (n – 1) measurements is (n – 2) ÷ n."

We should expect, therefore, that in only one third of sets of 3 measurements will the last measurement fall between the first 2. Two times out of 3, on average, the third measurement will be either smaller than both or larger than both members of the first pair. And, once in 5 times, the tenth measurement will be either smaller or larger than all of the previous 9. Not surprisingly, most people question the "reliability" of a tenth measurement that falls outside the range of 9 earlier values. However, Youdon cautioned, "Once in 5 times is hardly a rare event, and it would be unwise to suspect the tenth [seemingly aberrant] measurement on this ground."
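The quoted formula can be checked with a short simulation. The sketch below is illustrative only: it assumes independent, normally distributed measurement errors (the formula itself needs only independence and a common continuous distribution), and the function names, trial count, and seed are hypothetical choices, not taken from the cited article.

```python
import random


def expected_inside(n):
    """Chance quoted in the text, (n - 2) / n, that the nth measurement
    falls between the smallest and largest of the first n - 1."""
    return (n - 2) / n


def simulated_inside(n, trials=100_000, seed=0):
    """Monte Carlo estimate of the same chance, assuming each measurement
    is an independent draw from one normal distribution (an assumption)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        values = [rng.gauss(0.0, 1.0) for _ in range(n)]
        earlier, last = values[:-1], values[-1]
        if min(earlier) < last < max(earlier):
            hits += 1
    return hits / trials


# Compare the formula with the simulation for the two cases in the text.
for n in (3, 10):
    print(n, round(expected_inside(n), 3), round(simulated_inside(n), 3))
```

For n = 3 the formula gives one third, and for n = 10 it gives 8 in 10, which leaves the 1-in-5 chance of a tenth value outside the earlier range; the simulated frequencies should land close to both.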

The role of random variation must be kept in mind, even when duplicate measurements are made simply for reassurance that no gross error was made the first time. Seventy years ago, Pearson4 wrote that "in the normal course of events, 5% of paired measurements turn up with a difference between the measurements that is 2.45 times as large as the expected difference between members of a pair." There is no way, Pearson warned, "of merely by looking at widely separated results, to tell which one of the two measurements is really at fault, if a slip has been committed."
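The figure of 2.45 can be reproduced under two assumptions that the quotation does not spell out: that the measurement errors are independent and normally distributed, and that "expected difference" means the mean absolute difference within a pair. Under those assumptions the ratio of the 5% cutoff to the mean absolute difference is z × √(π/2), where z is the two-sided 5% normal quantile. The function names below are hypothetical, and this is a sketch rather than Pearson's own derivation.

```python
import math
import random
from statistics import NormalDist


def pearson_ratio(tail=0.05):
    """Ratio of the two-sided `tail` cutoff for |x1 - x2| to the mean of
    |x1 - x2|, assuming independent normal errors (an assumption)."""
    z = NormalDist().inv_cdf(1 - tail / 2)  # about 1.96 for tail = 0.05
    return z * math.sqrt(math.pi / 2)


def fraction_exceeding(multiple, pairs=200_000, seed=0):
    """Simulated share of pairs whose absolute difference exceeds
    `multiple` times the average absolute difference."""
    rng = random.Random(seed)
    diffs = [abs(rng.gauss(0.0, 1.0) - rng.gauss(0.0, 1.0)) for _ in range(pairs)]
    mean_diff = sum(diffs) / pairs
    return sum(d > multiple * mean_diff for d in diffs) / pairs


print(round(pearson_ratio(), 3))
print(round(fraction_exceeding(2.45), 3))
```

The computed ratio comes out just above 2.45, and roughly 5% of simulated pairs exceed 2.45 times the average difference, which is consistent with the quoted statement.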

When Youdon’s article appeared, Weiss5 wrote a letter to the editor supporting the points made about sets of measurements, and voiced additional cautions because, he noted, "there are 3 factors involved in every measurement; the object to be measured; the measuring instrument; and the person who makes the measurement." Only the complete coordination of all 3 influences will furnish completely satisfactory results. Try as we may, Weiss argued, it is impossible to eliminate our subjective expectation of the "proper behavior" of repeated measurements of the same object or phenomenon. "Psychological factors may play a greater part in the accuracy of our measurements than we are generally inclined to admit," he wrote. In the example of a set of 3 measurements and readings made in a time sequence, "we may assume," Weiss continued, "that the first reading will condition the observer toward the second one, and the second one toward the third [measurement]." And it is reasonable to assume that each subsequent replication will be made with greater accuracy than the preceding one, by way of exclusion of errors during performance of the preceding operations. In the real world, Weiss pointed out, it is virtually impossible to fulfill the requirement that each measurement in a set be truly independent.

Nonetheless, practical efforts to achieve independence (eg, by masking the prior measurement, or requiring separate observers masked to results of others) are worthwhile if we want to use estimates obtained by Youdon’s predictive formula. As Salsburg6 recently emphasized: "Whatever we measure [in the real world] is really part of a random scatter, whose probabilities are described by a mathematical function, the distribution function."


    FOOTNOTES
 
Received for publication Feb 20, 2003; Accepted Feb 20, 2003.

Address correspondence to William A. Silverman, MD, 501 Via Casitas, Apt 421, Greenbrae, CA 94904-1947. E-mail: fumer{at}aol.com


    REFERENCES

  1. Osborne L. Sample error. The New Republic. February 1, 1999
  2. Menand L. The Metaphysical Club. New York, NY: Farrar, Straus and Giroux; 2001
  3. Youdon WJ. Sets of three measurements. Scientific Monthly. 1953;76:143–147
  4. Pearson ES. Duplicate measurements. Biometrika. 1932;24:404–406
  5. Weiss FJ. Sets of three measurements [letter]. Scientific Monthly. 1954;78:56
  6. Salsburg D. The Lady Tasting Tea. How Statistics Revolutionized Science in the Twentieth Century. San Francisco, CA: WA Freeman; 2001

PEDIATRICS (ISSN 1098-4275). ©2003 by the American Academy of Pediatrics
