DATA DREDGING AND THE PROBLEM OF MULTIPLE COMPARISONS
Whenever we accept a conclusion because the data support it, we must recognize
the chance that a highly unlikely random result is being misinterpreted as a
true effect. A coin tossed six times and coming up all heads might lead us to conclude
the coin was not fairly balanced, yet that will happen once in every 2^6
= 64 attempts even with an entirely fair coin.
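The 1-in-64 figure follows directly from the independence of the tosses; a minimal sketch of the arithmetic:

```python
# Probability that a fair coin lands heads on all six tosses.
# Each toss is independent with P(heads) = 1/2, so six heads in a
# row occurs with probability (1/2)**6 = 1/64.
p_all_heads = 0.5 ** 6
print(p_all_heads)  # 0.015625, i.e. 1 in 64
```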
When we pick an alpha value of .05, we are saying that about 5% of the time
we will accept a misleading random error as though it were true. Equivalently,
for every 20 experiments we do, perhaps 1 will be
wrongly accepted as true based solely on bad luck and random chance. If we analyze
a large set of data in many ways, some random results may appear statistically significant
just based on this sort of chance. This is the error of "data
dredging."
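This can be seen in simulation. The sketch below (a simple two-sample z-test, an assumption chosen for illustration) runs 200 comparisons in which the null hypothesis is true by construction, since both groups are drawn from the same distribution; roughly 5% of them still come out "significant":

```python
import math
import random

random.seed(0)  # fixed seed so the run is reproducible

def looks_significant(a, b):
    """Two-sample z-test on means; True if |z| > 1.96 (p < .05)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    va = sum((x - ma) ** 2 for x in a) / (n - 1)
    vb = sum((x - mb) ** 2 for x in b) / (n - 1)
    z = (ma - mb) / math.sqrt(va / n + vb / n)
    return abs(z) > 1.96

# 200 comparisons of pure noise: both groups come from the SAME
# standard normal distribution, so every "significant" hit is false.
hits = sum(
    looks_significant([random.gauss(0, 1) for _ in range(50)],
                      [random.gauss(0, 1) for _ in range(50)])
    for _ in range(200)
)
print(f"{hits} of 200 comparisons look 'significant' by chance alone")
```

With 200 comparisons at alpha = .05, around 10 spurious hits are expected, which is exactly the dredging hazard described above.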
In a similar manner, if a very large number of independent comparisons
are made between two groups of subjects and we accept an error rate of 5%, we might
get a false but statistically "significant" result about once for every 20 comparisons
made. This is the error of "multiple comparisons."
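The cumulative risk grows quickly with the number of comparisons. Assuming independent tests, the chance of at least one false positive across m comparisons is 1 - (1 - alpha)^m:

```python
# Family-wise chance of at least one false positive across m
# independent comparisons, each run at alpha = .05.
alpha = 0.05
for m in (1, 5, 20, 100):
    fwer = 1 - (1 - alpha) ** m
    print(f"m = {m:3d}: chance of >= 1 false positive = {fwer:.3f}")
```

At m = 20 the chance of at least one spurious "significant" result is already about 64%.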
Mathematical methods are available to correct for these errors; what is most
relevant for the consumer of the statistical analysis is to recognize that the
error is possible and to look for the correction.
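One widely used correction (the Bonferroni correction, named here as an example; the passage above does not specify which method) simply divides alpha by the number of comparisons:

```python
# Bonferroni correction: divide alpha by the number of comparisons
# so the family-wise error rate stays near the original alpha.
alpha, m = 0.05, 20
per_test_alpha = alpha / m                 # threshold for each test
fwer = 1 - (1 - per_test_alpha) ** m       # resulting family-wise rate
print(f"per-test alpha = {per_test_alpha}, family-wise rate = {fwer:.3f}")
```

With 20 comparisons, each individual test is held to p < .0025, which keeps the overall chance of a false positive just under the intended 5%.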