KEY POINTS
- Always plot your data. Significant trends should be visible to the eye.
- Know your statistics program well enough to be sure that it is calculating
what you want.
- Interval data should not be treated like categorical data—the mathematics
is different.
- Many statistical methods assume that the data are distributed "normally,"
that is, in a symmetric bell-shaped curve; these methods can be misleading if the
data are not normally distributed.
- SD describes the spread of the data; SEM describes how precisely the mean
has been estimated and is the quantity used when comparing data sets.
- Multivariate regression, which relates the outcome variable to more than
one other factor, requires more data but can pick up correlations that
are missed when only univariate regression is used.
- When using multivariate regression, if two variables correlate closely
with each other, the statistical package may fail to report one of them as
correlating with the outcome.
- In hypothesis testing, a negative result may indicate no real difference
or may just mean that the study was underpowered to pick up a true but small difference.
- A P value is the probability of obtaining a result at least as extreme as
the one observed, assuming there is no true difference (the null hypothesis).
It is not the same as the probability that the difference is real.
- A bayesian approach to diagnostic testing recognizes that the value of a
test depends on the patient population: if almost everyone in the population
truly has the condition, false-negative results can outnumber true-negative
ones, so a negative result is less useful. The same happens in reverse if almost
no one has the condition, in which case false positives can outnumber true
positives and cause the confusion.
- Selection biases make many real-life clinical studies difficult to interpret.
Randomized clinical trials are the best way to minimize this problem.
- Beware of the error of "data dredging." Applying too many tests to insufficient
data will probably find something that, misleadingly, seems significant.
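
The point about diagnostic testing and the patient population can be checked with simple arithmetic. The sketch below is illustrative only: the function name `predictive_values`, the 90% sensitivity and specificity, and the cohort of 10,000 patients are assumptions chosen for the example, not figures from any study.

```python
def predictive_values(prevalence, sensitivity, specificity, n=10_000):
    """Expected test outcomes when n patients are tested.

    All inputs are hypothetical; prevalence, sensitivity, and
    specificity are proportions between 0 and 1.
    """
    diseased = prevalence * n
    healthy = n - diseased
    tp = sensitivity * diseased   # true positives
    fn = diseased - tp            # false negatives
    tn = specificity * healthy    # true negatives
    fp = healthy - tn             # false positives
    return {
        "tp": tp, "fp": fp, "tn": tn, "fn": fn,
        "ppv": tp / (tp + fp),    # positive predictive value
        "npv": tn / (tn + fn),    # negative predictive value
    }


# Same test, three different populations (assumed prevalences).
for prevalence in (0.01, 0.50, 0.99):
    r = predictive_values(prevalence, sensitivity=0.90, specificity=0.90)
    print(f"prevalence {prevalence:.2f}: "
          f"TP={r['tp']:.0f} FP={r['fp']:.0f} "
          f"TN={r['tn']:.0f} FN={r['fn']:.0f} "
          f"PPV={r['ppv']:.3f} NPV={r['npv']:.3f}")
```

With these assumed numbers, at 1% prevalence the 990 false positives swamp the 90 true positives, and at 99% prevalence the 990 false negatives swamp the 90 true negatives, even though the test itself has not changed.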
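
The warning about data dredging can also be put in numbers. As a minimal sketch, assume every test is independent, every null hypothesis is true, and each test uses the conventional 0.05 threshold; the chance that at least one test looks "significant" is then 1 - (1 - 0.05)^k for k tests. The function name `familywise_error_rate` is made up for this illustration, and real tests on one data set are rarely fully independent, so treat the figures as a rough guide.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Probability of at least one false-positive result among n_tests
    independent tests when no true difference exists (an assumption)."""
    return 1 - (1 - alpha) ** n_tests


for k in (1, 5, 20, 100):
    print(f"{k:>3} tests: chance of a spurious 'significant' result = "
          f"{familywise_error_rate(k):.2f}")
```

Under these assumptions, 20 tests give roughly a 64% chance of at least one spurious finding, which is why a single "significant" result from many unplanned comparisons deserves suspicion.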