 Size and Significance
One of the big problems with significance testing is that as the size of the sample you are analyzing increases, the size of the difference or correlation the test will declare significant decreases. It is not difficult to choose a sample large enough to make a relatively trivial difference statistically significant.

It is good practice to define the size of the effect you want your statistical test to detect before you run it. Effect size is usually defined in standard deviations, but it is best understood in percentiles.

A percentile rank states the percentage of a set of scores that fall below a given score. The median score is at the 50th percentile, a score at the 60th percentile is higher than 60% of the scores, and so on. To show how percentile ranks can be used to evaluate effect size, we'll look at a simple evaluation model.
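The definition above can be sketched as a small function, assuming the scores are held in a plain list (the function name is illustrative):

```python
# A minimal sketch of the definition above: the percentile rank of a
# score is the percentage of scores in the set that fall below it.
def percentile_rank(scores, score):
    below = sum(1 for s in scores if s < score)
    return 100.0 * below / len(scores)

# In the scores 1..100, fifty scores (1 through 50) fall below 51:
print(percentile_rank(list(range(1, 101)), 51))  # -> 50.0
```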

Let's suppose that you are evaluating a program intended to improve ability in mathematics. You take two groups of people with similar scores on a mathematics test. You give one group the mathematics program, while the other group gets a program unrelated to mathematics. Then you give both groups another mathematics test. It turns out that the group that got the mathematics program improved significantly on the second test, while the other group didn't.

Let's suppose the increase in mathematics scores in the group that got the mathematics program was a quarter of a standard deviation. That is equivalent to the average person in the group moving from the 50th to the 59th percentile. A difference of a third of a standard deviation would take the average person to the 63rd percentile. On the other hand, with a large enough group you might find a significant difference of only a sixth of a standard deviation, which is equivalent to the average person moving only to the 57th percentile.
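The conversions above come from the cumulative distribution function of the standard normal, assuming scores are roughly normally distributed. Here is a sketch using only the Python standard library (`sd_to_percentile` is an illustrative name); its results agree with the figures above to within rounding:

```python
from math import erf, sqrt

def sd_to_percentile(d):
    """Percentile reached by the average person after moving up d
    standard deviations from the 50th percentile, assuming normally
    distributed scores. Uses the standard normal CDF via erf."""
    return 100.0 * 0.5 * (1.0 + erf(d / sqrt(2.0)))

# A sixth, a quarter, and a third of a standard deviation:
for d in (1/6, 1/4, 1/3):
    print(f"{d:.3f} SD -> {sd_to_percentile(d):.1f}th percentile")
```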

This type of comparison can also be adapted to comparisons between groups. For example, if you compared the test scores of two groups of students and found a significant difference of a sixth of a standard deviation, that difference is equivalent to a difference of 7 percentile ranks between the average scores in the two groups. A difference of a quarter of a standard deviation is equivalent to a difference of 10 percentile ranks, and so on.
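Under the same normal assumption, the between-group figures work the same way: the gap in percentile ranks between the average members of the two groups is the normal CDF at the effect size, minus 50. A sketch (`percentile_gap` is an illustrative name):

```python
from math import erf, sqrt

def percentile_gap(d):
    """Percentile-rank gap between the average members of two groups
    whose means differ by d standard deviations, assuming normally
    distributed scores."""
    return 100.0 * 0.5 * (1.0 + erf(d / sqrt(2.0))) - 50.0

print(round(percentile_gap(1/6)))  # -> 7
print(round(percentile_gap(1/4)))  # -> 10
```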

Considering the size of effects after a test can save you from attaching too much importance to small effects. Considering effect size when choosing your sample can keep your test from detecting minuscule effects at all.

Size and Significance © 2000, John FitzGerald