Let's suppose you want to compare the effectiveness of a training program at two sites. You test trainees at the two sites before they start training, give them the training, then test them again. At Site A trainees' scores do not increase from pre-test to post-test, but at Site B they increase by ten percentage points. Do you conclude that Site B has better trainers?
No, you don't. While people often try to compare pre- and post-test performance in this way, difference scores of this type can easily mislead, because they ignore relative pre-test ability. The trainees at Site A could simply have been more competent to begin with, so their trainers had less opportunity to improve their scores. It is quite possible that the trainees at Site A still ended up more competent than the trainees at Site B.
For a fair comparison the two groups should start with the same competence. If random assignment to the two sites is not possible, trainees should at least be matched for ability – that is, they should be chosen in pairs, one at each site, of equal ability. A statistical technique that analyzes the pre- and post-test scores themselves rather than the difference between them, such as repeated-measures analysis of variance, will also protect you against problems like this.
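The pitfall, and one score-based analysis in the spirit described above, can be sketched with hypothetical numbers (the scores, group sizes, and the ANCOVA-style regression here are illustrative assumptions, not data from the example):

```python
import numpy as np

# Hypothetical scores: Site A trainees start (and finish) more
# competent, gaining nothing; Site B trainees start lower and gain 10.
pre_a, post_a = np.array([80, 85, 90.0]), np.array([80, 85, 90.0])
pre_b, post_b = np.array([50, 55, 60.0]), np.array([60, 65, 70.0])

# Difference scores suggest Site B's trainers are better...
gain_a = (post_a - pre_a).mean()   # 0.0
gain_b = (post_b - pre_b).mean()   # 10.0

# ...yet Site A's trainees remain more competent after training.
print("gains:", gain_a, gain_b)
print("final means:", post_a.mean(), post_b.mean())

# A model that analyzes the scores themselves treats pre-test ability
# as a covariate (a covariate-adjusted regression, fitted here by
# ordinary least squares):  post = b0 + b1*pre + b2*site_b
pre = np.concatenate([pre_a, pre_b])
post = np.concatenate([post_a, post_b])
site_b = np.array([0, 0, 0, 1, 1, 1])             # indicator: 1 = Site B
X = np.column_stack([np.ones(6), pre, site_b])
b0, b1, b2 = np.linalg.lstsq(X, post, rcond=None)[0]
print("site effect at equal pre-test ability:", round(b2, 1))
```

With these made-up numbers the difference scores (0 vs. 10) and the final means (85 vs. 65) tell opposite stories; the adjusted model separates pre-test ability from the training effect instead of confounding the two. Note that such adjustment still works best when the groups' pre-test scores overlap, which is exactly what matching trainees in pairs of equal ability provides.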