Most people think that the more data they have, the better. The statistical analyst thinks otherwise: since most data is not really informative, the analyst wants to use only the data that is. To use an extreme example, if you were deciding what car to buy, you wouldn't want information about the average winter temperature in Winnipeg. Often, though, data that seems informative on its face turns out, on analysis, not to be.
For an example of this I'm using some achievement test scores which have already been mentioned in the article about correlation. These scores were collected in a study I worked on a long time ago, but they have one advantage over the more recent data in my possession: they are in the public domain.
Five hundred and forty-eight children in grade 5 completed four tests, of vocabulary, reading comprehension, mathematical concepts, and mathematical problem-solving. The following table shows the correlation coefficients for the relationship between each pair of scores.
             Reading   Concepts   Problems
  Vocab        .77        .63        .52
  Reading                 .66        .56
  Concepts                           .70

The coefficients in the table are all statistically significant. As you may recall from the article on correlation, you estimate the strength of the relationship between two scores by squaring the correlation coefficient. The result tells you how much you improve your accuracy in predicting one score from the other. For example, if you know a pupil's vocabulary score, you can predict his or her reading score with 59% more accuracy than you could if you didn't know the vocabulary score.
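The arithmetic behind that 59% figure can be sketched in a couple of lines (the variable names are mine; the coefficient is the vocabulary–reading entry from the table):

```python
# Squaring a correlation coefficient gives the proportion of variance
# in one score that is accounted for by the other.
r_vocab_reading = 0.77  # vocabulary-reading correlation from the table

r_squared = r_vocab_reading ** 2
print(round(r_squared, 2))  # 0.59, i.e. about 59%
```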
So what we can conclude from the table is that to a large extent these four tests seem to be measuring the same thing. A statistical technique called principal components analysis confirms this. What we can do, then, is construct a measure of this common factor. The easiest way to do this is to standardize each score (to put them all on the same scale) and add them up – that produces a more accurate score as random error in the individual scores cancels out.
But can we retrieve the unique information that each test is supposed to be measuring? In theory we can, if it exists, using multiple linear regression analysis. In practice I couldn't find any: the supposedly unique scores produced by a couple of different approaches were not independent of each other.
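The regression idea can be sketched like this: regress one score on another, and treat the residual as the candidate "unique" part. The data below is simulated (a shared factor plus noise, loosely mimicking the real scores); none of it comes from the study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: a common factor plus independent noise.
factor  = rng.normal(size=200)
vocab   = factor + 0.5 * rng.normal(size=200)
reading = factor + 0.5 * rng.normal(size=200)

# Regress reading on vocabulary; the residual is the part of reading
# that vocabulary cannot predict -- the candidate unique information.
X = np.column_stack([np.ones_like(vocab), vocab])
coef, *_ = np.linalg.lstsq(X, reading, rcond=None)
residual = reading - X @ coef
```

By construction the residual is uncorrelated with the predictor; the real question is whether it correlates with anything else of interest. With the actual test scores, it didn't.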
So the economical way to use these scores is to scale them as I did before and treat them as a measure of a single factor. Of course, before you drew conclusions from that single score you'd want some idea of what that factor is. For that purpose you'd want to correlate the scores with some other measures of linguistic and mathematical ability – marks, for example, or scores on other standardized tests. If you don't know what the score measures, you obviously will not want to base decisions on it.

Well, we still haven't got any useful information out of these data, and the whole process may seem unproductive. In fact, accepting the data as independent measures of independent skills is what would have been unproductive, and even harmful.
Incidentally, without intending to, I ended up demonstrating one of the major arguments against standardized testing, so I'd better address that. One counterargument which could be made is that scores on standardized tests of academic achievement are in fact correlated with marks. Using your single score will therefore tell you something about pupils. A variant of this argument is that ideally these scores should be used as checks on teachers' marks, and in fact that is how they are often used. If a pupil is doing poorly in class, for example, but scores well on standardized tests, his or her teacher will want to reassess the validity of the marks the pupil has received for class work.