
Taking a Fit

Last week I said we'd look at some fine points of standardization this week. These points concern the shape of the distribution of data.

Standard scores are intended for what are called normal distributions. A normal distribution has several defining characteristics, but the key one for our purposes is that its mean and median are the same. That is, the number of scores greater than the mean is the same as the number below it. If this condition is not satisfied, standard scores may be misleading.
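As a rough illustration of this condition, the sketch below compares the mean and median of a set of scores and counts how many scores fall on each side of the mean. The scores and the function name count_around_mean are hypothetical, made up for the example.

```python
from statistics import mean, median

def count_around_mean(scores):
    """Return (number of scores above the mean, number below the mean)."""
    m = mean(scores)
    above = sum(1 for s in scores if s > m)
    below = sum(1 for s in scores if s < m)
    return above, below

# Hypothetical scores: one very high value pulls the mean above the median.
scores = [2, 3, 3, 4, 5, 6, 30]
print("mean:", round(mean(scores), 2), "median:", median(scores))
print("above, below:", count_around_mean(scores))
```

Scores exactly equal to the mean are counted on neither side here, which is one possible convention among several.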

Of course, in a sample you don't expect the numbers above and below the mean to be exactly the same. The trick, then, is to be able to decide whether or not any disparity between these two numbers is due to random sampling error. Inferential statistics provides a way to do this.

In fact it provides two. The first is the coefficient of skew (skew is simply the condition of the mean being different from the median), which comes in many varieties, but whose most useful varieties aren't all that easy to calculate. The second is a standard test for what is called goodness of fit. That is, this test tells you whether your sample of data fits a model based on relevant assumptions. The relevant assumption here is that half the cases are above the mean and half below.

Let's suppose you have 100 test scores. Fifty-nine are higher than the mean, and forty-one lower. Are the mean and the median really different, or is the difference just due to random sampling error? If the mean and median were identical, you would expect 50 scores to be above the mean and 50 below. To test goodness of fit, you subtract each of these expected values from the relevant observed value (59 or 41), square the difference (to remove negative signs), divide by the expected value, then add the two results together.
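The procedure just described can be sketched in a few lines of Python. The function name goodness_of_fit is an assumption made for the example; the counts are the ones from the text.

```python
def goodness_of_fit(observed, expected):
    """Sum of (observed - expected) squared, divided by expected, over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [59, 41]  # scores above and below the mean
expected = [50, 50]  # what we would expect if the mean and median were identical
stat = goodness_of_fit(observed, expected)
print(round(stat, 2))  # 3.24
```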

In the example the differences between the observed and expected values are 9 and -9, so the result is (9 × 9)/50 + (-9 × -9)/50 = 81/50 + 81/50 = 162/50 = 3.24. This value of 3.24 is an estimate of a statistic called chi-square, and if we consult a chi-square table (using one degree of freedom, since there are only two categories) we find the probability of obtaining a value of 3.24 or more by chance is between 7% and 8%. As you will recall from the article about statistical significance, this probability is usually considered too high for the difference to be statistically significant, so, unless the data have some unusual characteristics, we can assume the difference is due to random error and just standardize away.
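If no chi-square table is handy, the probability can be computed directly. The sketch below assumes one degree of freedom (two categories), in which case the chance of a chi-square value at least as large as x equals erfc(sqrt(x / 2)). The function name is hypothetical.

```python
import math

def chi_square_p_value_1df(chi_square):
    """Probability of a chi-square value this large or larger, for 1 degree of freedom only."""
    return math.erfc(math.sqrt(chi_square / 2))

p = chi_square_p_value_1df(3.24)
print(f"{p:.4f}")  # 0.0719
```

This shortcut works only for one degree of freedom; tests with more categories need a full chi-square table or a statistics library.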

How big does the value of chi-square have to be before you conclude the data are skewed? I prefer a value of 6.64 (which has a probability just under 1%), but many people will accept 3.84 (which has a probability of about 5%). I might accept 5% myself if I thought avoiding skew was really important in the specific circumstances. This issue is discussed further in the article about statistical significance.
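A minimal sketch of this decision rule follows, using the two cutoffs from the text. The constant and function names are assumptions made for the example, and the cutoffs apply only to one degree of freedom.

```python
import math

CUTOFF_5_PERCENT = 3.84  # the more lenient cutoff
CUTOFF_1_PERCENT = 6.64  # the stricter cutoff

def is_significant(chi_square, cutoff):
    """True if the chi-square value reaches the chosen cutoff."""
    return chi_square >= cutoff

def p_value_1df(chi_square):
    """Probability of a value this large or larger, for 1 degree of freedom only."""
    return math.erfc(math.sqrt(chi_square / 2))

# The 59/41 split from the example gives 3.24, which falls short of both cutoffs.
print(is_significant(3.24, CUTOFF_5_PERCENT))
print(is_significant(3.24, CUTOFF_1_PERCENT))

# Probabilities attached to the cutoffs themselves.
for cutoff in (CUTOFF_5_PERCENT, CUTOFF_1_PERCENT):
    print(cutoff, round(p_value_1df(cutoff), 4))
```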

Ideally the data being analyzed will also cluster around the mean the way they're supposed to in a normal distribution. That is, about 68% will be within one standard deviation of the mean, about 95% within two standard deviations, and so on. Failure of the distribution to fit this model is not fatal, though it can prevent you from making useful estimates of probability.
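One rough way to check this clustering is to count the share of scores within one and two standard deviations of the mean. The sketch below uses simulated scores (an assumption, standing in for real data), and the function name fraction_within is hypothetical. It uses the population standard deviation; the sample version would differ only slightly for a data set of this size.

```python
import random
from statistics import mean, pstdev

def fraction_within(scores, k):
    """Share of scores lying within k standard deviations of the mean."""
    m = mean(scores)
    sd = pstdev(scores)
    return sum(1 for s in scores if abs(s - m) <= k * sd) / len(scores)

# Simulated scores for illustration only: 1000 draws from a normal distribution.
rng = random.Random(1999)
scores = [rng.gauss(100, 15) for _ in range(1000)]

print("within 1 SD:", fraction_within(scores, 1))  # should be near 0.68
print("within 2 SD:", fraction_within(scores, 2))  # should be near 0.95
```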

Even if the data are skewed or non-normal, standard scores are often more useful than other methods for comparing data.

Taking a Fit © 1999, John FitzGerald
