Secrets of Weight
Elsewhere on this site I disparage the common practice of weighting data, so I should mention here that there are occasions when it's acceptable: the occasions on which you're using a stratified sample.

A stratified sample is one drawn separately from each of several subgroups of the population. The kind that calls for weighting is the kind in which some subgroups are deliberately over-represented, and the rest consequently under-represented. For example, if you're investigating differences between men's and women's opinions in a population comprising 1,000 men and 2,000 women, you might want the numbers of men and women in your sample to be equal. You therefore draw 100 men and 100 women, or 10% of the men but only 5% of the women. If there are differences between men's and women's opinions, then the unweighted results for the entire sample will not be an accurate reflection of opinion in the entire population.
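
To make the arithmetic concrete, here is a minimal sketch in Python. The population and sample sizes come from the example; the agreement rates (40% of men, 70% of women) are invented purely for illustration, and the variable names are my own.

```python
# Population and sample sizes from the example above.
population = {"men": 1000, "women": 2000}
sample = {"men": 100, "women": 100}

# Sampling fraction: the share of each subgroup that ends up in the sample.
sampling_fraction = {g: sample[g] / population[g] for g in population}

# Hypothetical share of each subgroup agreeing with some statement
# (invented numbers, not data from any real survey).
agree_rate = {"men": 0.40, "women": 0.70}

# Unweighted pooled result: treats the sample as if it mirrored the population.
unweighted = sum(sample[g] * agree_rate[g] for g in sample) / sum(sample.values())

# What the whole population would show under the same hypothetical rates.
true_rate = sum(population[g] * agree_rate[g] for g in population) / sum(population.values())

print(sampling_fraction)
print(unweighted, true_rate)
```

Under these made-up rates the pooled sample understates agreement, because women, who agree more often, are under-represented in it.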

Stratification is sometimes necessary to ensure comparability of subgroups. I recently stratified a sample from a population of people who live throughout Ontario. The sample was stratified because regional comparisons are being made. To ensure that the sample of people in less populous Northern Ontario was large enough for the statistical tests to detect a difference between them and people in other regions of the province, a much higher percentage of people from the north was sampled.

Another reason that samples are stratified is to improve accuracy. Accuracy of estimate can be improved by using the appropriate ratios of sampling percentages in subgroups; broadly, that means sampling more heavily from subgroups whose members vary more among themselves. Using the appropriate ratios is important, because not every stratified sample will improve accuracy.

Once the data are collected, you weight them to get descriptive results for the entire population. In the comparison of men's and women's attitudes, for example, you would give the women's data twice the weight of the men's in estimating the percentages of people in the entire population holding different opinions.
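
The weighting step can be sketched in a few lines of Python. This is an illustration under assumptions, not a general survey tool: the function name `weighted_proportion` and the counts of people agreeing are hypothetical, while the population and sample sizes are those of the men-and-women example.

```python
def weighted_proportion(population, sample_agree, sample_size):
    """Estimate a population proportion from a stratified sample.

    Each subgroup's sample proportion is weighted by that subgroup's
    share of the population.
    """
    total = sum(population.values())
    return sum(
        (population[g] / total) * (sample_agree[g] / sample_size[g])
        for g in population
    )

population = {"men": 1000, "women": 2000}
sample_size = {"men": 100, "women": 100}
sample_agree = {"men": 40, "women": 70}  # hypothetical counts of people agreeing

# Equivalent per-respondent weights: how many people each respondent stands for.
weights = {g: population[g] / sample_size[g] for g in population}

estimate = weighted_proportion(population, sample_agree, sample_size)
print(weights, estimate)
```

Each woman in the sample stands for 20 people and each man for 10, which is the "twice the weight" described above.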

One thing you cannot do, though, is apply statistical tests to weighted data. For example, you could not pretend in comparing men's and women's attitudes that there were actually twice as many women as men, convenient as the increase in the power of the statistical test might be. PC Database Fitness, part 2 explains in common sense terms why you can't do that. Statistical tests must be performed with unweighted data.
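
A small sketch shows what goes wrong. It computes the ordinary chi-square statistic for a 2 x 2 table twice: once on the counts actually observed, and once on counts inflated by pretending each woman was interviewed twice. The counts are hypothetical and the helper `chi_square_2x2` is my own, written out by hand so the arithmetic is visible.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Hypothetical observed counts: (agree, disagree) for each sex.
men = (40, 60)
women = (70, 30)

# The legitimate test uses the respondents you actually interviewed.
unweighted = chi_square_2x2(*men, *women)

# The illegitimate version doubles the women's counts, as if the weights
# were real interviews. No new information has been collected.
inflated = chi_square_2x2(*men, 2 * women[0], 2 * women[1])

print(unweighted, inflated)
```

The statistic grows even though nothing new has been learned, which is exactly the unearned increase in power described above.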

If you are examining the effects in the general population of a variable other than the stratification variable, you should assess the interaction of that variable with the stratification variable. For example, if you want to assess the relationship of income to opinion in the population from which the sample of men and women was drawn, you must be sure that the relationship is not different among men and among women.
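
As a first descriptive look at such an interaction, you can compare the relationship within each subgroup before pooling anything. The sketch below uses invented counts and a deliberately crude split of income into "low" and "high"; it is not a formal test of interaction, only a way of seeing whether the relationship points the same way among men and among women.

```python
# Hypothetical counts: (number agreeing, number of respondents)
# for each sex and income group. All figures are invented.
counts = {
    "men":   {"low": (15, 50), "high": (25, 50)},
    "women": {"low": (40, 50), "high": (30, 50)},
}

def agreement_gap(groups):
    """Difference in the proportion agreeing: high income minus low income."""
    high_agree, high_n = groups["high"]
    low_agree, low_n = groups["low"]
    return high_agree / high_n - low_agree / low_n

# One gap per sex; if these differ markedly, a single pooled
# income-opinion relationship would be misleading.
gaps = {sex: agreement_gap(groups) for sex, groups in counts.items()}
print(gaps)
```

In these made-up figures, agreement rises with income among men and falls with income among women, so a single weighted figure for the whole population would hide two opposite relationships.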

And of course you should never use weighting for descriptive purposes without assessing accuracy of estimate. To use an extreme example, if you had a sample of two men and 1,000 women, you wouldn't be able to put much trust in the descriptive results you got by multiplying the men's results by 500.
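
One rough way to assess accuracy is to compute the standard error of the weighted estimate. The sketch below uses the usual large-sample approximation for a stratified proportion and ignores the finite population correction. It assumes a population split evenly between men and women (which is what multiplying two men by 500 to match 1,000 women implies), and the sample proportions are hypothetical. With only two respondents in a subgroup the approximation is itself shaky, so treat the output as an indication, not a precise figure.

```python
import math

def stratified_se(pop_share, sample_prop, sample_size):
    """Approximate standard error of a weighted proportion.

    Sums, over subgroups, the squared population share times the
    sampling variance of that subgroup's proportion.
    """
    variance = sum(
        pop_share[g] ** 2 * sample_prop[g] * (1 - sample_prop[g]) / sample_size[g]
        for g in pop_share
    )
    return math.sqrt(variance)

pop_share = {"men": 0.5, "women": 0.5}    # assumed even split
sample_prop = {"men": 0.5, "women": 0.7}  # hypothetical proportions agreeing

# The extreme sample from the example: two men and 1,000 women.
se_extreme = stratified_se(pop_share, sample_prop, {"men": 2, "women": 1000})

# The same total of 1,002 respondents, split evenly instead.
se_balanced = stratified_se(pop_share, sample_prop, {"men": 501, "women": 501})

print(se_extreme, se_balanced)
```

Nearly all of the uncertainty in the extreme sample comes from the two men, whose answers carry half the weight of the final estimate.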

Secrets of Weight © 1999, John FitzGerald