
Do You Find Statistics Boring?

A widespread belief is that anything associated with statistics is boring. Statistical analysis is considered boring, as is statistical information. The reason is simple.

Let's look at statistical analysis first. Most people find statistical analysis boring for the same reason they would find reading a novel in Bulgarian boring: they don't know what it means. Fortunately, it is considerably easier to acquire an equivalent understanding of statistical analysis than it is to acquire an understanding of Bulgarian.

Statistical analysis is simply a systematic form of decisionmaking. The system consists of two parts: a hypothesis and a test. Statistical analysis makes decisions by testing hypotheses to see if they can be disproved.

The hypotheses tested in statistical analysis are always of the same type. They are known as null hypotheses, because they propose either that there is no difference between two sets of observations or that there is no relationship between them.

This type of hypothesis is used because of the difficulty of proving generalizations about observations of the world. In mathematics and logic, proofs are possible, but in empirical studies we can only confirm. However, although we can't prove empirical hypotheses, we can disprove them, and the null hypothesis takes advantage of that fact.

To use a trivial example, let's suppose your hypothesis is that all cows are black. To test your hypothesis you go looking for cows and the first one you run across is black. Have you proven that all cows are black? Obviously not. You have proven only that one cow is black, which is a long way from proving your hypothesis. However, if the next cow you see is brown, you know with complete certainty that your hypothesis is disproved. So rather than trying to prove generalizations we try to disprove their opposites.

For example, if we're interested in finding out if there is a relationship between students' country of birth and their performance on tests of academic achievement, we use a hypothesis that proposes that there is none.

Having formulated that hypothesis we then collect information which can be used to test it. If we were investigating the relationship between country of birth and academic achievement we'd collect students' scores on tests of academic achievement as well as their countries of birth.

Once we have collected the data we can use a statistical test to evaluate the validity of our hypothesis. The specific type of test used is the test of statistical significance. The logic of statistical significance has been described in another article, so here I'll just note that a statistical test is a mathematical formula which tells you how likely you would be to obtain results like yours if your null hypothesis were true.

To recapitulate, if you were comparing the academic achievement of Canadian students who had been born in Canada with the achievement of Canadian students born in other countries, you'd start by hypothesizing that there was no difference between these two groups.

You'd then collect achievement test data and calculate for each group either an average score or the percentage of students attaining a standard (passing their year, for example). Then you compare the averages or the percentages in the two groups.

Even if there is no real difference between the two groups, you'd expect some difference just because of sampling error. You don't expect to get exactly five heads every time you toss a coin ten times, and you don't expect two groups of students to obtain exactly identical scores on academic tests. The test tells you how likely you would be to find a difference as large as the one you observed if the null hypothesis were true. If such a difference is unlikely, you conclude that the null hypothesis is false: that there is in fact a difference. The question of just how unlikely the difference should be is discussed in the article about statistical significance.
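
For readers who like to see the logic in action, here is a minimal sketch in Python. It is an illustration only, not the method of any particular statistical package. It uses a permutation test, one simple kind of significance test, chosen because its logic is easy to follow. The scores are made up, and the function names prob_exact_heads and permutation_test are hypothetical labels introduced for this example.

```python
import random
from math import comb
from statistics import mean


def prob_exact_heads(heads, tosses):
    """Probability of exactly `heads` heads in `tosses` tosses of a fair coin."""
    return comb(tosses, heads) / 2 ** tosses


def permutation_test(group_a, group_b, shuffles=10_000, seed=0):
    """Estimate how often chance alone produces a difference in averages
    at least as large as the observed one, assuming the null hypothesis
    (no real difference between the groups) is true."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(shuffles):
        # Under the null hypothesis the group labels are arbitrary,
        # so reassign them at random and recompute the difference.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_large += 1
    return at_least_as_large / shuffles


# Made-up achievement scores, for illustration only.
born_in_canada = [72, 65, 80, 58, 77, 69, 74, 63]
born_elsewhere = [70, 68, 75, 61, 79, 66, 73, 64]

print(prob_exact_heads(5, 10))
print(permutation_test(born_in_canada, born_elsewhere))
```

The first number printed is about 0.246: even a fair coin gives exactly five heads in ten tosses only about a quarter of the time. The second is the proportion of random reshuffles that produced a difference at least as large as the one observed. A small proportion would count as evidence against the null hypothesis; a large one means the observed difference is the kind of thing sampling error produces routinely.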

Well, that's statistical analysis in a nutshell. It often gets dressed up in very fancy conceptual language, and it often becomes very elaborate, but the logic underlying it is the logic described here.

As for the boredom caused by statistical information, that is largely due to an inability to appreciate or assess the relevance of the information, and that again is often due to a lack of understanding of statistical analysis. Yes, statistical information often is irrelevant, but statistical analysis also provides dependable methods for determining whether the information is relevant or not. There would be far less irrelevant statistical information about if these methods were widely used. In general, statistical analysis is an effective way of clarifying the options facing decisionmakers.

Of course, now that you have an understanding of statistical analysis, I will caution you against trying to do it without any special training. Don't buy some professional statistical software and start running analyses of variance and factor analyses. You may think that I caution you against doing your own statistical analysis because it would cut into my business, but in fact it wouldn't. Statistical analysis is not the most profound or complex field in the world, but then neither is walking the tightrope. If you try to walk a tightrope without training you fall off, and if you try to do statistical analysis without training you usually lose your way quickly, after which you call somebody like me. If you're unlucky you lose your way much later on, with time running out on your project, after which you call somebody like me.

If you want control over your own statistical analysis (a desire I heartily approve of) a better solution is to have somebody like me design a statistical system which staff can be trained in and which will help to make decisions which need to be made repeatedly. Such a system would include not only automated statistical testing (through software or through the design of spreadsheets or databases) but also definition of appropriate procedures for the collection and comparison of data. Staff training in statistical analysis is another way of achieving this goal.

Do You Find Statistics Boring? THERE IS HOPE! © 1999, John FitzGerald
