In Search of the False
Popular opinion looks to research for proof, but researchers instead are looking for disproof. As another article here points out, generalizations about the state of the world are difficult to prove directly, but they can be disproved much more easily.
If I believe, for example, that all Americans drink their coffee black, observing a couple of Americans drinking black coffee doesn't prove that I'm right about Americans. If I notice a single American ladling cream and sugar into his coffee, though, I know that my generalization was wrong.
Most contemporary research is guided by the criterion of falsifiability developed by Karl Popper. This criterion holds that a scientific theory or hypothesis must be capable of disproof – that is, it must specify the conditions we would observe if it were wrong.
Many beliefs, both popular and professional, fail to satisfy this criterion. For example, these days low self-esteem is popularly believed to result in almost any type of behaviour. One person's lack of interest in others may be explained as a result of low self-esteem, as may another person's desperate need for approval from others.
Such beliefs are of course useless in explaining or understanding what's going on, because a belief that fits every possible observation tells us nothing. They can even be harmful. Racism, for example, rests at bottom on an unfalsifiable theory. To a racist, whatever members of a racial outgroup do is a sign either of their inferiority or of collusion by powerful people to make members of this group look better than they actually are.
When you undertake research to test a theory, then, you should be able to describe the results which would show that you are wrong. This is accomplished by explicitly testing the hypothesis that you are wrong.
Such a hypothesis is referred to as a null hypothesis. For example, if we wanted to find out if men were more interested in sports than women, we would test the hypothesis that men and women had exactly the same interest. We might hypothesize, for example, that men and women would spend equal amounts of time attending sports events, watching them on television, listening to them on the radio, and reading about them in the press.
We would observe how much time a sample of men and a sample of women spent doing those things, and then use a statistical test to compare the average times for the two samples. The statistical test would tell us the probability of observing a difference at least as large as the one we found if our null hypothesis were true. If that probability was low, we would reject the null hypothesis and conclude that men and women differed.
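The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the weekly hours are invented, the 0.05 cut-off is a common convention rather than anything fixed by the argument, and a permutation test stands in for whichever statistical test a researcher would actually choose. The function name permutation_p_value is hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(sample_a, sample_b, n_permutations=10_000, seed=0):
    """Estimate a two-sided p-value for a difference in means.

    Under the null hypothesis the group labels are interchangeable, so we
    reshuffle the pooled observations and count how often a difference at
    least as large as the observed one turns up by chance.
    """
    rng = random.Random(seed)
    observed = abs(mean(sample_a) - mean(sample_b))
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical weekly hours spent on sports, invented for illustration only.
men_hours = [6.0, 4.5, 8.0, 5.5, 7.0, 3.5, 9.0, 6.5]
women_hours = [3.0, 4.0, 2.5, 5.0, 3.5, 1.5, 4.5, 2.0]

ALPHA = 0.05  # assumed significance level
p_value = permutation_p_value(men_hours, women_hours)
reject_null = p_value < ALPHA

print(f"p-value: {p_value:.4f}")
print("Reject the null hypothesis" if reject_null else "Fail to reject the null hypothesis")
```

Note what the code does not do: it never tries to show that men are more interested in sports. It asks only how surprising the observed difference would be if there were no difference at all.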
You probably noticed that this practical example is more complicated than the simple example of coffee drinking with which we began. That is because, like most hypotheses in social science, this hypothesis was about behaviour in the aggregate rather than about the behaviour of every individual. We were hypothesizing that the average man is more interested in sports than the average woman, not that every man is more interested in sports than every woman. A single counterexample cannot refute a claim about averages, which is why we need a statistical test.
Nevertheless, the falsification approach is still preferable for investigating behaviour in the aggregate. If we had wanted to prove that there were differences between men and women, we would have had to specify how big the difference was, for a start. That is a less efficient approach than hypothesizing that there will be no difference and then choosing sample sizes which permit the detection of differences of the size we consider to have important implications for practice.
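Choosing a sample size to detect a difference of a given size can also be sketched in code. The version below uses the standard normal-approximation formula for comparing two means with equal group sizes. The smallest difference worth detecting (1 hour a week), the standard deviation (2 hours), the significance level and the power are all assumptions made up for illustration, and sample_size_per_group is a hypothetical name.

```python
import math
from statistics import NormalDist


def sample_size_per_group(min_difference, std_dev, alpha=0.05, power=0.80):
    """Approximate number of people needed in each group.

    Normal-approximation formula for a two-sided comparison of two means:
        n = 2 * ((z_alpha + z_power) * std_dev / min_difference) ** 2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    n = 2 * ((z_alpha + z_power) * std_dev / min_difference) ** 2
    return math.ceil(n)


# Assumed values: we care about a gap of 1 hour a week and expect
# individual weekly hours to vary with a standard deviation of 2 hours.
n_needed = sample_size_per_group(min_difference=1.0, std_dev=2.0)
print(f"People needed in each group: {n_needed}")
```

Halving the difference we care about roughly quadruples the sample we need, which is why it pays to decide in advance what size of difference matters in practice.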