Principles of Misinformation
As every workplace and everyone in it gets drawn into the electronic web of information which is being woven so tightly around us, we need to remember that much of what appears to be information may not be information at all.
Decisions are increasingly based on data. Organizations everywhere are busily compiling databases to consult in decision making. The problem is that the data may not be good enough for this purpose.
First of all, the data may simply be incorrect. Mistakes may be made on data entry, for example. Data entry without use of a database's verification features, or of any other type of verification procedure, is not uncommon.
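One common verification procedure is double entry: two people key in the same source forms independently, and any disagreement is sent back to be checked against the original. The sketch below is only an illustration of that idea. The function name, the record layout and the sample records are all invented for the example and are not taken from any particular database product.

```python
def double_entry_mismatches(first_pass, second_pass):
    """Compare two independent keyings of the same forms.

    Each argument maps a record id to a dict of field values.
    Returns a list of (record_id, field, first_value, second_value)
    for every field on which the two passes disagree.
    """
    mismatches = []
    for record_id in sorted(first_pass.keys() | second_pass.keys()):
        a = first_pass.get(record_id, {})
        b = second_pass.get(record_id, {})
        for field in sorted(a.keys() | b.keys()):
            if a.get(field) != b.get(field):
                mismatches.append((record_id, field, a.get(field), b.get(field)))
    return mismatches


# Hypothetical records, made up for illustration only.
first_pass = {
    101: {"name": "Tremblay", "postal_code": "H2X 1Y4"},
    102: {"name": "Singh", "postal_code": "M5V 2T6"},
}
second_pass = {
    101: {"name": "Tremblay", "postal_code": "H2X 1Y4"},
    102: {"name": "Singh", "postal_code": "M5V 2T8"},
}

mismatches = double_entry_mismatches(first_pass, second_pass)
for record_id, field, first_value, second_value in mismatches:
    print(f"Record {record_id}, {field}: {first_value!r} vs {second_value!r}")
```

A mismatch does not say which keying is right, only that someone should look at the original form again.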
Secondly, information may be entered exactly as it was reported and still be plain wrong. A common reason for this is misreporting.
For example, when Statistics Canada first included in the census a question asking people if they owned automatic dishwashers, it found that people in Quebec were more likely than the residents of other provinces to report that they owned them.
Statistics Canada did not, however, accept this finding at face value. They compared the English and French versions of the census form, and found that the question on the French form seemed to be easier to misinterpret than the English question. Where the English version listed automatic dishwasher on a single line, the French version listed machine à laver la vaisselle on two lines. Since the first line read machine à laver, Statistics Canada concluded that some people may have read only the first line and reported their washing machines in the space for automatic dishwashers. When they changed the form to avoid this problem, dishwasher parity was restored.
Data collected by questionnaire are often misreported because the questionnaire is ambiguous. This ambiguity is usually due to technical flaws in wording. For example, if a questionnaire asks how long people have worked for a company, and two of the response alternatives are 1-3 years and 3-6 years, some people who have worked three years for the company will check the first alternative and some the second.
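The overlap in the example above is the kind of flaw that can be caught mechanically before a questionnaire goes out. Here is a minimal sketch, assuming each response alternative is written as a pair of inclusive bounds in whole years. The function name and the reworded set of alternatives are assumptions made for illustration.

```python
def find_overlaps(alternatives):
    """Return pairs of adjacent response alternatives that share a value.

    Each alternative is a (low, high) pair with inclusive bounds.
    """
    ordered = sorted(alternatives)
    problems = []
    for (low1, high1), (low2, high2) in zip(ordered, ordered[1:]):
        # If the next alternative starts at or before the point where
        # this one ends, some answer fits both of them.
        if low2 <= high1:
            problems.append(((low1, high1), (low2, high2)))
    return problems


ambiguous = [(1, 3), (3, 6)]   # the alternatives from the example above
reworded = [(1, 2), (3, 6)]    # one possible fix, assuming whole years

print(find_overlaps(ambiguous))
print(find_overlaps(reworded))
```

The first call reports the shared value of three years; the second finds nothing to report. A fuller check would also look for gaps between alternatives, which this sketch does not do.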
Another source of error is simply aging of the data. If you have a database of addresses, for example, some of those addresses are certain to change within a very short time of being entered. Other types of information will also change as time passes: people have more children, obtain more academic qualifications, and so on.
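One simple defence against aging is to store the date each record was entered or last confirmed, and to flag anything older than a chosen limit for rechecking. The sketch below assumes such a date is available. The field names, the one-year limit and the sample records are invented for the example.

```python
from datetime import date


def stale_records(records, today, max_age_days):
    """Return the ids of records entered more than max_age_days before today."""
    return [
        record["id"]
        for record in records
        if (today - record["entered"]).days > max_age_days
    ]


# Hypothetical records, made up for illustration only.
records = [
    {"id": 1, "entered": date(2020, 1, 15)},
    {"id": 2, "entered": date(2023, 11, 1)},
]

stale = stale_records(records, today=date(2024, 1, 1), max_age_days=365)
print(stale)
```

The right limit depends on the kind of information: addresses may need rechecking far sooner than academic qualifications.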
Even when data are accurate, they may be incorrectly analyzed. Many of the articles on this site deal with this problem. Then again, the data may be accurate and they may be correctly analyzed, but they still may not be of much use.
Fortunately, there are many techniques that can be used to deal with these problems. Incorrect data can be rooted out with data cleaning techniques. The adequacy of analysis can be assessed and improved with standard reliability techniques. Finally, the utility of data can be assessed with validation techniques. These techniques are quick and do not add appreciably to the cost of maintaining a database. You can learn more about them in the series here about PC database fitness. If you take advantage of these techniques, you can help ensure that your workplace is living in the Information Age, and not in the Misinformation Age.