 The Data Talk Some More
In an earlier article we saw how a simple statistical manipulation can clarify data. Since I wrote that article a couple of weeks ago, its second graph has been nagging at me. The graph showed changes in z-scores (standard scores) for the Dow Jones Industrial Average and the Nasdaq Composite index. In the graph the two trends appear to be nearly mirror images of each other (to see the graph, click here), but the correlation between the two indices was only moderately negative.

There are a couple of spots in the graph where the two lines seem to be moving in the same direction, but it's difficult to tell how strong the relationship is in those spots. Luckily, the transformation of the indices to z-scores allows us to inquire further into the nature of the correlation.

If there is a perfect negative correlation between two variables, each z-score on one variable is the exact negative of its partner on the other, so the sum of any pair of their z-scores will be 0. I therefore summed the z-scores for the two variables. I also calculated the 95% confidence interval of the total, which is the range within which 95% of the sums should fall (it turned out to be between -1.59 and 1.59). I then plotted the totals and the confidence interval in a graph you can see by clicking here.
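
The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two series are made-up numbers rather than the actual index values, the function names are hypothetical, the population standard deviation is used, and the 95% range is taken as the mean of the sums plus or minus 1.96 standard deviations (which treats the sums as roughly normal). The article does not say exactly how its interval was computed.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values: subtract the mean, divide by the standard deviation."""
    m = mean(values)
    s = pstdev(values)  # population SD: an assumption, not stated in the article
    return [(v - m) / s for v in values]


def z_sums(x, y):
    """Add the paired z-scores of two equally long series."""
    return [a + b for a, b in zip(z_scores(x), z_scores(y))]


def interval_95(sums):
    """Range expected to hold about 95% of the sums (normal approximation)."""
    centre = mean(sums)
    half_width = 1.96 * pstdev(sums)
    return centre - half_width, centre + half_width


# Made-up illustrative series, NOT the real Dow or Nasdaq figures.
dow = [10.0, 11.0, 12.0, 13.0, 14.0]
nasdaq = [50.0, 44.0, 41.0, 36.0, 31.0]

sums = z_sums(dow, nasdaq)
low, high = interval_95(sums)
print(sums)
print(low, high)
```

Under the same assumptions the variance of the summed z-scores is 2 + 2r, where r is the correlation between the two series, so an interval of plus or minus 1.59 would correspond to an r of roughly -.67, which fits the description of the original correlation as moderately negative.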

As you can see from the graph, the big departure from negative correlation was in late March, when an upward spike took the sum of the z-scores outside the confidence interval. If we omit that spike from the calculation of the correlation coefficient, we get a coefficient of -.84, which explains 71% of the variance, considerably more than the percentage explained by the correlation reported in the original article. We seem, then, to have identified the most serious departure from negative correlation.
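
The two figures in this paragraph can be checked with a short Python sketch. The percentage of variance explained is simply the square of the correlation coefficient. The second function is a hypothetical helper that recomputes the correlation after dropping every observation whose summed z-score falls outside a chosen limit; that rule is an assumption made for illustration, since the article only says the spike was omitted, and the data in the usage lines are invented.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long series."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def r_without_outliers(x, y, sums, limit):
    """Correlation after dropping pairs whose z-score sum exceeds the limit.

    Hypothetical helper: `sums` holds the summed z-scores for each pair and
    `limit` is the half-width of the interval (1.59 in the article).
    """
    kept = [(a, b) for a, b, s in zip(x, y, sums) if abs(s) <= limit]
    kept_x, kept_y = zip(*kept)
    return pearson_r(kept_x, kept_y)


# Share of variance explained by the coefficient reported in the article.
r = -0.84
variance_explained = r ** 2
print(round(variance_explained, 2))

# Invented data: the last pair plays the role of the spike.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [5.0, 4.0, 3.0, 2.0, 10.0]
example_sums = [0.0, 0.0, 0.0, 0.0, 3.0]
print(pearson_r(x, y), r_without_outliers(x, y, example_sums, 1.59))
```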

And once again we have seen how simple analytical principles can clarify and simplify relationships.

The Data Talk Some More © 2000, John FitzGerald