(The first version of this article was originally published in the spring issue of my newsletter, which is why I refer to spring training in the first sentence.)
Spring training has begun, which means we don't have long to wait for the Annual Onslaught of Bogus Baseball Statistics. Among my favourites are the records for post-season play. During the baseball playoffs, a record is broken about every fifteen minutes. As you might expect, most of the post-season records are held by players who have been in the game since 1969, when the number of post-season series was increased from one to three. Now that there are five post-season series, every fall the records tumble like pigs off pogo sticks.
Then there was baseball's contention, which it maintained for 37 years, that 61 is less than 60. Roger Maris hit 61 home runs in 1961, which would appear to be more than the 60 Babe Ruth hit in 1927. Not in baseball, though. Ruth's record remained in the books because he played a season of 154 games, while Maris played a season of 162. That made Ruth's 60 bigger than Maris's 61, apparently. The fact that Maris played against the best players of all races, while Ruth played only against the best players who weren't black, doesn't seem to have counted in Maris's favour.
However, the most bogus statistic of all is the Law of the All-Important First Run. This law is also prominent in hockey, where of course it is known as the Law of the All-Important First Goal. I believe it is against Canadian law for a televised hockey game to be completed without the announcer mentioning, somewhere amid his (sic) endless recitation of players' hometowns, that getting the first goal is all-important, since the team that gets the first goal wins such a high percentage of games.
This belief seems to have come from a study of all major league baseball games between 1966 and 1987 which found that 66% of the games were won by the team that scored first. That's an interesting finding because in baseball the visiting team is more likely to score first (since it bats first). However, the home team was still more likely to win, so the importance of the first run was still questionable. In 1998 Tom Ruane published an article in which he showed that teams scoring the first run were less likely to win than teams who were the first to score each of the second through ninth runs. The first run, it seemed, was actually the least important run to score. How can that be? you may be asking. How can a run associated with 66% of victories be unimportant?
The reason it's unimportant is most likely that the winning team scores more runs than the losing team. Consequently, it's more likely to score the first run. So even if scoring the first run has no effect on the chances of winning a game, the winning team is still more likely to score the first run.
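To see how this can happen, here is a toy model of my own devising, not taken from any of the studies mentioned. It assumes every game has exactly five goals (an odd number, so there are no ties) and that each goal is a coin flip between teams A and B. The function name and the five-goal figure are illustrative assumptions only. The sketch lists every possible game and asks, for each goal in turn, how often the team that scored it went on to win.

```python
from itertools import product

GOALS_PER_GAME = 5  # assumption: odd, so every game has a winner


def win_share_by_goal(goals_per_game=GOALS_PER_GAME):
    """For each goal position, return the fraction of equally likely
    games in which the team scoring that goal is the eventual winner."""
    wins = [0] * goals_per_game
    games = 0
    # Every sequence of scorers is equally likely under the coin-flip assumption.
    for sequence in product("AB", repeat=goals_per_game):
        games += 1
        winner = "A" if sequence.count("A") > sequence.count("B") else "B"
        for position, scorer in enumerate(sequence):
            if scorer == winner:
                wins[position] += 1
    return [w / games for w in wins]


shares = win_share_by_goal()
for position, share in enumerate(shares, start=1):
    print(f"goal {position}: scoring team wins {share:.4f} of games")
```

In this toy model the team scoring any given goal, first or fifth, wins 11 games in 16, or about 69%. The first goal looks impressive only until you notice that every other goal looks exactly as impressive.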
To examine this possibility I chose data from another sport in which teams don't alternate offensive and defensive sessions. I collected scores from 110 National Hockey League games played from November 30, 2006 to December 14, 2006. I included games settled by shootout, but gave no credit to the winning team for the goal awarded for the shootout. The team scoring the first goal won 70% of these games (77 of the 110). However, the winning team also scored 68% of the goals (439 of 649), so if scoring the first goal did not improve a team's chances of winning a game, you'd still expect the winning team to score the first goal in 68% of the games, or 75 games.
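For anyone who wants to check the arithmetic, here is a short sketch using only the four counts quoted above. The variable names are mine. Note that the unrounded goal share gives an expected count a little under 75, while the rounded 68% gives the figure of 75 used in the text.

```python
# Counts quoted in the article for the 110 NHL games.
games = 110
games_won_by_first_scorer = 77
goals_by_winners = 439
total_goals = 649

# Share of games won by the team that scored first.
first_scorer_win_rate = games_won_by_first_scorer / games

# Share of all goals scored by the eventual winners.
winner_goal_share = goals_by_winners / total_goals

# Games the winners would be expected to open the scoring in
# if the first goal were no different from any other goal.
expected_exact = winner_goal_share * games
expected_from_rounded_share = round(winner_goal_share, 2) * games

print(f"first scorer won {first_scorer_win_rate:.1%} of games")
print(f"winners scored {winner_goal_share:.1%} of goals")
print(f"expected games, unrounded share: {expected_exact:.1f}")
print(f"expected games, rounded share:   {expected_from_rounded_share:.1f}")
```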
Well, the team scoring the first goal won 77 games, which is more than 75. However, you don't expect that teams scoring 68% of the goals will necessarily win exactly 68% of the games. Other factors have some effect on the outcome, so you'd expect them to win a number around 75. Fortunately, we can estimate the probability that if scoring the first goal does not increase a team's chances of winning and if winning teams score 68% of the goals then the team scoring the first goal will win 77 games.
That probability is 44%. Conventionally, we would reject the idea that the first goal is of any importance when the percentage is that high. Still, arguing that the probability of the difference being real is greater than 50% is entirely reasonable. Even if we look at the difference that way, though, we have to conclude that there is only a 56% chance that scoring the first goal increased the likelihood of winning a game, and that if it did increase the probability of winning a game, it increased it by only 3%. Either way, that first goal doesn't seem all that important.
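For readers who want to try a calculation of this kind themselves, here is one common approach, an exact binomial tail. It is a sketch under my own assumptions: that each of the 110 games is an independent trial, and that the chance of the eventual winner scoring first equals its share of the goals. The helper binomial_tail is a name I made up for illustration. Because the choice of test and the rounding both affect the answer, the number it prints need not match the 44% quoted above.

```python
from math import comb


def binomial_tail(n, k, p):
    """Probability of k or more successes in n independent trials,
    each succeeding with probability p (exact binomial sum)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


games = 110
games_won_by_first_scorer = 77
winner_goal_share = 439 / 649  # share of goals scored by eventual winners

# Chance of seeing 77 or more such games if the first goal
# carries no more weight than any other goal (assumed model).
prob = binomial_tail(games, games_won_by_first_scorer, winner_goal_share)
print(f"P(at least {games_won_by_first_scorer} of {games}) = {prob:.3f}")
```

Whatever the exact figure, the point stands: a result like 77 out of 110 is not at all surprising when the winners score two-thirds of the goals.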
I propose an alternative to the Law of the All-Important First Goal/Run. I modestly call it FitzGerald's Law: the first team to score the winning goal will win. My law has as much explanatory value as the Law of the All-Important First Goal/Run, but is logically more elegant. It also reminds me of another statistical topic which baffles me: why, in a baseball game which finishes with a score of 11-10, can the player who drove in the first run for the winning team get credit for the game-winning RBI? Hm?