Update: I just came across this great article explaining Nate Silver’s methodology and why it bothers the professional class so much: “Why Pundits and Politicians Hate NYT Election Forecaster Nate Silver.”
Another good article about statistics and news coverage: “Data, uncertainty, and specialization” by Jonathan Stray at the Nieman Journalism Lab.
Maybe the best analysis comes from Deadspin’s David Roher: “Nate Silver’s Braying Idiot Detractors Show That Being Ignorant About Politics Is Like Being Ignorant About Sports.”
Spoiler Alert: Science and statistics render unfounded opinion meaningless.
Nate Silver runs the FiveThirtyEight blog, which is now a part of The New York Times. He uses statistical models to attempt to understand what is happening in national elections. Those models produce probable outcomes based upon mathematical measures. (He uses math of the kind described at the beginning of the book The Wisdom of Crowds.)
His probability maps have left some pundits apoplectic with rage; see, for example, the article by Dylan Byers that reads as a line-by-line argument for why journalists who don’t understand math and statistics should avoid writing about such subjects.
While I’m still in the baby stages of learning statistics, here is how I understand the difference between a statistically probable outcome and a simple mathematical estimate.
1: If you consider the national polls, each candidate has roughly 47-49 percent of the vote. For many, this would be a “statistical dead heat,” which means either candidate might win. On the surface, this seems correct.
2: If Ohio and Wisconsin are determined to be “must win” states for each candidate, and one candidate has a higher probability of winning those states based upon statistical models that extract signal from the noise in polls, then the 48-48 race is actually not a 50-50 race. It may be (and here I make up the numbers) a 68-32 race based upon probable outcomes.
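The jump from a 48-48 poll to a lopsided probability can be sketched with a toy Monte Carlo simulation. Everything below — the states, electoral votes, win probabilities, and vote threshold — is invented for illustration; it is not Silver’s model or his numbers:

```python
import random

# Toy Monte Carlo: three hypothetical swing states decide the election.
# The electoral votes, per-state win probabilities, and the 28-vote
# threshold are all made up for illustration.
STATES = {  # state: (electoral votes, P(candidate A wins that state))
    "Ohio":      (18, 0.60),
    "Wisconsin": (10, 0.65),
    "Florida":   (29, 0.45),
}
NEEDED = 28  # toy threshold out of the 57 votes above

def simulate_once(rng):
    """Simulate one election; return True if candidate A wins."""
    votes = sum(ev for ev, p in STATES.values() if rng.random() < p)
    return votes >= NEEDED

def win_probability(trials=100_000, seed=42):
    rng = random.Random(seed)
    return sum(simulate_once(rng) for _ in range(trials)) / trials

print(f"P(candidate A wins): {win_probability():.2f}")
```

Each individual state race here is close to a coin flip, yet candidate A comes out roughly a two-to-one favorite overall — the same kind of gap between a “48-48 race” and a “68-32 race” described above.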
However, this isn’t a prediction of who will win. Here’s why.
3: If you have a six-sided die, you have a 1-in-6 chance of rolling any given number (let’s say six in this scenario). If you roll the die 12 times and never roll a six, the probability doesn’t change. You still have a 1-in-6 chance of rolling a six on the next roll.
Which means: The outcome is not predicted by the probability. The probability tells us what we might reasonably expect to happen.
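The die example can be checked empirically. This sketch (plain Python; the trial count is arbitrary) generates many simulated sequences of rolls, keeps only those whose first 12 rolls contain no six, and measures how often the 13th roll comes up six:

```python
import random

# Empirical check of independence: among many simulated sequences,
# keep only those whose first 12 rolls contain no six, then see how
# often roll 13 is a six. It should still be about 1 in 6.
def p_six_after_twelve_misses(trials=200_000, seed=1):
    rng = random.Random(seed)
    qualifying = sixes = 0
    for _ in range(trials):
        if any(rng.randint(1, 6) == 6 for _ in range(12)):
            continue  # this sequence already rolled a six; discard it
        qualifying += 1
        sixes += rng.randint(1, 6) == 6
    return sixes / qualifying

print(round(p_six_after_twelve_misses(), 3))  # ~0.167, i.e. still 1 in 6
```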
So: Even if the candidate with a 32 percent probability wins the election, that wouldn’t change the probable outcome or negate the statistical model behind it, because probabilities aren’t predictions.
(There’s a related principle, the law of large numbers — sometimes confused with “regression to the mean” — which says that the more times you run an experiment, the closer the average result gets to the expected value. This suggests that if you increase the number of die rolls from 12 to 100 (or any number greater than 12), the observed share of sixes is likely to come closer to the 1-in-6 ratio. Just as correlation doesn’t equal causation, a probable outcome doesn’t equal the observed outcome.)
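A quick simulation shows that tendency (formally, the law of large numbers); the roll counts below are arbitrary:

```python
import random

# Law of large numbers, sketched: the observed share of sixes drifts
# toward the theoretical 1/6 as the number of rolls grows.
def share_of_sixes(n_rolls, rng):
    return sum(rng.randint(1, 6) == 6 for _ in range(n_rolls)) / n_rolls

rng = random.Random(0)
for n in (12, 100, 10_000, 1_000_000):
    print(f"{n:>9,} rolls: {share_of_sixes(n, rng):.4f}")
print(f"theoretical: {1/6:.4f}")
```

With only 12 rolls, the observed share can easily be 0 or 0.25; by a million rolls it sits within a fraction of a percent of 1/6.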
Samuel Popkin has an article at Salon.com explaining how statistics and probabilities work within Silver’s model.
What’s important, though, is that good journalists (and storytellers) go beyond simple heuristics, or rules of thumb, when developing a narrative. You must actively fight against the cognitive forces pushing against you (in this case, the Illusion of Knowledge, which suggests that because you can do basic math you can also easily understand more complex statistical probabilities), and find people who help you sift through the noise.
As my graduate school mentor Bill Drummond was fond of telling us: Journalists don’t do math.