From the perpetual pit in my stomach, to the sleepless nights, to the eyes bloodshot from peering at endless forecast models and polling predictions, only one thing can be true: the election is nigh upon us. In a time of uncertainty, where opinions and gut feelings seem to dominate, the mathematician in me craves the concreteness of numbers. And polling provides just that. It gives me a quantified sense of what the future will look like. And sure, it’s something cozy to wrap myself up in. But what do those numbers mean? What am I really looking at when I see election forecasting?
First, there’s the question of the act of polling itself. As with any time you gather data for statistical analysis, there’s always a chance of sampling bias. Since it’s impossible to contact every single person in the country, pollsters need to find some representative subset. That is, polls need to use a small number of opinions to extrapolate the national opinion. National polls are typically conducted by phone, and Pew Research reports that, largely due to the disappearance of landlines over the past 20 years, response rates have gone from 36% in 1997 to 9% in 2012. Since landlines can be autodialed while cell phones must be dialed by hand, calling landlines is still the best way to reach the largest swath of people as fast as possible.
This of course brings with it a problem of bias, since landline owners in the United States are a very particular demographic, not necessarily representative of the country as a whole. But the Pew Research Center says that through careful weighting of poll responses they are able to overcome these biases. So a single poll, which already contains thousands of responses, is weighted to correct biases and massaged to give the most accurate picture of national opinion.
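To make the idea of weighting a little more concrete, here is a minimal sketch of one common approach, where each response is scaled by how under- or over-represented its group is in the sample. This is not Pew's actual procedure, and the groups, population shares, and responses below are all invented for illustration.

```python
def weighted_support(responses, population_share):
    """Estimate support for a candidate after reweighting by group.

    responses: list of (group, supports_candidate) pairs.
    population_share: dict mapping group -> assumed share of the population.
    """
    # Count how many respondents fall into each group.
    counts = {}
    for group, _ in responses:
        counts[group] = counts.get(group, 0) + 1
    n = len(responses)

    # Weight = (share of population) / (share of sample).
    weights = {g: population_share[g] / (counts[g] / n) for g in counts}

    total = sum(weights[g] for g, _ in responses)
    in_favor = sum(weights[g] for g, supports in responses if supports)
    return in_favor / total


# Hypothetical sample: 80 landline respondents (40 in favor)
# and 20 cell-only respondents (14 in favor).
sample = (
    [("landline", True)] * 40 + [("landline", False)] * 40
    + [("cell_only", True)] * 14 + [("cell_only", False)] * 6
)
# Hypothetical population shares, made up for this example.
shares = {"landline": 0.4, "cell_only": 0.6}

raw = sum(1 for _, s in sample if s) / len(sample)
adjusted = weighted_support(sample, shares)
```

In this made-up sample the raw support is 54%, but once the underrepresented cell-only group is weighted up to its assumed share of the population, the estimate moves to about 62%.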
But the poll results that you see on popular news sites typically don’t reflect just a single poll. They are often aggregates, ranging from the top 5 state and national polls in some cases to several thousand polls in others. And then those polls, some of which are more reliable than others, are weighted to reflect their reliability, sample size, and representative regional demographics.
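As a rough sketch of what that aggregation might look like, the snippet below takes a weighted average in which each poll counts in proportion to its sample size times a reliability score. Real aggregators use their own, more elaborate weighting schemes; the `reliability` score, the weighting formula, and the poll numbers here are assumptions made up for illustration.

```python
def aggregate(polls):
    """Weighted average of poll results.

    polls: list of (share, sample_size, reliability) tuples, where
    reliability is a hypothetical score between 0 and 1.
    """
    # Assumed weighting: bigger and more reliable polls count for more.
    weights = [size * reliability for _, size, reliability in polls]
    weighted_sum = sum(w * share for w, (share, _, _) in zip(weights, polls))
    return weighted_sum / sum(weights)


# Three invented polls: (candidate's share, sample size, reliability).
polls = [
    (0.48, 1000, 1.0),
    (0.52, 500, 0.5),
    (0.50, 2000, 0.75),
]

simple_mean = sum(share for share, _, _ in polls) / len(polls)
weighted_mean = aggregate(polls)
```

With these invented numbers the plain average is 50%, while the weighted average comes out slightly lower, at roughly 49.5%, because the small, less reliable poll showing 52% gets the least say.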
And finally, an aggregation of polls becomes a mathematical model when a few more factors are added into the mix. Depending on the agency doing the modeling, these can include effects like the convention bounce, the shape of the economy, and how accurate predictions have been in certain states in past elections. All of these factors come together to build a robust mathematical model to forecast the election.
As one example, The New York Times maintains an active “Who Will Be President?” scoreboard, comparing their aggregate model with several of the other top forecasts like FiveThirtyEight, Daily Kos, HuffPost, and Princeton Election Consortium. And to game things out even further, they also have an interactive chart weighing the probabilities of each candidate’s path to the presidency, based on electoral votes and states that have determined election outcomes in the past.
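To get a feel for how probabilities of different paths can be added up, here is a toy sketch that enumerates every win/lose combination for a handful of contested states and totals the probability of the combinations that reach 270 electoral votes. It is not how the Times builds its chart: the states, vote counts, and probabilities are hypothetical, and treating states as independent is a simplifying assumption that real forecasts generally do not make.

```python
from itertools import product


def win_probability(contested, safe_votes, needed=270):
    """Probability of reaching `needed` electoral votes.

    contested: list of (electoral_votes, win_probability) pairs for
    states still in play, treated as independent (an assumption).
    safe_votes: electoral votes already assumed to be locked in.
    """
    total = 0.0
    # Walk through every combination of winning or losing each state.
    for outcome in product([True, False], repeat=len(contested)):
        prob = 1.0
        votes = safe_votes
        for won, (ev, p) in zip(outcome, contested):
            prob *= p if won else 1 - p
            if won:
                votes += ev
        if votes >= needed:
            total += prob
    return total


# Hypothetical candidate with 250 safe votes and three coin-flip states.
contested = [(20, 0.5), (10, 0.5), (10, 0.5)]
chance = win_probability(contested, safe_votes=250)
```

Here the candidate needs 20 more votes, which means either winning the 20-vote state or losing it and sweeping the two 10-vote states, so the toy probability works out to 0.625.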
And then there’s this dimly lit corner of election forecasting that doesn’t rely solely on polls. These are places like the Cook Political Report, which produces forecasts based more on reporting trends and expert opinions, or PredictWise, which bases its forecasting on a combination of polls and betting markets.
There are a host of quantitative ways to deal with the election right now — and some slightly more qualitative ways as well — but I find the best thing to do is light some aromatherapy candles, immerse yourself in a warm bath of polls and forecasting, breathe deep cleansing breaths, and wait for November 8th.