The Pollsters Got It Wrong When Trump Took on Hillary in 2016. Can You Trust Them This Time?

Pollsters were wrong in 2016, predicting that Hillary Clinton would beat Donald Trump. Will they do better in 2020? MANDEL NGAN/AFP via Getty Images

It's poll season!

Over the next eight months we'll see hundreds of horse-race polls between President Donald Trump and presumptive Democratic nominee Joe Biden. Here's a tip. Don't pay too much attention. The big miss we saw in 2016 can happen again—the coronavirus pandemic could make this election more unpredictable—and there's only so much pollsters can do about it.

Leading up to the last election, President Trump dismissed polls unfavorable to him as fake news. He was right. Polls missed big in 2016. Pollsters believe they know what went wrong and have fixed it. After the 2018 midterm elections, the industry declared victory when pundits using polling data called most races correctly. Harry Enten of CNN trumpeted, "2018 was a very good year for polls." Really? Many of the contests were in deep red or blue areas where the outcome was never in doubt. And there were still some big misses. Polls again underestimated Republicans in a handful of states including Florida and, as in 2016, those misses were enough to result in narrow wins in important races. In all, only 80 percent of polls showed the eventual winners leading. That sounds good, but take out the no-brainers, and the hit rate is more like 50–50—in other words, a coin flip.

So far in 2020, poll performance is mixed. Polls missed in South Carolina. They said Biden was ahead by an average of 15 percentage points and he won by 28 percentage points, although perhaps that's understandable given the rapid consolidation as other candidates dropped out and the huge endorsement from Representative James Clyburn of South Carolina, the third-ranking Democrat in the House. Polls also missed the rise of moderates and fall of progressives, first evident in New Hampshire. Polls were close in Florida, but underestimated Vermont Senator Bernie Sanders in the Democratic primary in Michigan, as they did in 2016.

Those who make forecasts have also missed. Less than a month ago FiveThirtyEight, which focuses on statistical analysis of politics and other key issues, said that Sanders was in "the driver's seat" and "easily most likely to win the Democratic nomination." So much for that.

The first signs that something was screwy in 2016 came during the primaries, when former New York Senator Hillary Clinton, who led the polls in Michigan by an average of 21 points according to RealClearPolitics, lost to Bernie Sanders by a point and a half. It's been called one of the biggest misses in polling history.

It should've set off alarm bells. It didn't because primary polling is notoriously volatile—lots of candidates, quickly shifting preferences and uncertain turnout. Also, state pollsters often work with smaller budgets than national pollsters, and therefore use less expensive methods like robocalls and online polls. Some surveys call only landlines, and according to USA Today, 80 percent of people aged 25 to 34 don't even have one. So Sanders' stunning victory was shrugged off as an anomaly. In fact, the Michigan primary results were the first signs of a problem that showed up big time come that November—a lack of enthusiasm for Clinton among key Democratic constituencies: youth, African American voters and under-educated whites.

Then came November 8, 2016. Virtually no one picked Trump to win. FiveThirtyEight collected 1,106 national polls in the year leading up to the election. Only 71 showed Trump ahead at any point during the year. Even the Fox News polls and the Trump campaign's internal pollsters expected a Clinton victory. One of the few polls that did show Trump ahead was the USC Dornsife/Los Angeles Times poll, and that one got the winner right but the vote count wrong. The academic responsible, Arie Kapteyn, director of USC's Center for Economic and Social Research, said he'd actually expected Clinton to win.

Partisans on both sides were angry. Republicans believed Trump's victory validated their concern that polls were biased. Shocked Democrats felt set up. Much of the anger was directed at pollsters. Some wondered if the polls might somehow have even changed the result, sort of like the observer effect in physics. Did the polls make Democrats overconfident? Did they take their foot off the gas in states like Wisconsin? Did disaffected Bernie voters cast their votes for Jill Stein or even Trump in protest, thinking it wouldn't matter? Was the low turnout from some groups—for example, African American voters—because they thought Clinton had already won? Courtney Kennedy of the Pew Research Center says, "I used to brush off the observer effect question. I think about it differently after 2016. I think about the people who stayed home. I no longer dismiss that idea."

Academics who studied the election believe that's exactly what happened. One of the authors of a 2019 study, Yphtach Lelkes, assistant professor of communication and political science at the University of Pennsylvania, says, "Even though a traditional poll may say that a candidate is going to win only by a few points, let's say 52–48, the equivalent probabilistic outcome may be a 70 percent chance that the candidate will win. People perceive this as a sure thing. They even conflate percent chance with margin-of-victory and think that the candidate is going to win 70–30. When people perceive the outcome to be a sure thing, they think their vote won't matter. They become complacent, and, our experiments show, fail to vote.... We also found that Democrats were more likely to consume probabilistic polls and that the effect is bigger when a person's favored candidate is ahead."
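Lelkes' point about margins versus probabilities can be sketched numerically. The following is a minimal illustration, assuming polling error on the margin is normally distributed; the 5- and 7-point error figures are assumptions chosen for illustration, not numbers from the study:

```python
import math

def win_probability(lead_pts: float, error_sd: float) -> float:
    """P(true margin > 0), assuming the polling error on the
    margin is normally distributed around the observed lead."""
    z = lead_pts / error_sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# A 52-48 poll is a 4-point lead. With a plausible 5-to-7-point
# error on the margin (an assumption), that modest lead already
# reads as a 70-to-80 percent chance of winning.
for sd in (5.0, 7.0):
    print(f"error sd {sd}: win probability {win_probability(4.0, sd):.2f}")
```

This is the translation readers get wrong: a 4-point lead and a "70 percent chance" describe the same close race, but the probability sounds like a landslide.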

For pollsters, 2016 posed an existential question: If polls can miss badly and maybe even change the result, can they be trusted? That question was an icy dagger to the heart of an industry that takes itself very seriously. It led the industry association, the American Association for Public Opinion Research (AAPOR), to conduct an extensive analysis of 2016 election polling to find out what went wrong. The result was a defense of the industry and polling.

It says, in essence: We didn't really get it wrong. The national vote estimate said Clinton was ahead by about 3 percent, and in the final tally she won the national popular vote by 2 percent. That's one of the most accurate since 1936. And if we did miss, it wasn't our fault. A lot of people made up their minds at the very last minute. And if it was our fault, we're not in the prediction business anyway, so you can't hold us accountable. And it won't happen again. We will tweak our methodology, and it'll be fine.

That's hooey. They did get it wrong. Getting the national vote right doesn't matter because that's not how we elect presidents. That's about as useful as dreaming last week's winning lottery number. Because of the electoral college system, the ones that matter are the state polls. They were bad in 2016. Also, protestations by pollsters that they're not in the prediction business are as disingenuous as the Psychic Power Network's claim that readings are "for entertainment purposes only." Opinion polls and election forecasts are joined at the hip. Even if pollsters themselves refrain from making predictions, polling data is a primary input for those who do. But the most important question is: Can pollsters fix it so it doesn't happen again?

The answer is a resounding "Maybe."

One of the problems identified in the AAPOR report is something pollsters call weighting. Pollsters use arithmetic to adjust their samples to reflect what they believe the relevant population looks like. In other words, although most of us think of polling as science, there's subjectivity involved. According to Pew's Kennedy, who also led the AAPOR review, weighting is tricky business. "We try to find those factors that most explain human behavior. Age. Sex. Race. Rural vs. urban. In 2016, most of us had education on the list. But not everyone did. In red states that might not have mattered, but it did in the Midwest."
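Kennedy's description of weighting can be made concrete with a toy post-stratification calculation. All the numbers below are hypothetical, chosen only to show the mechanism: because better-educated people answer polls more readily, an unweighted sample overstates their candidate's support, and weighting by education corrects for it.

```python
# Toy post-stratification example (illustrative numbers, not real data).
population = {"college": 0.35, "no_college": 0.65}  # assumed true shares
sample     = {"college": 0.55, "no_college": 0.45}  # shares who responded

# Hypothetical candidate support within each group
support = {"college": 0.60, "no_college": 0.40}

# Unweighted estimate: skewed toward over-represented college grads
raw = sum(sample[g] * support[g] for g in sample)

# Weight each respondent group by (population share / sample share)
weights = {g: population[g] / sample[g] for g in sample}
weighted = sum(sample[g] * weights[g] * support[g] for g in sample)

print(f"unweighted estimate: {raw:.2f}")   # 0.51
print(f"weighted estimate:   {weighted:.2f}")  # 0.47
```

In this toy setup, skipping the education weight turns a 47 percent candidate into an apparent 51 percent leader, which is exactly the shape of the 2016 state-poll miss the report describes.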

In hindsight, the miss with regard to better-educated vs. less-educated voters was, if not an excusable error, at least an understandable one. The report found that in 2012, the less educated and the highly educated voted similarly, so in 2016 some pollsters didn't split them out. Nathaniel Rakich, elections analyst at FiveThirtyEight, says, "A really big gap opened up between educated and non-educated and some polls didn't weight by education." The AAPOR report said the Democratic advantage among those with only a high school degree or who did not graduate from high school was around 20 percent in 2012. That group went for Trump in 2016. Because highly educated people are more willing to take polls, the overrepresentation made Clinton appear stronger than she was.

But what was most interesting was what AAPOR didn't find. Many pollsters, including the highly regarded Robert Cahaly of the Trafalgar Group—the only pollster to show Trump with a lead in Michigan and Pennsylvania in 2016, according to RealClearPolitics—believe that some supporters are reluctant to admit they're for Trump. AAPOR found no evidence of "shy Trump" voters.

Comey Effect: The debate continues over the impact of then-FBI Director James Comey's last-minute announcement of a review of new evidence in the Clinton email probe. Sean Proctor/Bloomberg/Getty

Even more controversially, AAPOR dismissed the "Comey Effect." AAPOR found that 13 percent of the voters in Florida, Pennsylvania and Wisconsin decided in the final week and broke heavily for Trump, just after FBI Director James Comey announced a review of new evidence in the Clinton email probe. But the report argued that the impact dissipated before the election. It found an "... immediate negative impact for Clinton on the order of two percentage points. The apparent impact did not last..." It concluded that the erosion of support for Clinton began around October 24 or 25, before the release of the letter on October 28.

Pew's Kennedy says, "I spent a year looking at those data six ways to Sunday. I didn't see strong evidence. I think those headed to Trump were headed that way anyway. But we don't know."

Jill Darling, survey director of the USC Dornsife/Los Angeles Times poll, disagrees: "We absolutely saw the Comey Effect." Because her group uses a panel, that is, it surveys the same people over and over instead of recruiting new people each time, it can see when people change their minds and ask them why. Nate Silver, founder of FiveThirtyEight, analyzed the data after the election and concluded that the "Clinton lead cratered after the Comey Letter."

A year after the election, Sean McElwee (then, a policy analyst at the think tank Demos; now, co-founder and executive director of Data for Progress), Matt McDermott (a senior analyst and now vice president at Whitman Insight Strategies) and Will Jordan (a Democratic pollster) came to the same conclusion in a piece written for Vox: "The Comey effect was real, it was big, and it probably cost Clinton the election." The Vox analysis found media coverage shifted radically after the Comey letter, both in tone and content. Coverage of Clinton became far more negative and Trump's more positive, and the email scandal crowded out the accusations that Trump had touched multiple women inappropriately.

Those weren't the only controversial non-findings to come out of the AAPOR study. The committee of 13 heavy hitters in the polling field also found no "polling modality" effect. That is, they concluded that online polls and automated phone calls were roughly as accurate as live-interviewer calls to a random sample of landlines and cellphones, what the industry calls RDD (Random Digit Dialing). An RDD survey can cost up to $100,000. By using opt-in internet polls or robocalls with automated responses, the cost can be shaved to $10,000 or even $5,000. As a result, there are a lot of cheap polls and few in which actual humans talk to a random sample of the population.

Jon Krosnick, professor of political science, communication and psychology at Stanford University, independently analyzed 2016 election polling results and came to a very different conclusion than AAPOR. His team looked at 325 polls conducted during the last week of the election. They found only 21 "gold standard" polls, which is what Krosnick calls those that use person-to-person RDD calls to landlines and cellphones. Only one of those was in a battleground state, conducted by Quinnipiac in Florida from November 3–6. (Although, ironically, it showed Clinton ahead.) In those that used automated or non-random methods, his team found errors of around 5 percentage points, although some were as high as 17 percentage points. For the RDD polls, his group found an error of less than 1 percentage point. His conclusion is that in 2016, "Polls using scientific methods did great."

What now? Pollsters say they've improved their methodology so that 2016 won't happen again. In 2018 Scott Keeter, senior advisor to Pew Research Center, surveyed a number of prominent pollsters to ask if they'd changed methodologies in light of 2016. His paper said, "Facing the growing problems that confront all of survey research as well as public skepticism about polling that followed the 2016 presidential election, polling practitioners have examined their methods and many have made changes."

For most, that means tweaks. The smaller miss in this year's Michigan primary could mean they've succeeded. Rakich says, "Hopefully [state-level] polling will do better this year." Pollsters have made no changes to deal with another Comey-like surprise, but they don't believe they need to. When asked about the possibility of future campaigns creating last-minute events to sway the public, Kennedy says, "Twenty years ago a manufactured effect might have made a difference but now, with polarization, people are so tied to political tribes and so skeptical that it would not necessarily have any effect."

Krosnick is less optimistic that the problems of 2016 will be fixed: "Unless people are willing to spend money on better public polls in 2020, we're not going to be any better off. There's no statistical manipulation by aggregators like FiveThirtyEight that can turn a sow's ear into a silk purse." In other words, although some forecasters like FiveThirtyEight give what they consider better polls more weight, Krosnick believes there just aren't enough good polls out there.

It's not likely that additional spending will happen, because the kinds of polls Krosnick is talking about are getting ever more expensive due to declining response rates. According to Pew, in 1997 one in three people would take a phone survey. Now a surveyor has to make roughly 15 calls to find one person who will talk. Many people won't even answer a call from a number they don't recognize.

Pollsters and those who use polling data have an obvious motive to argue that the problems can be fixed. But here's the reality. Even if pollsters had corrected the under-sampling of less educated and young voters, conducted more polls after the Comey letter, adjusted for "shy Trump" voters and used only "gold standard" polls, there's still no guarantee they would have gotten it right. Some elections are simply too close to call. It's possible the problem wasn't with how pollsters poll, but rather with what we expected from them.

Roughly 129 million votes were cast in the 2016 election, but the election was decided by 78,000 voters in three states—Michigan, Wisconsin and Pennsylvania. That's 0.6 percent of the votes cast in those states. Even the best polls typically have an average absolute margin of error of one percentage point. In other words, we asked pollsters to predict heads or tails and got angry when they couldn't.
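The article's 0.6 percent figure survives a back-of-the-envelope check. The state vote totals below are approximate 2016 figures supplied here as assumptions:

```python
# Rough check of the decisive-margin arithmetic. Vote totals are
# approximate 2016 counts (assumptions for illustration).
votes_cast = {
    "Michigan": 4.8e6,
    "Wisconsin": 3.0e6,
    "Pennsylvania": 6.1e6,
}
decisive_margin = 78_000  # combined Trump margin in the three states

share = decisive_margin / sum(votes_cast.values())
print(f"decisive share of three-state vote: {share:.1%}")  # about 0.6%
```

A poll would need to be accurate to within roughly half a percentage point in each of those states to have called them, which is beyond what even the best surveys can promise.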

Pollsters and forecasters understand the limitations of what they can and can't do. That's why forecasters couch their predictions in terms of probabilities or odds. But humans don't think in those terms. Sean Westwood, assistant professor of government at Dartmouth College and lead author of the study cited above, says, "...the research shows it is nearly impossible to convey polls in a way that does not confuse the audience." Rakich says, "We told people that Clinton had about a 70 percent chance to win. If we'd run the election three times, she'd have won twice." We ran it once and she lost.

The company he works for, FiveThirtyEight, was one of the few forecasters to even give Trump a chance. To compensate for the uneven quality of individual polls, they used sophisticated models incorporating polling with other data. But even they miss sometimes. In the 2018 midterms, FiveThirtyEight predicted 506 races. Their best model predicted 97 percent of races correctly—and roughly 90 percent of the really competitive ones. But that's still 16 contests that it got wrong. Going forward, is 97 percent good enough? Probably. Unless the miss is the presidency.

After 2018, pollsters are feeling pretty good about themselves. Courtney Kennedy says, "The public reaction in 2016—that polls are garbage—was understandable but wrong. National polls in 2016 and the 2018 midterms showed it was an anomaly. Polls are still valuable."

Maybe, but pollsters shouldn't relax too much, because more big misses are coming. Even if 2020 is better, there's no guarantee 2022 or 2024 won't serve up a shocker. Dr. Natalie Jackson, director of research at PRRI, says, "We are going into this cycle with an unprecedented level of uncertainty, especially with coronavirus. Predicting the outcome is going to be perilous and fraught with complications. We might not know the answer until we have the election."

Maybe the real lesson of 2016 isn't for pollsters or forecasters, but for the public: Don't put too much credence in polls in tight races, even if they say what you want them to. And no matter what, vote.

Sam Hill is a Newsweek contributor, consultant and bestselling author.