The current Huffington Post Pollster average has Trump at nearly 36% in the polls nationwide. His closest competition comes from Ted Cruz (14%) and Marco Rubio (12%). The polling average in Iowa has Trump at 28.5%, leading Cruz by about seven points. In New Hampshire, nobody comes close, with Trump polling at 26.8% and his nearest competition, Marco Rubio, at just under 14%.
With numbers like these, how can Trump not roll his way to the Republican nomination?
Or, for those who are terrified of the prospect of a Trump nomination, the more comforting question: Could something weird be going on with the polls?
In 2012, I did a deep dive pre-election on this same question – “Is something off with the polls?” – and it seems appropriate to tackle this question again today.
First, I want to divide the “are the polls right?” debate into two parts: are the polls predictive and are the polls valid.
On the question of predictiveness, the answer at this point is “heavens, no.” Early polls are not really “predictive,” especially not in a volatile race. In one sense, with only a month and a half until Iowa, it feels like we are incredibly close to the first caucus. However, with about half of GOP voters saying they haven’t fully decided on a candidate, this is still a race very much in flux.
When it comes to forecasting and questions like “can anyone take down Trump?” or “will voters change their minds?” I suspect the answer to these is “yes, potentially, though it is far from guaranteed.” Most horse-race polling is not terribly predictive precisely because it gives you a snapshot in time, and that time is still many days away from Iowa. The ground is always shifting beneath us; of course a poll today isn’t “predictive” of what will happen a month and a half from now. (This is to say nothing of whether “national primary polls” mean anything, given that there is no “national primary” at all.)
What I want to explore instead is the question of validity: “Is Trump REALLY the frontrunner, right at this moment? If the election WERE held today, would Trump really win? Are the polls right, and are they a valid picture of this point in time?”
Here are the factors to consider when assessing “are the polls telling the truth about what’s going on in the GOP primary?”
1) Are these polls actually capturing “real” Republican primary voters?
Here’s how most of these national media polls work: they start off calling around a thousand adults across the country. Of the thousand adults who participate in the poll, about 800 or so will report that they are a “registered voter.” Then, those people will be asked to identify with a party, or, if they are independent, which party they “lean” toward supporting. From that, these media polls wind up with a bucket of 300 to 450 people who say they are registered to vote and that they either are Republican or are an independent-leaning Republican.
These are the people who are then asked: “If the election were held today, for whom would you vote?”
The reality is that most of those people will never actually make it to the polls.
Let’s take Florida in 2012, where there were roughly 4 million registered Republicans in the state on the eve of the primary. Only Republicans are allowed to vote in Florida’s closed presidential primary. Fewer than half participated in the state’s presidential primary that year. A poll of all Florida Republican voters would have had a sample where about half of the people who responded never actually showed up to vote.
Now, we do know that people who take polls are slightly more likely to be politically engaged, so the mere fact that someone is actually taking the time to chat with a pollster does mean they are more likely to be an actual, honest-to-goodness likely voter. But still…there are a lot of people responding to the polls who are never going to set foot inside a voting booth come next spring.
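To put rough numbers on that, here is a minimal back-of-the-envelope sketch. It assumes, purely for illustration, that the subsample turns out at the same "about half" rate as Florida's registered Republicans in 2012; as noted above, poll respondents are probably somewhat more engaged than that, so treat the 0.5 as a hypothetical input rather than a measured figure. The function name `expected_voters` is made up for this example.

```python
def expected_voters(subsample_size: int, turnout_rate: float) -> float:
    """Expected number of respondents in a subsample who actually vote."""
    if not 0.0 <= turnout_rate <= 1.0:
        raise ValueError("turnout_rate must be between 0 and 1")
    return subsample_size * turnout_rate


# Subsample sizes taken from the 300-to-450 range described above.
for n in (300, 450):
    # 0.5 is an assumed turnout rate borrowed from the Florida 2012 example.
    voters = expected_voters(n, 0.5)
    print(f"{n} respondents -> roughly {voters:.0f} who would actually vote")
```

Under that assumption, a "GOP primary poll" of 300 to 450 people may be resting on only 150 to 225 people who will really cast a ballot.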
The assumption here from those who think Trump’s numbers are inflated is that by casting a very, very wide net with these polls, people who are not committed, “base” Republicans are being swept up into the subsample and that it is those people who are boosting Trump’s numbers. (There is plenty of data to suggest that Trump voters are less likely to vote than others.)
The prescription therefore would be to only survey those who have a proven track record, on the voter file, of participating in things like presidential primaries. (Or, alternatively, a method like the one used by Adrian Gray, someone I consider to be one of the brightest minds on the right, who starts with casting that wide net and surveying all New Hampshire Republicans, but also provides crosstabs of “likely” and “very likely” voters, allowing folks like you and me to see how different turnout scenarios might lead to different results.)
The upside to much more tightly screening respondents is that you weed out people who are unlikely to show up and theoretically hit a target that is much closer to the mark than “let’s just survey everyone and see how it goes.”
But there is a downside to tight screening.
Consider the odd coalition that Trump has assembled; Trump’s support does not appear to stem from any traditional “faction” of the GOP. For instance, he’s not overwhelmingly beloved by the tea party, the very conservative and the evangelicals, who tend to favor Cruz and Carson. Nor is he anything remotely close to an “establishment” pick. His support doesn’t come from any one group we think of as a “likely voter” bloc, but instead comes from people who are less educated and watch more television.
But you do hear the constant refrain from Trump supporters that he’s “energizing them” in a way nobody has previously. Furthermore, the blockbuster viewership numbers for these presidential debates, together with the fact that the field is so large and so fractured and that Trump is drawing so much attention to the contests, could very well mean that our definition of “likely voter” needs a re-think.
In 2008, Barack Obama won the Iowa caucuses in part by re-making the electorate. Screen too tightly for likely voters based on past participation, and you can easily miss out on someone changing the game and bringing new people into the process.
It’s certainly the case that Trump’s supporters don’t necessarily look like “likely voters” as we know them. But it’s not impossible to think that Trump could re-shape what it means to be a “likely voter.” This is, after all, a very strange election indeed.
2) Are people just SAYING they like Trump, but when it comes down to it, even if they go vote, they won’t REALLY choose him in the voting booth?
As we consider this question, let’s assume the people we are surveying are, generally, the right people. We then have the question of whether or not people lie (or, to be more charitable, misrepresent their views), and whether or not people who say they are voting for Trump really do mean it.
This always comes up on questions where there is “social desirability” bias, where respondents feel pressure to give an answer that is considered socially acceptable rather than tell the unvarnished truth about what lurks in their hearts and minds.
Harry Enten at FiveThirtyEight breaks out the polls, both nationally and in early states, and finds that there is a significant difference in Trump’s level of support when polls are done online or via “robopoll” versus with a live-interviewer. We call this a “mode effect,” where the result of a survey seems to be influenced by the method through which it is conducted.
In general, Trump does better when respondents don’t have to actually tell another live human being that they plan to vote for Trump.
Which, then, is the more accurate approach for gauging voter sentiment? It may depend on the nature of the contest you’re polling.
In New Hampshire, voters will disappear into a voting booth and select their choice for the nomination. They won’t have to tell another soul if they voted for Donald Trump. In this case, the fact that respondents in an online poll also get nearly perfect anonymity may more closely mirror the actual act of voting.
But in Iowa, or in other caucus states, the selection process will involve discussion and a public airing of one’s preferences. In that case, telling someone live over the phone how you feel may be a closer parallel to the caucus process.
All of which is to say that we may have a few different factors in play here: some people may say they plan to vote for Trump but, in reality, they haven’t had to deeply consider the question and so they just say the last name they heard on the news. However, some people may be afraid to say they support Trump, but in the privacy of a voting booth would make that preference known with their secret ballot.
Trump’s support may be overstated because we’re far enough out that people haven’t thought hard about their vote, and when online, people are more comfortable picking the person who amuses them rather than the person who would be Commander-in-Chief. But Trump’s true support level may be understated by live-interview polls if people feel that voting for Trump is a socially unacceptable behavior they nonetheless plan to engage in.
3) Let’s say we are talking to the right people, and those people are giving us an accurate reflection of their views on the race. What else could undermine the accuracy of these polls?
Horse-race polling is a tough business. The polling and market research industry is enormous, and the vast, vast, vast majority of what it does has nothing to do with the political horse-race, never asks a ballot question, and in a sense is never fully accountable for the accuracy of its results.
I can ask 1000 registered voters if they approve of Barack Obama’s job as president, and there’s really no way I can ever be proven “wrong” or have a “miss” on that question, short of some new technology being invented that allows for the instantaneous mind-reading of every registered voter in America.
But with the ballot, there is a test, a true answer, a final result against which the “accuracy” of polls is judged – fairly or unfairly. And as of late, there has been much hand-wringing about the accuracy of polling, mostly because of “misses” in horse-race polling. (There’s a reason why pollsters like Gallup are getting out of the “horse-race polling” business entirely.)
The fact of the matter is that good polling is expensive. People aren’t picking up the phone, they’re cutting the landline, and so you have to call dozens and dozens of phone numbers in order to get a single willing respondent. This costs a lot of money to do.
In case you haven’t noticed, media organizations and institutes of higher education are not exactly flush with cash these days. Budgets are tight, and few media organizations or university research institutes have the luxury of sinking tens of thousands of dollars into a large sample. As a result, we wind up with “GOP primary polls” with around 300-450 respondents.
Here’s the problem: that’s not a lot of people. In one sense, it is TOO many – lots of those 300 to 450 people are not actually going to vote in the GOP primary – but in another sense, it is very, very small.
The “margin of error” is one of the most abused and misunderstood concepts in coverage of political polling, but the margin of error for the subsamples we are seeing in these GOP primary polls is usually around +/-5%. (I will spare you all my full rant about the correct/incorrect way to think about margin of error, but if you are interested, subscribe to my podcast The Pollsters, where I go off on this topic on a fairly regular basis.)
That means that, even if everything else in the survey is perfect, and you have the exactly correct random sample frame and a beautiful response rate and everyone is telling you the truth, by simple random chance your result could be “off” by five percentage points, in either direction, for every candidate’s result. (For reasons I won’t go into here, margin of error actually shrinks when you are talking about very tiny sample proportions, so Lindsey Graham is not “1% in the polls but as high as 6% within margin of error,” but I digress.)
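For readers who want to see the arithmetic, the sketch below uses the standard textbook formula for a 95% margin of error on a proportion from a simple random sample, z * sqrt(p * (1 - p) / n) with z = 1.96. This is an assumption about how such figures are typically computed, not a description of any particular pollster's method; real polls also apply weighting and design effects that this ignores, and the normal approximation behind the formula gets shaky for candidates near 0%. The function name `margin_of_error` is made up for this example.

```python
import math


def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion p from a simple random sample of size n."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    return z * math.sqrt(p * (1.0 - p) / n)


# Worst case (p = 0.5) for the subsample sizes mentioned above.
for n in (300, 450):
    moe_points = 100 * margin_of_error(0.5, n)
    print(f"n={n}, candidate at 50%: +/- {moe_points:.1f} points")

# A candidate at 1% in a hypothetical subsample of 400 has a much narrower band.
moe_small = 100 * margin_of_error(0.01, 400)
print(f"n=400, candidate at 1%: +/- {moe_small:.1f} points")
```

With 300 to 450 respondents, the worst-case figure lands in the neighborhood of five points, while a candidate sitting at 1% has a margin of only about one point, which is why "as high as 6%" is the wrong reading.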
The point is, we are talking about a pretty small sample or “n-size” in most of these sub, sub samples. There’s a lot of uncertainty built into these polls from the get-go, simply because sample sizes are fairly small. Small samples, even well done, can still be off the mark due to unavoidable random variation.
I don’t know if Donald Trump will be the Republican nominee.
Every day we get closer to Iowa, and Trump stays well ahead in the polls, I suppose the odds of that outcome do increase. There are also a number of reasons why it might not come to pass.
People could change their minds, Trump could finally say something that pushes his supporters away (I’m not holding my breath), another candidate could become unbelievably compelling after a strong debate performance, a bomb could go off somewhere and reshape our national dialogue, someone’s Super PAC could drop a kajillion dollars onto the airwaves and re-arrange the race, etc. etc. There are a ton of unknowns and fifty-some days until the Iowa Caucuses.
But I also think it is important for us to separate out the “predictive” quality of these horse-race polls (they are not really predictive, despite everyone loving to use them as a forecasting metric) and the “validity” measures that give us reason to believe or doubt polls as they stand today.
When deciding if you “trust” the polls, I would encourage people to stop worrying about whether these polls are predictive, because they really aren’t. I do think we need to be very critical about whether or not these polls are valid measures of this current snapshot in time, and I think there are important questions to be raised on that front.
This article was reprinted with permission from Kristen Soltis Anderson.