Kamala Harris Leads Donald Trump: How Polling Methodologies Shape the Narrative

Sep 28, 2024

Nate Cohn, chief political analyst at the New York Times, recently shared an insightful piece on the current state of polling in his newsletter, The Tilt. He points out that online surveys are less reliable than traditional polls. The key difference lies in how each type gathers its sample. Traditional surveys, often conducted through live phone interviews, randomly select households to ensure a representative sample of a target population. In contrast, many online surveys are opt-in, making it harder to achieve random sampling. Although various techniques are employed to mitigate this issue, Cohn, referencing a Pew Research Center study, emphasizes that online polls tend to face more data quality concerns. For instance, he highlights the challenge of identifying “bogus respondents” in online surveys.

It is important to recognize that not all online surveys are the same. Many are highly reputable and respected within the industry. As Cohn points out, YouGov’s opt-in panel, for example, requires participants to answer a set of demographic questions, which allows the firm to match its sample to the target population. For its political polls, which it conducts for outlets like CBS News, The Economist, and Yahoo! News, YouGov adjusts its data by weighting it against U.S. Census data and past election results. Other organizations, such as the Pew Research Center, NORC, and SSRS, also collect data online but use a probabilistic approach. These firms randomly invite a large group of people to join their panels and then randomly select members from those panels to complete their surveys, ensuring a more representative sample.
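The weighting step mentioned above can be sketched in a few lines of Python. This is only an illustration of post-stratification on a single variable. The age groups, population shares, and responses are invented, and it is not YouGov's actual procedure, which matches and weights on many variables at once.

```python
from collections import Counter

def poststratify(respondents, population_shares):
    """Weight each respondent so the weighted sample matches the
    population share of their group (one variable only)."""
    counts = Counter(r["group"] for r in respondents)
    n = len(respondents)
    weights = []
    for r in respondents:
        sample_share = counts[r["group"]] / n
        weights.append(population_shares[r["group"]] / sample_share)
    return weights

def weighted_support(respondents, weights, candidate):
    """Weighted share of respondents backing `candidate`."""
    backing = sum(w for r, w in zip(respondents, weights)
                  if r["choice"] == candidate)
    return backing / sum(weights)

# Hypothetical opt-in sample that over-represents younger respondents.
respondents = (
    [{"group": "18-44", "choice": "Harris"}] * 45
    + [{"group": "18-44", "choice": "Trump"}] * 25
    + [{"group": "45+", "choice": "Harris"}] * 12
    + [{"group": "45+", "choice": "Trump"}] * 18
)
# Hypothetical population targets (stand-ins for Census figures).
population_shares = {"18-44": 0.45, "45+": 0.55}

weights = poststratify(respondents, population_shares)
raw = sum(r["choice"] == "Harris" for r in respondents) / len(respondents)
adjusted = weighted_support(respondents, weights, "Harris")
print(f"raw: {raw:.3f}, weighted: {adjusted:.3f}")
```

In this made-up sample, younger respondents are both over-represented and more favorable to Harris, so weighting them down pulls her estimated support closer to what a representative sample would show.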

The rise of online polls has largely been driven by their ease of administration and lower cost compared to traditional methods. Conducting online surveys is quicker and less expensive, allowing researchers to reach large audiences with minimal resources. In contrast, traditional polls, while still considered more reliable by many experts, have become increasingly difficult and costly to conduct. Fewer people are willing to answer phone calls, especially from unknown numbers, and the growing reliance on cell phones, often without corresponding landlines, has made it harder to randomly sample households. As a result, traditional methods also struggle to maintain representative samples. These limitations have contributed to the growing adoption of online surveys, despite concerns about their sample quality and data reliability.

This is an important discussion, but it is not the main focus of this post. My objective is to put Cohn’s claims to the test by analyzing how various survey types reflect American voters’ attitudes toward the 2024 U.S. presidential election. ABC News/FiveThirtyEight’s polling database categorizes surveys based on their sampling methods. I will focus on polls conducted between August 1, 2024, and September 27, 2024, specifically looking at the head-to-head race between Vice President Kamala Harris and former President Donald Trump.

Sampling Methodologies

During the study period, there were 404 polls in FiveThirtyEight’s database, comprising 228 national polls conducted in August and 176 in September. The following graph summarizes the 14 methodologies used in these polls. It is worth noting that the database includes 30 polls that lack any information on their sampling methodologies. These were filtered out from this analysis.
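For readers who want to reproduce this kind of subset, here is a minimal Python sketch of the filtering step. The field names (`end_date`, `methodology`) and the sample rows are assumptions for illustration, not the database's actual schema, and I am assuming the date window is applied to each poll's end date.

```python
from datetime import date

START, END = date(2024, 8, 1), date(2024, 9, 27)

def select_polls(polls, start=START, end=END):
    """Keep polls that ended inside the window and report a methodology."""
    kept = []
    for poll in polls:
        if not poll.get("methodology"):   # missing or empty: drop
            continue
        if start <= poll["end_date"] <= end:
            kept.append(poll)
    return kept

# Hypothetical rows, not real polls.
polls = [
    {"pollster": "A", "end_date": date(2024, 7, 30), "methodology": "Online Panel"},
    {"pollster": "B", "end_date": date(2024, 8, 15), "methodology": "Live Phone"},
    {"pollster": "C", "end_date": date(2024, 9, 10), "methodology": None},
    {"pollster": "D", "end_date": date(2024, 9, 27), "methodology": "Probability Panel"},
]
selected = select_polls(polls)
print([p["pollster"] for p in selected])
```

Poll A falls outside the window and poll C has no methodology listed, so only B and D survive the filter.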

This graph captures two critical facts. First, as Cohn’s article suggests, most firms use online panels rather than other methodologies. Second, many pollsters are experimenting with different methodologies to adapt to a changing market. Some, like the Marist Poll, have combined live phone interviews with alternative approaches, such as text-to-web and probability panels, that are commonly employed in online surveys.

In this analysis, I will compare polls that utilize online panels, probability panels, and live phone interviews. After the election concludes, we can assess which polls most accurately predicted the winner and determine which sampling methods were the most effective.

Online Panels

My subsetted database includes 270 polls that used online panels. As noted above, not all online polls are the same. Some pollsters, like YouGov, are highly regarded. Others, however, have more questionable reputations, as Cohn argues in his piece.

The graph below shows the two-way race between Harris and Trump as captured by polling firms that use online panels.

Each point on the scatterplot represents the level of support for each candidate in a specific poll. In addition to individual data points, the chart includes a LOESS regression trend line for both Harris and Trump, providing a smoothed visualization of the overall trends in voter support over time. This helps to highlight any emerging patterns or shifts in voter preferences that may not be immediately visible from individual polls alone.
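To show what a LOESS trend line does, here is a minimal pure-Python sketch of the core idea: for each point, fit a weighted straight line to its nearest neighbors and read off the fitted value. It is a simplified stand-in, not the routine used to draw the charts in this post. Statistical packages add options such as local quadratic fits and robustness iterations. The `frac` parameter and the sample data are assumptions for illustration.

```python
def loess(xs, ys, frac=0.5):
    """Smooth ys over xs with locally weighted linear regression.

    For each x, fit a weighted line to the nearest `frac` share of
    points (tricube weights) and evaluate that line at x.
    """
    n = len(xs)
    k = max(2, int(round(frac * n)))            # neighbors per local fit
    smoothed = []
    for x0 in xs:
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1.0                 # bandwidth: k-th nearest distance
        w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0   # flat line if no spread in x
        smoothed.append(my + slope * (x0 - mx))
    return smoothed

# Hypothetical data: days since August 1 and Harris support (%) per poll.
days = [0, 3, 7, 12, 15, 21, 26, 33, 40, 47, 52, 57]
harris = [47.0, 48.5, 47.5, 49.0, 48.0, 50.0, 49.5, 49.0, 50.5, 49.5, 51.0, 50.0]
trend = loess(days, harris, frac=0.5)
```

A smaller `frac` follows individual polls more closely, while a larger one produces a smoother line that is slower to react to recent shifts.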

If we average the polls for the last 10 days, Harris leads Trump by 4 points, 51% to 47%.
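The 10-day average is a simple unweighted mean, which can be sketched as follows. I am assuming here that the window covers polls whose end date falls in the 10 days up to and including the last day of the study period. The field names and sample rows are hypothetical.

```python
from datetime import date, timedelta
from statistics import mean

def trailing_average(polls, last_day, days=10):
    """Average each candidate's support over polls ending in the
    `days`-day window that closes on `last_day` (inclusive)."""
    first_day = last_day - timedelta(days=days - 1)
    recent = [p for p in polls if first_day <= p["end_date"] <= last_day]
    if not recent:
        raise ValueError("no polls in the window")
    return {
        "harris": mean(p["harris"] for p in recent),
        "trump": mean(p["trump"] for p in recent),
        "n_polls": len(recent),
    }

# Hypothetical polls, not real results.
polls = [
    {"end_date": date(2024, 9, 10), "harris": 52, "trump": 44},  # too old
    {"end_date": date(2024, 9, 18), "harris": 50, "trump": 46},
    {"end_date": date(2024, 9, 22), "harris": 48, "trump": 47},
    {"end_date": date(2024, 9, 27), "harris": 49, "trump": 45},
]
avg = trailing_average(polls, last_day=date(2024, 9, 27))
print(avg)
```

Every poll in the window counts equally in this sketch, regardless of sample size or pollster, so a firm that publishes frequently has more influence on the average.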

Probability Panels

In my dataset, there are 34 surveys conducted using probability panels. Similar to online panel surveys, respondents complete the questionnaire online. However, pollsters using probability panels go a step further by randomly inviting a large group of people to join the panel. From this pool, they then randomly select members to participate in each survey, ensuring a more representative sample of the population.
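The two-stage selection described above can be illustrated with a short Python sketch. The population and panel sizes are invented, and real probability panels must also handle people who decline the invitation or skip a survey, which this sketch ignores.

```python
import random

def recruit_panel(population, panel_size, rng):
    """Stage 1: invite a random subset of the population to the panel."""
    return rng.sample(population, panel_size)

def draw_survey_sample(panel, sample_size, rng):
    """Stage 2: randomly pick panel members for one survey."""
    return rng.sample(panel, sample_size)

rng = random.Random(2024)              # seeded so the run is repeatable
population = list(range(100_000))      # stand-in IDs for the target population
panel = recruit_panel(population, 5_000, rng)
sample = draw_survey_sample(panel, 1_000, rng)
```

The key contrast with an opt-in panel is that the pollster, not the respondent, decides who is invited at both stages.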

The next graph shows how probability panels view the state of the presidential race.

While Harris still leads Trump, this graph shows that support for both candidates has declined since early August. The 10-day average shows Harris leading Trump by 5 points, 49% to 44%.

Live Phone Polls

Only 18 surveys were conducted using live phone interviews. Although these traditional polls represent a smaller portion of the overall dataset, experts generally regard them as more accurate. This is one reason why the results from live phone polls often receive significant media attention.

Compared to the other two graphs, the next one reveals a more competitive race.

While Harris maintains a slight lead over Trump in these polls, her advantage is within the margin of error. On average over the last 10 days, 47.5% of American voters support Harris, while 46.5% back Trump.

Which Set of Polls Is Correct?

While these polls offer a snapshot of the presidential race, they are not all of equal quality. Analysts at FiveThirtyEight and polling experts such as Nate Silver have dedicated significant effort to evaluating each poll and its methodology. In FiveThirtyEight’s polling database, each pollster is rated based on “empirical accuracy” and “methodological transparency.” The highest-rated pollsters score closer to 3, while those with less reliable methods score closer to 0.

In my subset of polls archived in FiveThirtyEight’s database, the average pollster rating is 2.07. The next plot illustrates the distribution of pollster ratings across various methodologies used in the analyzed surveys. By comparing these distributions, we can identify trends in how different polling methods are ranked by FiveThirtyEight in terms of accuracy and reliability, highlighting the variability in ratings among different approaches.
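A summary like this comes from grouping polls by methodology and looking at each group's ratings. The sketch below shows one way to compute the count and mean rating per methodology. The field names and rows are hypothetical, not values from the database.

```python
from collections import defaultdict
from statistics import mean

def ratings_by_methodology(polls):
    """Return {methodology: (number of polls, mean pollster rating)}."""
    grouped = defaultdict(list)
    for p in polls:
        grouped[p["methodology"]].append(p["rating"])
    return {m: (len(r), mean(r)) for m, r in grouped.items()}

# Hypothetical rows, not real ratings.
polls = [
    {"methodology": "Online Panel", "rating": 1.5},
    {"methodology": "Online Panel", "rating": 2.0},
    {"methodology": "Online Panel", "rating": 2.5},
    {"methodology": "Live Phone", "rating": 2.5},
    {"methodology": "Live Phone", "rating": 3.0},
]
summary = ratings_by_methodology(polls)
for method, (count, avg_rating) in sorted(summary.items()):
    print(f"{method}: {count} polls, mean rating {avg_rating:.2f}")
```

Reporting the count next to the mean matters because a methodology with only a handful of polls can have a high or low average by chance.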

The histograms in the plot are organized alphabetically. When interpreting them, it is important to consider the number of polls each one represents. The plot reveals that polls employing live phone interviews generally receive higher ratings than those relying solely on opt-in online panels. Probability panel polls are an exception among online surveys: although respondents answer online, these polls achieve higher ratings than opt-in online panels.

Next, let us examine the top four methodologies represented by the most polls in my dataset.

The density plots are arranged from the methodology with the lowest ratings to the one with the highest. Given that polls based on online panels tend to score below the average, should we dismiss them in favor of surveys that use live phone interviews, probability panels, or hybrid approaches that combine live phone interviews with other methodologies?

Dismissing surveys that rely on online panels is not a wise strategy. As previously noted, many experts assert that some of these surveys are reliable. Furthermore, Cohn’s critique does not apply to YouGov, which FiveThirtyEight ranks among its top four pollsters with a rating of 3.

One viable approach is to aggregate all the polls, regardless of their numeric ratings, and compute an overall average. This method allows us to consider the full range of polling data, potentially uncovering broader trends.

In this context, a 10-day average indicates that Harris leads Trump 49.7% to 46%.

Alternatively, we can filter out any polls from pollsters that FiveThirtyEight rates lower than 2. This method focuses the analysis on higher-quality polls, so that only pollsters with a proven track record of reliability contribute to our findings. By concentrating on more reputable surveys, we can gain a clearer understanding of the prevailing sentiments among voters, free from the noise created by less trustworthy data.

When we average the results of the last 10 days of these filtered polls, Harris leads Trump by 3.5 points, 49.1% to 45.6%. Restricting the average to higher-rated pollsters therefore narrows her lead only slightly compared with the all-polls average, from 3.7 points to 3.5.
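The effect of a rating cutoff can be sketched in Python as follows. The function averages support with and without a minimum rating, treating the cutoff as inclusive so that polls rated exactly 2 are kept. The field names and rows are hypothetical and are chosen only to show how the filter changes the margin.

```python
from statistics import mean

def average_lead(polls, min_rating=None):
    """Mean Harris and Trump support and the gap between them,
    optionally keeping only polls rated at least `min_rating`."""
    if min_rating is not None:
        polls = [p for p in polls
                 if p["rating"] is not None and p["rating"] >= min_rating]
    harris = mean(p["harris"] for p in polls)
    trump = mean(p["trump"] for p in polls)
    return harris, trump, harris - trump

# Hypothetical polls, not real results.
polls = [
    {"rating": 1.2, "harris": 52, "trump": 46},
    {"rating": 1.8, "harris": 51, "trump": 46},
    {"rating": 2.4, "harris": 49, "trump": 46},
    {"rating": 3.0, "harris": 48, "trump": 47},
]
all_polls = average_lead(polls)
top_rated = average_lead(polls, min_rating=2)
print(f"all polls lead: {all_polls[2]:.2f}, rating >= 2 lead: {top_rated[2]:.2f}")
```

Because the cutoff discards data, the filtered average rests on fewer polls, so it trades a larger sample for a more trusted one.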

Concluding Thoughts

Which approach is right? There is no clear answer. Polls are snapshots of voters’ attitudes at a specific moment in time, and, as Cohn’s discussion of polling methodologies makes clear, how a poll is conducted shapes what that snapshot shows. When we select one method over another, we may inadvertently introduce bias into our results. There are compelling reasons to aggregate the polls without considering methodological rigor, as this can reveal overarching trends in voter sentiment. Conversely, focusing exclusively on polls with a stronger historical reputation or more rigorous methodologies may provide a clearer picture of the electorate’s true preferences.

As Cohn’s analysis suggests, the landscape of polling is evolving, particularly with the rise of online surveys. In examining the current state of the race, we see that Harris leads in all scenarios; however, the quality of the polls significantly influences these results. According to FiveThirtyEight’s pollster ratings, higher-rated polls tend to show a tighter race between Harris and Trump compared to less reputable polls. This indicates that while Harris maintains an advantage, the margin of support may be narrower than suggested by lower-quality surveys.

Ultimately, the choice between these approaches hinges on the goals of our analysis. If we aim to capture a broad spectrum of voter opinions, aggregating all polls could be beneficial. However, if our objective is to ensure the reliability of our findings, prioritizing higher-rated polls may yield more trustworthy insights. As the 2024 election approaches, the ongoing discourse around polling methodologies will continue to shape our understanding of the political landscape, emphasizing the complexities involved in interpreting voter sentiment and the necessity for thoughtful analysis.

Carlos’s Substack
