Monday, January 2, 2012

Web Search Volumes and the Iowa Caucus

For the last couple of years, I have been musing about the meaning of web searches as an indicator of affective attachment to individual politicians and causes. My first cut at the problem was the belief that web searches are indicative of support. (In fact, Sylvia Manzano and I have an article forthcoming in Political Communication that argues as much.)

Taking that at face value for the moment, this figure shows the weekly Google Insights index scores, based on web searches originating in Iowa during 2011, for the five front-running candidates for the Republican presidential nomination.
In these data, Ron Paul has developed a commanding lead. For the last week of 2011, the candidates' Insights scores are:

Ron Paul---100
Mitt Romney---30
Rick Perry---23
Rick Santorum---22
Newt Gingrich---21

Since the Insights scores are proportional, we can translate them into fractions of overall "support." Assuming that the other candidates in the Iowa race will get about 10% of the Caucus vote between them and that Google Insights scores are directly comparable to levels of electoral support (which seems like a crazy assumption to make, but I am rolling with it for now), these data indicate vote percentages of:

Ron Paul---45%
Mitt Romney---14%
Rick Perry---11%
Rick Santorum---10%
Newt Gingrich---10%
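
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the conversion described above. It assumes, as I do in the text, that the five candidates share 90% of the vote in proportion to their Insights scores; the function and variable names are mine and purely illustrative. The unrounded shares it prints agree with the whole-number figures listed above to within a point.

```python
# Google Insights index scores for the last week of 2011 (from the post).
scores = {
    "Ron Paul": 100,
    "Mitt Romney": 30,
    "Rick Perry": 23,
    "Rick Santorum": 22,
    "Newt Gingrich": 21,
}

# Assumption from the post: the other candidates split about 10% of the
# Caucus vote, leaving 90% to divide among these five.
SHARE_FOR_FIVE = 90.0


def implied_vote_shares(scores, pool=SHARE_FOR_FIVE):
    """Split `pool` percentage points in proportion to each score."""
    total = sum(scores.values())
    return {name: pool * score / total for name, score in scores.items()}


shares = implied_vote_shares(scores)
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```

Changing SHARE_FOR_FIVE shows how sensitive the implied percentages are to the guess about how much of the vote goes to the rest of the field.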

Transforming the Google Insights scores into percentages of support draws some sharp contrasts with conventional polling of "likely" Caucus goers. Public Policy Polling of North Carolina, which seems to have the freshest poll heading into tomorrow night's Iowa Caucus, reports:
The Republican caucus in Iowa is headed for a photo finish, with the three leading contenders all within two points of each other. Ron Paul is at 20%, Mitt Romney at 19%, and Rick Santorum at 18%. Rounding out the field are Newt Gingrich at 14%, Rick Perry at 10%, Michele Bachmann at 8%, Jon Huntsman at 4%, and Buddy Roemer at 2%.
Both the poll and the web search volumes (treated as support) put Paul and Romney in the first two slots, but the poll shows the two much closer together, running roughly even with Rick Santorum, with Gingrich and Perry trailing a bit further back.
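
To put the two sets of numbers side by side, here is a small Python sketch that subtracts the PPP poll figure from the search-implied percentage for each of the five candidates. Both sets of figures are copied from this post; the variable names are illustrative, and the "gap" is just a simple difference in percentage points, not a formal measure of polling error.

```python
# Search-implied vote percentages (from the Insights conversion in this post).
search_implied = {
    "Ron Paul": 45,
    "Mitt Romney": 14,
    "Rick Perry": 11,
    "Rick Santorum": 10,
    "Newt Gingrich": 10,
}

# Public Policy Polling figures quoted in this post.
poll = {
    "Ron Paul": 20,
    "Mitt Romney": 19,
    "Rick Perry": 10,
    "Rick Santorum": 18,
    "Newt Gingrich": 14,
}

# Positive gap: searches imply more support than the poll reports.
gaps = {name: search_implied[name] - poll[name] for name in search_implied}

for name, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {gap:+d} points")
```

On these numbers, Paul is the only candidate whose search volume runs far ahead of his polling, while Santorum shows the largest gap in the other direction.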

Looking at the poll numbers and the Google Insights scores side-by-side leads me to think that, at least in the context of elections, web searches may be more usefully thought of as indicators of "stimulation" (in the Rabinowitz and Macdonald sense) than of support. Stimulation, of course, may result in support when candidates and potential voters are on the same side of the status quo policy (so long as a candidate preserves some adequate acceptability), but it can also yield opposition when candidates and potential voters are on opposite sides of the status quo.

This directional voting interpretation makes some sense in the case of Ron Paul. His strong, libertarian-ish stands have won him intense support in some quarters of the Republican Party while engendering intense opposition from other parts of the Party. Likewise, many regard him as an unserious candidate, beyond a region of acceptability, and therefore shy away from supporting him despite a policy-based inclination to do so.

In any event, moving ahead, if Ron Paul outperforms the polls and wins Iowa by any sizeable margin, we will have one more bit of evidence indicating that web searches can be a useful leading indicator of political support for cases in which traditional polling is difficult or unavailable. If not, we will have an additional reason to rethink what web search volume means, perhaps along the lines suggested above.

  1. Interesting analysis with which I mostly agree. However, you are missing two important aspects which have to be considered in order to evaluate whether Google Trends/Insights (GTI) can be used to forecast an election and how to use it:

    1) the number of voters involved: in my experience, at least 1 million voters are needed to have some reliable data to work with

    2) the internet penetration: at least 50% is needed to have some reliable data to work with

    Depending on where you stand on the previous two parameters, the quantitative methods to be used with GTI data are different. I refer to the article titled "GOOGLE TRENDS/INSIGHTS (GTI) AND POLITICAL FORECASTS: THE PROBLEM OF SELF-SELECTION BIAS AND THE FUTURE PROSPECTS" for a description of all the cases.

    In the specific case of Iowa, you have good internet penetration (79% according to the last data available), but a very low number of voters (in 2008, 119,188 people voted at the Republican caucuses). This puts us in case 3) of the previous article: any quantitative analysis has to be treated with great care and used mostly to get an idea of the underlying trends, rather than to attempt precise numerical forecasts. Unfortunately, given this low number of voters, the problem of the self-fulfilling bias can be very large.

    Regards, Gigi B