Tuesday, January 3, 2012

Google Insights Iowa Caucus Projections: Update Through January 2

Through my new Italian correspondent, I have learned that Google has updated its Insights search index data for Iowa through yesterday, January 2. So it's now possible to update the forecast for tonight's Caucus a bit. I will use the same assumptions as before: "other candidates in the Iowa race will get about 10% of the Caucus vote between them [which I now realize is probably too low] and that Google Insights scores are directly comparable to levels of electoral support, these data indicate the following vote percentages." I have, however, fiddled a little with the search terms used for this forecast compared to the ones I rolled out earlier, having now observed that some candidates' last names alone (Santorum and Gingrich) return both unambiguous search results (unlike "Perry", where Katy Perry gets mixed in with Rick Perry) and higher Google Insights scores than the combination of the candidates' first and last names, which I had been using in my previous posts.

The Google Insights scores from January 2 and the resulting forecasts are:

Ron Paul---100 (32.5%)
Rick Santorum---75 (24.4%)
Mitt Romney---49 (15.9%)
Newt Gingrich---31 (10.1%)
Rick Perry---22 (7.1%)
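
For readers who want to check the arithmetic, here is a minimal sketch of how the percentages above follow from the two stated assumptions: the listed candidates split whatever is left after "all others" take 10%, and each candidate's share is proportional to his Google Insights score. The function name and the constant are my own labels for illustration, not part of any Google tool.

```python
# Google Insights scores for January 2, as listed in the post.
scores = {
    "Ron Paul": 100,
    "Rick Santorum": 75,
    "Mitt Romney": 49,
    "Newt Gingrich": 31,
    "Rick Perry": 22,
}

# Assumption from the post: all other candidates get about 10% between them.
OTHERS_SHARE = 10.0


def forecast(scores, others_share=OTHERS_SHARE):
    """Scale scores so the listed candidates split (100 - others_share) percent."""
    total = sum(scores.values())
    available = 100.0 - others_share
    return {name: available * score / total for name, score in scores.items()}


shares = forecast(scores)
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```

Raising the "all others" figure above 10% shrinks every listed candidate's share proportionally but leaves the ordering unchanged.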

Subjectively, I think these data are over-reporting Ron Paul's standing a bit, since his supporters tend to be younger people who are (probably) more likely to use the internet, and under-reporting Santorum's standing, since his supporters tend to be older social conservatives who are (probably) less likely to be online. Also, I am pretty sure I have overstated the support for all of the included candidates, since I have low-balled the estimate for "all others" at 10% total. I can't include them individually, though, since the public Google Insights tool limits users to five search terms at a time. Still, the data are consistent with my own baseless sense of how the candidates will line up when the votes are tallied: Paul, Santorum, and Romney taking the top three spots, in that order.


  1. What a night! Here is the ex post summary of my comments and forecasts:

    - It was forecast to be a dead heat among Santorum, Romney and Paul, and so it was.

    - The correction factor of 0.5 to transform the Google data for Paul into real votes was correct.

    - The correction factor of 1.1 to transform the Google data for Romney into real votes was an underestimate (the true value was 1.5).

    - I was expecting more from Ron Paul, but this was not the case. A warning sign was probably the interview he gave recently, where he said that he did not see himself in the Oval Office: if you enter a race, you fight to win. Anyway, the race has just started; let's see what will happen.

  2. In the end, this caucus confirmed what I said in my comments over the past few days: when the number of votes is smaller than 1 million, any numerical forecast based on Google data has to be treated with extreme care and a large grain of salt, due to the problem of self-selection bias.