Wikipedia Attention for the Presidential Elections (Update)

Here’s another update on the analysis of Wikipedia data for the presidential candidates. What’s quite interesting, the attention value vor Mitt Romney is almost at the same level where Barack Obama has been four years ago. And Barack Obama is exactly where John McCain has been 2008:

Obama vs. Romney 2012 compared to Obama vs. McCain 2008 (Wikipedia data)
Obama vs. Romney 2012 compared to Obama vs. McCain 2008 (Wikipedia data)

But one thing has changed: The elections as such are much more interesting to the Wikipedia users than they were 2008:

US Presidential Elections 2012 vs. 2008 (Wikipedia, daily visits)
US Presidential Elections 2012 vs. 2008 (Wikipedia, daily visits)

2012 there is no pre-ballot gap as there has been four years ago.

Why the 2012 US elections are more exciting than 2008

Here’s an addition to my last post on using Wikipedia data to analyse attention for the US presidential elections 2012. Here’s another look at the interest not for the candidates’ Wikipedia pages but the general pages for the elections 2008 and 2012. Compared to the candidates’ pages, the attention for the general election page is much lower than for the candidates. Here’s the average values for October 2012:

  1. Mitt Romney (2012): 98,138 Views / day
  2. Barack Obama (2012): 63,104 Views / day
  3. United States presidential election, 2012 (2012): 38,770
  4. United States presidential election, 2008 (2008): 27,907

This monthly average hints at the 2012 elections being very exciting as the general election pages on Wikipedia have seen a 39% traffic increase compared to last elections. This also hold for the following time-series:

US Presidential Elections 2012 vs. 2008 (Wikipedia, daily visits)
US Presidential Elections 2012 vs. 2008 (Wikipedia, daily visits)

While the attention for the election pages in 2012 did not reach the level it had during the 2008 primaries, from mid October the 2012 campaigns were much more interesting according to the Wikipedia numbers. In 2008 we have seen a drop in attention before election day, in 2012 the suspense seems to build up.

Wikipedia Attention and the US elections

One of the most interesting challenges of data science are predictions for important events such as national elections. With all those data streams of billions of posts, comments, likes, clicks etc. there should be a way to identify the most important correlations to make predictions about real-world behavior such as: going to the voting booth and chosing a candidate.

A very interesting data source in this respect is the Wikipedia. Why? Because Wikipedia is

  1. a) open (data on page-views, edits, discussions are freely available on daily or even hourly basis),
  2. b) huge (WP currently ranks as #6 of all web sites worldwide and reaches about a quarter of all online users),
  3. c) specific (people visit the Wikipedia because they want to know something about some topic)

The first step was comparing the candidates Barack Obama and Mitt Romney over time. The resulting graph clearly shows the pivoting points of Obama’s presidential career (click to zoom):

Obama vs. Romney 2009-2012 (Wikipedia data)
Obama vs. Romney 2009-2012 (Wikipedia data)

But it also shows how strong Mitt Romney has been since the Republican primaries in January 2012. His Wikipedia page had attracted a lot more visitors in August and September 2012 than his presidential rival’s. Of course, this measure only shows attention, not sentiment. So it cannot be inferred from this data whether the peaks were positive or negative peaks. In terms of Wikipedia attention, Romney’s infamous 47% comments in September 2012 were more than 1/3 as important as Obama’s inauguration in January 2009.

Now, let’s add some further curves to this graph: Obama’s and McCain’s Wikipedia attention during the last elections:

Obama vs. Romney 2012 compared to Obama vs. McCain 2008 (Wikipedia data)
Obama vs. Romney 2012 compared to Obama vs. McCain 2008 (Wikipedia data)

Here’s another version with weekly data:

Obama vs. Romney 2012 compared to Obama vs. McCain 2008 (Wikipedia data, weeks)
Obama vs. Romney 2012 compared to Obama vs. McCain 2008 (Wikipedia data, weeks)

It’s almost instantly clear how much more attention Obama’s 2008 campaign (in red) gathered in comparison with his 2012 campaign (in green). On the other hand, Mitt Romney is at least when it comes to Wikipedia attention more interesting than McCain had been.

Here’s a comparison of Obama’s 2008 campaign vs. his 2012 campaign:

Obama 2008 vs. Obama 2012 (Wikipedia data)
Obama 2008 vs. Obama 2012 (Wikipedia data)

The last question: Is Mitt Romney 2012 as strong as Obama had been in 2008? Here’s a direct comparison:

Obama 2008 vs. Romney 2012 (Wikipedia data, weekly)
Obama 2008 vs. Romney 2012 (Wikipedia data, weekly)

A side-remark: I also did a correlation of this data set with Google Correlate. And guess what: The strongest correlation of the data for Obama’s 2012 campaign is the Google search query for “barack obama wikipedia”. There still seem to be a huge number of people using Google as their Wikipedia search-engine.

Google Correlate result for the Wikipedia time series "Barack Obama"
Google Correlate result for the Wikipedia time series “Barack Obama”

But this result could also be interpreted the other way round: If there is a strong correlation between Wikipedia usage and Google search queries, this makes Wikipedia an even more important data source for analyses.