June 28, 2013

The Geography of #StandWithWendy Tweets

The filibuster by Texas State Senator Wendy Davis on June 25th to block a new piece of legislation that would have resulted in many more restrictions on abortion in Texas brought a lot of attention to the Lone Star State this week. Day-long filibusters, parliamentary machinations, vocal protesters, and changed time stamps on votes all make for great political theater, even more so when they involve a highly contentious issue and inter-party fights. From our perspective, one of the most compelling elements of this story was the strong response that this event engendered within social media (including Twitter). In the course of a few hours, tens (or even hundreds) of thousands of tweets were sent using the hashtag #standwithwendy to show support for the senator's efforts.

We collected all geocoded tweets from June 25th and 26th that contained the text "standwithwendy", resulting in a dataset of 3,702 tweets. Although we are primarily interested in the spatial dimension of tweeting activity, the way this event played out over time is particularly interesting. Using our dataset, one can see how this event - or at least its reflection in Twitterspace - started building around 8pm on the 25th and peaked around midnight as the deadline for the special session neared, though it maintained momentum well into the early hours of the 26th, when the legislative session was officially declared over and the bill defeated.

Temporal Distribution of Relative Frequency of 
Tweets Containing #StandWithWendy at the County Level
Blue = relatively more #StandWithWendy Tweets; Red = relatively fewer 
Source: DOLLY, n = 3702 #StandWithWendy tweets on June 25th and 26th, 2013; Normalized by the total number of tweets sent during the same time period; The peak is at ~700 tweets right around midnight

Returning to our primary interest in the spatial distribution of tweets, it should come as no surprise that Texas had by far the most tweets, around a thousand in all, or 28.7% of all tweets with the aforementioned hashtag. Texas is home to six of the twenty largest cities in the US, and thus is likely to have a significant number of tweets based on its population alone. Even so, the state is over-represented in the corpus of #StandWithWendy tweets by ~3.5x relative to its share of the total US population (Texas constitutes around 8% of the country's population), so there is an obvious localizing effect that comes with being the epicenter of this debate. But the phenomenon was far from limited to Texas: many tweets came from around the country, and outside Texas their distribution much more closely resembles the distribution of population.
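For readers who want to check the arithmetic behind that over-representation figure, here is a minimal sketch. It uses only the numbers quoted above (3,702 tweets, a 28.7% share for Texas, and a population share of roughly 8%); the variable names are ours and purely illustrative.

```python
# Figures quoted in the post; the 8% population share is an approximation.
total_tweets = 3702
texas_share_of_tweets = 0.287
texas_share_of_population = 0.08

# Approximate number of #StandWithWendy tweets geocoded to Texas.
texas_tweets = round(total_tweets * texas_share_of_tweets)

# How many times larger Texas' share of tweets is than its share of population.
over_representation = texas_share_of_tweets / texas_share_of_population

print(texas_tweets, round(over_representation, 2))
```

A ratio of 1.0 would mean Texas tweeted exactly in proportion to its population; anything above that indicates over-representation.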

Percentage of Tweets by State (blue text) & Location of Each Tweet (pink dot)
Source: DOLLY, n = 3702 #StandWithWendy tweets on June 25th and 26th, 2013; Darker shading indicates greater intensity
The spatial differences are particularly telling when one looks not just at the raw number of tweets, but rather at a value normalized by the total number of tweets sent during this time. Doing so allows us to avoid simply highlighting those places with a large number of people by comparing a given place's production of #StandWithWendy tweets relative to its 'usual' tweet output.
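One plausible way to compute such a normalized value is sketched below. This is an assumption about the general approach, not a description of the exact calculation behind our maps: each place's share of hashtag tweets is divided by the share across all places, so values above 1 mean relatively more #StandWithWendy tweets than average and values below 1 mean relatively fewer. The function name and the counts are hypothetical.

```python
def relative_frequency(hashtag_counts, total_counts):
    """Return each place's hashtag share divided by the overall hashtag share."""
    overall_share = sum(hashtag_counts.values()) / sum(total_counts.values())
    return {
        place: (hashtag_counts.get(place, 0) / total_counts[place]) / overall_share
        for place in total_counts
        if total_counts[place] > 0  # skip places with no tweets at all
    }

# Hypothetical counts, invented purely for illustration (not taken from DOLLY).
hashtag_counts = {"A": 30, "B": 10, "C": 10}
total_counts = {"A": 1000, "B": 1000, "C": 3000}

ratios = relative_frequency(hashtag_counts, total_counts)
for place, ratio in sorted(ratios.items()):
    print(place, round(ratio, 2))
```

In this toy example place C sends as many hashtag tweets as place B, but because it sends three times as many tweets overall, its normalized value is only a third as large.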

The map below shows this normalized distribution. Darker shaded states have relatively more tweets containing #StandWithWendy than the national average, and lighter states have relatively fewer tweets. The darker the shading, the greater the intensity. That Texas remains shaded dark grey in this map is further indication of the above point that its high volume of tweets in this case goes beyond simply its mass of population. It also becomes evident that the large amount of tweeting in California and New York depends more on their population than on any unusual interest in the issue by users in those states. South Carolina and Kentucky were the biggest standouts in terms of having relatively few tweets on the subject.

Geographic Distribution of Relative Frequency of 
Tweets Containing #StandWithWendy at the State Level
Source: DOLLY, n = 3702 #StandWithWendy tweets on June 25th and 26th, 2013; Darker shading indicates greater intensity; Normalized by the total number of tweets sent during the same time period
Overall, there seems to be a general pattern of more tweets in the Northeast, Upper Great Plains and West Coast, while states in the Southeast, Mid-Atlantic, Midwest and Southwest have relatively fewer.  But as this is a quick analysis, we'd caution against reading too much into this.

We can also look into the relative amount of tweets at the county level. The map below shows a small section of the country from Texas to South Carolina. One can see that Austin, the location of the state capitol and Senator Davis' filibuster, is heavily over-represented in the number of tweets, as are many other places in Texas. One of the most interesting patterns is found within larger metropolitan areas, where the level of tweeting activity around #StandWithWendy varies widely between neighboring counties, as in Atlanta.

Geographic Distribution of Relative Frequency of 
Tweets Containing #StandWithWendy at the County Level
Blue = relatively more #StandWithWendy Tweets; Red = relatively fewer; White = no tweets; Darker shading indicates greater intensity
Source: DOLLY, n = 3702 #StandWithWendy tweets on June 25th and 26th, 2013; Normalized by the total number of tweets sent during the same time period

June 24, 2013

Mapping Zombies Book Chapter Now Available!

Not often do we get to write about zombies, internet geography, German Wikipedia articles, cats, and goatse.cx all in the same chapter. But that is precisely what we got to do when working on our newest book chapter "Mapping Zombies".

Feel free to download a pre-publication version of the paper below:

Graham, M., Shelton, T., and Zook, M. 2013. Mapping Zombies: A Guide for Pre-Apocalyptic Analysis and Post-Apocalyptic Survival. In Zombies in the Academy: Living Death in Higher Education. Eds. Whelan, A., Walker, R., and Moore, C. Chicago: University of Chicago Press.

Zombies exist, though perhaps not in an entirely literal sense. But the existence, even the outright prevalence, of zombies in the collective social imaginary gives them a ‘realness’, even though a zombie apocalypse has yet to happen. The zombie trope exists as a means through which society can playfully, if somewhat grimly and gruesomely, discover the intricacies of humanity’s relationship with nature and the socially constructed world that emerges from it. In this chapter, we present an analysis of the prevalence of zombies and zombie-related terminology within the geographically grounded parts of cyberspace, known as the geoweb (see also Haklay et al. 2008 and Graham 2010). Just as zombies provide a means to explore, imagine and reconstruct the world around us, so too do the socio-technical practices of the geoweb provide a means for better understanding human society (Shelton et al. 2013; Graham and Zook 2011; Zook et al. 2010; Zook and Graham 2007). In short, looking for and mapping geo-coded references to zombies on the web provides insight on the memes, mechanisms and the macabre of the modern world. Using a series of maps that visualize the virtual geographies of zombies, this chapter seeks to comprehend the ways in which both zombies and the geoweb are simultaneously reflective of and employed in producing new understandings of our world.

June 07, 2013

The Maps of IronSheep 2013

It's been about a month and a half since our IronSheep maphacking event at the AAGs in Los Angeles, but with the end of semester, the Geography of Hate map and a number of other goings on around Floatingsheep HQ, we've been negligent in posting the results. It was another great year, with about 35 participants divided up into seven teams (see below). But we'd like to give a special thanks to Rohit Shukla and Mike Rudis at LARTA for being such fantastic hosts, as well as John Yaist and Tim Flewelling at Esri for providing the resources for some pretty sweet prizes.

For reasons of propriety/reputation (you'll know why when you see some of the results), we're not releasing the names of who belongs to which team... but you know who you are! The rules of the event and the list of data made available are at the bottom of the post in case you are interested in the details.

For the actual maps used in the presentations (albeit cleaned up a bit as we try to run a PG-13 blog) see the powerpoint at slideshare embedded below.

Team Bo Peep: Justin Bieber and p0rn
Team Ewe: Gangs and Gangnam
Team Feta: A Field Guide to Tweeter Types
Team Ram: Using Argentine Racing Sheep as a Peri-Urban Transport System
Team Wool: Hipsters and Lattes
Team Mutton: Exploring the Spacio-cultural dimensions of Furries


  1. We collected all geocoded Tweets in LA county from June 2012 to April 2013 using the DOLLY system.
  2. Keyword topics included a range of cultural, political and activity based indicators within the tweet text.  
    • The full list of terms included "Beer", "wine", "marijuana", "beer pong", "Zombies", "hipster", "traffic", "accident", "surf* AND !web", "beach", "AK47 OR AK-47 OR "AK 47", "AR15 OR AR-15 OR "AR 15", "shooting*", "happy", "sad", "scared", "ghetto", "danger", "korean taco", "foodtruck OR "food truck", "sushi", "burrito", "latte", "hollywood", "celebrity", "actor* OR actress*", "movie star", "screenwriter OR "screen writer", "broken dream", "beiber", "Lindsay Lohan", "Matthew McConaughey", "hippie*", "yoga", "vegan", "organic", "earthquake", "porn or p0rn", "sunny", "the 405", "gangs", "bloods", "crips", "bloods AND !crips", "crips AND !bloods", 
  3. Everyone got the same data and was allowed one special data pull as their “secret sauce”.


  • Sheep come in herds, so work in your group.
  • Come up with an entertaining or interesting question, and answer it with a geo-visualization.
  • Ask a question that will help us save the world. And answer it with a geo-visualization.
  • Bonus point for the gratuitous use of sheep.
  • A series of visualizations would be great.
  • 60 second lightning presentation of your visualizations.
  • Prizes will be awarded by voting

June 01, 2013

Mapping Controversy in Wikipedia

Wikipedia, the collection of 37 million articles that anyone can edit, is defined by conflict. The ability for anyone to shape this global repository of knowledge inevitably means that we are presented with fascinating, shocking, and often hilarious discussions on the talk pages of articles. Just check out the talk pages of articles about Barack Obama, the Persian Gulf, and Freddie Mercury (or, if you really want to waste an afternoon, dive into Wikipedia's collection of 'lamest edit wars').

So, a natural question for us was whether we can model and map the controversiality of Wikipedia articles. Does controversy have distinct geographies? It turns out that it does.

To quantify the controversiality of an article based on its editorial history, we focused on “reverts”, i.e. when an editor undoes another editor’s edit completely. We counted all of the reverts in the history of every article and gave a higher weight to editors that revert each other repeatedly. To validate everything, we measured the classifier against human judgement. If you want to read more about the method, check articles by friends of the sheep here or here.
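To give a feel for how revert counts might be turned into a score, here is a toy sketch. It is emphatically not the classifier described in the papers: the weighting rule (a squared bonus for reverts that go back and forth between the same two editors), the function name, and the editor names are all assumptions made up for illustration.

```python
from collections import Counter

def controversy_score(reverts):
    """Toy score for one article from (reverter, reverted) editor pairs.

    Every revert counts once; editors who revert each other repeatedly
    add an extra bonus (a hypothetical weighting, chosen for illustration).
    """
    directed = Counter(reverts)  # number of reverts in each direction
    seen = set()
    score = 0
    for (a, b), n_ab in directed.items():
        pair = frozenset((a, b))
        if a == b or pair in seen:
            continue  # ignore self-reverts and pairs already handled
        seen.add(pair)
        n_ba = directed.get((b, a), 0)
        mutual = min(n_ab, n_ba)  # back-and-forth exchanges between the pair
        score += n_ab + n_ba + mutual ** 2
    return score

# Hypothetical revert histories with the same number of reverts (four each).
one_sided = [("alice", "bob")] * 4
edit_war = [("alice", "bob"), ("bob", "alice")] * 2

print(controversy_score(one_sided), controversy_score(edit_war))
```

Both histories contain four reverts, but the one in which two editors keep undoing each other scores higher, which is the intuition behind weighting mutual reverts more heavily.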

This all allowed us to get a sense of what the most controversial articles in each Wikipedia language edition are. In English, the most controversial article is George W. Bush, followed by Anarchism, followed by Muhammad. In French, by contrast, the top three most controversial articles are Ségolène Royal, UFOs, and Jehovah's Witnesses (we're certain there are some good jokes hiding in the orders of these lists). For the full list of top-10 controversial articles in ten languages, check out our in-press chapter on the topic (or look at the complete lists here and an interactive visualisation of Wikipedia conflicts at this link). But the short version is that at the top of the lists in multiple languages we see articles related to religion, politics, and football; i.e. pretty much exactly what you would expect people to be arguing about.

But what about the geography of these controversial articles in different languages? Where do we see the most controversial articles in different languages? Below is the full list of maps that we created:

What do these maps tell us? First, we see an interesting amount of difference between the various language editions of Wikipedia. Some of the smaller Wikipedias have a high degree of self-focus in articles that are characterized by the greatest degree of conflict (check out some of Brent Hecht's work for more on this). For instance, we see articles with the highest amount of conflict in the Czech and Hebrew Wikipedias being about the Czech Republic and Israel respectively.

Even when looking at large languages that are primarily spoken in more than one country, we are able to see that a significant amount of self-focus occurs (look at the Arabic and Spanish maps of conflict for examples of this). 

The interesting exception to this rule is the Middle East. All languages in our sample apart from Hungarian, Romanian, Japanese, and Chinese actually include articles located in Israel among those characterised by a large amount of conflict.

Also worth pointing out is the fact that we see significant differences in the geographic topics that generate the most conflict. The articles in Japanese that generate the most conflict are not only all located in Japan, but are also all about educational institutions. The Portuguese articles that generate the most conflict are similarly all located in Brazil (the world’s largest Portuguese-speaking nation), with four of the five highest conflict scores belonging to articles about football teams.

Within our sample, we actually only see the English, German, and French Wikipedias with a significant amount of diversity in the topics and patterns of conflict in geographic articles. This probably indicates the less significant role that specific editors and arguments play in these larger encyclopaedias. 

Ultimately, by visualizing the geography of conflict in Wikipedia, we're able to see both topics that appear to have cross-linguistic resonance (e.g. the Arab-Israeli conflict), and those of more narrow interest such as the Islas Malvinas/Falkland Islands article in the Spanish Wikipedia.

These maps therefore offer a window into not just the topics that different language communities are interested in, but also the topics that seem worth fighting about.

To read more about conflict and Wikipedia:

Yasseri, Taha, Spoerri, Anselm, Graham, Mark and Kertesz, Janos, (2014) The Most Controversial Topics in Wikipedia: A Multilingual and Geographical Analysis. In: Fichman P., Hara N., editors, Global Wikipedia: International and cross-cultural issues in online collaboration. Scarecrow Press. Available at SSRN.

Graham, M., M. Zook., and A. Boulton. 2012. Augmented Reality in the Urban Environment: contested content and the duplicity of code. Transactions of the Institute of British Geographers. DOI: 10.1111/j.1475-5661.2012.00539.x

Graham, Mark, The Virtual Dimension (2013). Global City Challenges: Debating a Concept, Improving the Practice, M. Acuto and W. Steele. Available at SSRN: http://ssrn.com/abstract=2212824

Yasseri, T., Sumi, R., Rung, A., Kornai, A., and Kertész, J. (2012) Dynamics of conflicts in Wikipedia. PLoS ONE 7(6): e38869.

Török, J., Iñiguez, G., Yasseri, T., San Miguel, M., Kaski, K., and Kertész, J. (2013) Opinions, Conflicts and Consensus: Modeling Social Dynamics in a Collaborative Environment. Physical Review Letters 110 (8).