January 11, 2013

Premier League teams on Twitter (or why Liverpool wins the league and the Queen might support West Ham)

Have you ever wondered where Premier League football teams draw most of their support from? Or what the geography of fandom is? We have too, and set out to better understand how Premiership teams are reflected in Twitter usage across the UK.

The Floatingsheep team, with the help of two researchers from the Oxford Internet Institute - Joshua Melville and Scott A. Hale (both of whom did most of the work) - have created a neat interactive map that lets you explore both the geography of Twitter mentions of specific teams and the patterns of five key rivalries. Click on the screenshot below to be brought to the full interactive map.


The data used include all geotagged tweets mentioning any of the Premiership football teams and their associated hashtags (e.g., #MUFC or #YNWA) that were sent between August 18 and December 19, 2012. We have only included one tweet per user to prevent 'loud' fans from skewing the results. The users were then aggregated to postcode districts in order to see a fairly fine-grained geography of results. The number of tweeters per district is normalized by the total 'population' of Twitter users based on a 0.25% random sample of all tweets within the UK. 
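The steps described above (one tweet per user, aggregation to postcode districts, normalization by a baseline count of Twitter users) can be sketched roughly as follows. This is only an illustration and not the code behind the map: the function name, the tuple layout, and the choice to count each user once per team are all assumptions made for the example.

```python
from collections import defaultdict


def normalized_mentions(tweets, baseline_users):
    """Rough sketch of the aggregation described in the post.

    tweets: iterable of (user_id, postcode_district, team) tuples
            (hypothetical layout, assumed for this example).
    baseline_users: dict mapping postcode district -> estimated number of
            Twitter users there (e.g. derived from a random sample).
    Returns {district: {team: tweeters per baseline user}}.
    """
    seen = set()  # (user_id, team) pairs already counted
    counts = defaultdict(lambda: defaultdict(int))

    for user_id, district, team in tweets:
        # Assumption: each user counts at most once per team, so that
        # 'loud' fans cannot skew the results.
        if (user_id, team) in seen:
            continue
        seen.add((user_id, team))
        counts[district][team] += 1

    result = {}
    for district, team_counts in counts.items():
        population = baseline_users.get(district, 0)
        if population <= 0:
            # No baseline estimate for this district: skip it rather
            # than divide by zero.
            continue
        result[district] = {
            team: n / population for team, n in team_counts.items()
        }
    return result
```

Dividing by the baseline is what keeps districts with many Twitter users from dominating the map simply because more people tweet there.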

What do the data show us, you ask? In Manchester, for instance, there is the oft-repeated stereotype that Manchester City are the 'real' local team, while Manchester United attract support from further afield. Our map doesn't really support that idea though. There are only a few parts of Greater Manchester in which we see significantly more tweets mentioning Manchester City than their local rivals. We also, strangely, see more mentions of Manchester City in Scotland and Merseyside, and more mentions of Manchester United in Northern Ireland.

The Merseyside rivalry (Liverpool vs. Everton) is another interesting one to map. There we see that Liverpool have the slight edge in the postcode that is home to both teams' stadiums. However, there is no clear winner in the rest of the region, with most postcodes having a fairly close split between the two teams. Interestingly, many postcodes in Scotland seem to have more mentions of Everton, while many in Northern Ireland have more mentions of Liverpool.

We can also zoom into particular postcodes and see which teams are most mentioned there. The academics in Oxford (for some strange reason) mention Manchester City more than any other team. Central Edinburgh (when not focusing on Hearts or Hibs) has more mentions of Everton than any other club. And the Queen's home of SW1A goes for West Ham.

What about maybe the most important question of all? Who wins the league based on the total number of tweets sent from anywhere in the UK? The answer is Liverpool (a team that hasn't won the actual league since 1990). Manchester United are a somewhat distant second, joined by Everton and Tottenham in the Champions League spots. We also find out that Fulham, Swansea, and Wigan are the three teams that get relegated due to their quite abysmal scores. Apparently just not that many people want to tweet about Wigan.
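For the curious, a tweet-based 'league table' of this kind boils down to sorting teams by their mention totals and reading off the top four and bottom three. A minimal sketch, assuming a hypothetical function name and a plain dictionary of counts per team (neither is taken from the project's code):

```python
def league_table(mention_counts, top=4, bottom=3):
    """Rank teams by number of mentions, most mentioned first.

    mention_counts: dict mapping team name -> number of tweeters
                    (hypothetical input, assumed for this example).
    Ties keep the order in which teams appear in the dict.
    """
    ranked = sorted(mention_counts, key=mention_counts.get, reverse=True)
    # Guard the slice start so that bottom=0 yields an empty list.
    cut = max(0, len(ranked) - bottom)
    return {
        "table": ranked,
        "champions_league": ranked[:top],
        "relegated": ranked[cut:],
    }
```

Called with a dictionary of per-team totals, it returns the full ordering along with the 'Champions League' and 'relegation' places.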

There is no doubt that using tweets as a proxy for fandom is messy and not always reliable. In other words, we are mapping mentions and not measuring sentiment. But the data do give us a rough sense of who is interested in (or at least talking about) what, and where they are doing it from. It allows us to begin to counter myths (e.g. that Mancunians don't support Manchester United), develop new insights about places that we don't necessarily have good data about, and most importantly, have some guesses as to which team the Queen might support.

See also:
A broader take on how information augments place (a second paper on the topic can be accessed here)
Other examples of our Twitter mapping (racism, flooding, earthquakes)
The code behind this visualisation (made freely [CC-BY-NC-SA] available on GitHub)

8 comments:

  1. this needs significant improvement.

    you've assumed such tweets with MUFC or YNWA for example are sent by fans of the club. a brief look at twitter will tell you this is fundamentally untrue. users talk, often disparagingly, about other clubs at least as much as (or more than) their own.

    this is clearly evident on your map - the sheer quantity of tweets for norwich city emanating from their rivals ipswich for example leaves most of suffolk yellow.

    i would suggest filtering your results into positive and negative tweets. a much harder task. but surely expletive filled phrases like 'liverpool are a joke' or 'leeds are shit' could help you along. or even distinguishing between uses of 'we' and 'they'.

  2. Andy, as you're from Norwich I suspect you're a Manchester City fan. Do you have any relations in Stockport?

  3. Nice idea but there's no way you can use this data to show who from where supports who. For a start there's the problems Andy mentioned above. Also, in only including geotagged tweets, you're excluding a huge amount of tweeters.

    Also by only counting one tweet per user you run the risk of misallocating people into teams they don't support. I'm a Sheffield Wednesday fan but if I watch a game featuring, say, Liverpool, and I'm sad enough to tweet several times about it, do I count as being a Liverpool fan?

    Even by using just Twitter you're excluding a massive section of football fans in the UK who support teams but aren't on Twitter.

    I don't want to appear that I'm just trashing this map for the sake of it - It looks great and with a bit of fine-tuning could really work, but at the moment it stands quite far from where it claims to be.

  4. As said above, mentions of team names and hashtags are an extremely poor indicator of support / opposition to a given team.

    Instead of this fairly naive implementation you should probably look at http://www.sentiment140.com

  5. This is very cool. I'd like to see the geographic distribution of fans normalised by the team's relative popularity. i.e. how likely is it that a team will be mentioned in one place compared to another place. Might be necessary to exclude postcodes that house stadia though, as these will likely produce high peaks (thanks to match day tweets) that aren't to do with the places people are from.

  6. Everton come 1st or 2nd most tweeted about in most of the WF (Wakefield), BD (Bradford), HX (Halifax) and LS (Leeds) post codes. Is this because they are well supported in this area? Or is it because Leeds played and beat them at Elland Road during the time frame of your study, so they will have been mentioned repeatedly around that time? There are very few Everton supporters compared to the number of Liverpool and Manchester United supporters in that area; even Chelsea or Arsenal would be better supported there.

  7. Good work folks, looks great too. I have to agree with some of the posters - it's a bit of a stretch to assume support, but for interest, it's fair. I have a professional interest: my research team just completed a study of a top flight football competition across 7 countries. To really understand sentiment (and most favoured players, managers etc) we manually coded over 2000 mentions per market, in the local language. I'm afraid (and apols to Jimmah) automated platforms like sentiment140 will never give you that level of accuracy.

  8. Could you update this, please? There are a lot more Twitter users now than there were in 2012, as you can see from this:

    http://www.folos.im/category/5
