June 24, 2014

To Bieb or Not to Bieb? The Geographies of Bieber and Miley Fandom

In our continuing effort to use the massive amount of social data available to us in order to uncover unforeseen, unusual and sometimes uninteresting facts about the world around us, we turn today to a question that has long troubled our world (or at least the part of it consisting of fourteen-year-old girls): Bieber or Miley?

While the once (sort of?) innocent teen pop stars have long since grown up, getting any number of ridiculous and ill-advised tattoos, twerking across your television screen and maybe even romancing one another, Justin Bieber and Miley Cyrus remain inextricably tied in the imaginations of those of us who mostly don't really know what's going on with the kids these days [1]. But by firing up DOLLY and looking at the global distribution of tweets referencing one or the other of these music icons, we can see that the two couldn't be more different in their geographic reach.

Our comparison is based on a 10% random sample of all global geotagged tweets between July 2012 and March 2014, which yielded a total of 165,406 tweets referencing "Bieber" and 99,146 tweets mentioning "Miley".

The first thing that's evident from this map is that Justin Bieber is truly "All Around the World", garnering more references to his name than Miley Cyrus does to hers in most of the world's countries. And while Bieber's dominance starts in his native Canada and extends south throughout the Americas from there, Miley Cyrus comes in like a "Wrecking Ball" to have a real "Party in the USA", where she has a nearly 10,000-tweet advantage over the Bieber. Unfortunately for Miley, however, the US is really the only place where she is more popular than Bieber. Indeed, she only has any advantage whatsoever in 45 countries around the world, with most of these clustered in Africa and the Caribbean. Then again, maybe she's just getting "The Best of Both Worlds"?

And while Bieber's advantage extends through Europe and much of Asia, his dominance is actually most deeply rooted in Latin America. The country with the biggest difference favoring Bieber tweets is Brazil, with over 22,000 more Bieber tweets than Miley tweets, even in our limited dataset. This is likely due to Bieber's well-documented risqué escapades in the country. In addition to his absolute dominance in Brazil, Bieber has an advantage of over 1,000 tweets in 18 other countries around the world, from Indonesia, Mexico, Turkey and Argentina at the top of the list, to Sweden, Denmark and Paraguay at the bottom.
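For the curious, the per-country comparison behind the map boils down to subtracting one count from the other. Here is a minimal sketch in Python, assuming a simple table of per-country counts. The function name `leader_and_margin` and the placeholder countries are ours, invented for illustration; only the two global totals come from the sample described above.

```python
# Global totals from the 10% sample described in the post.
BIEBER_TOTAL = 165_406
MILEY_TOTAL = 99_146


def leader_and_margin(bieber, miley):
    """Return (leader, margin in tweets) for one country's counts."""
    if bieber == 0 and miley == 0:
        return ("no data", 0)
    if bieber > miley:
        return ("Bieber", bieber - miley)
    if miley > bieber:
        return ("Miley", miley - bieber)
    return ("parity", 0)


# Placeholder counts for illustration only; these are NOT the post's data.
example_counts = {
    "Country A": (1_200, 150),
    "Country B": (40, 65),
    "Country C": (0, 0),
}

for country, (bieber, miley) in example_counts.items():
    leader, margin = leader_and_margin(bieber, miley)
    print(f"{country}: {leader} by {margin}")

# Bieber's share of all sampled mentions of either star.
bieber_share = BIEBER_TOTAL / (BIEBER_TOTAL + MILEY_TOTAL)
print(f"Bieber's share of sampled mentions: {bieber_share:.1%}")
```

Working with raw differences, as the map does, favors countries that tweet a lot; a ratio would tell a slightly different story.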

Forty countries have no geotagged tweets referencing Bieber or Miley, though many of these are small island nations with very little tweeting activity to begin with. We suspect that there is probably a development grant that these places could apply for to help make them Beliebers.

The most interesting thing is that no country with any significant amount of tweeting about these pop stars displays parity between the two. This leads us to posit that there has been a significant Balkanization of the Biebersphere [2], with no reconciliation between the two opposing poles of over-sexualized, tabloid headline-gracing teen pop stars who are now more known for their distasteful appropriations of other cultural traditions than for actually making music anyone wants to hear. Then again, if you want to get dialectical about it, there's really nothing oppositional about them. Hell, they even twerk together! And by making this map, we've now probably set society back at least a good couple weeks in our arduous process of learning to ignore them. Our apologies. Sometimes, "We Can't Stop" ourselves.

OK, seriously, we're done now [3].

[1] Seriously, turn that music down! And get off of our (virtual) lawn!
[2] If you're wondering why we suddenly decided to invent the term 'Biebersphere' to refer to Twitter, look no further than the fact that Justin Bieber remains arguably the largest single topic of conversation on Twitter. It's frankly sort of amazing how many people tweet about him on a regular basis. And yes, this does utterly depress us about the state of humanity.
[3] Although, "Never Say Never".

June 10, 2014

Crowdsourcing Cake or Death?

Following up on our recent trend of finding inspiration for our maps in various oppositions that we've encountered in our day-to-day lives, we turn today to the seemingly obvious question posed by Eddie Izzard: cake or death? 

While this should be a no-brainer for us, we thought we'd crowdsource the answer to this question, turning to the collective wisdom of the geographically-referenced tweet machine. We draw on a dataset of all geotagged tweets mentioning "cake" or "death" between July 2012 and March 2014 [1]. Given that cake is so much more pleasurable than death, we expected Twitter references to show a similar preference. But the results might surprise you. 

Humans, apparently, share a similar fondness for talking about cake and death. Extrapolating from our 10% random sample of global tweets, there are approximately 1,302,310 mentions of cake during this time, as opposed to 1,314,880 mentions of death.
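The extrapolation is nothing fancier than dividing by the sampling rate. A quick sketch, assuming the figures above are sample counts scaled up by a factor of ten; the helper names are ours, and the sample counts in the comments are back-calculated from the published totals rather than taken from the raw data.

```python
SAMPLE_RATE = 0.10  # the 10% random sample mentioned in the post


def extrapolate(sample_count, rate=SAMPLE_RATE):
    """Scale a count from the sample up to an estimate for all tweets."""
    return round(sample_count / rate)


def share(a, b):
    """Fraction of the combined total that belongs to `a`."""
    return a / (a + b)


# Back-calculated sample counts (assumption): 130,231 cake, 131,488 death.
cake_est = extrapolate(130_231)
death_est = extrapolate(131_488)

cake_share = share(cake_est, death_est)
print(f"cake:  {cake_est:,} ({cake_share:.1%})")
print(f"death: {death_est:,} ({1 - cake_share:.1%})")
print(f"death leads by {death_est - cake_est:,} estimated mentions")
```

Scaling both counts by the same factor leaves their ratio untouched, so the near-tie holds whether you look at the sample or the estimate.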

Global Geotagged Twitter References to Cake or Death, July 2012-March 2014

The death-loving nations of the United States, Nigeria, Canada, South Africa, and India clearly stand out on the map. Cake, on the other hand, is a much more frequent topic of conversation in the UK and a handful of Southeast Asian countries including Indonesia, Malaysia, the Philippines, and Thailand.

Among countries with a significant number of references to both cake and death, the Mediterranean countries of Lebanon and Greece, along with the Caribbean nations of Trinidad and Tobago and Barbados, are the only ones that could be said to have found a nice balance between cake and death.

The real question here is, why do some countries prefer death over cake? It is understandable that Canadians are locked in a deep cake-less existential crisis (we would be too if we lived there), while South Africa has one of the world's highest murder rates. But why is the US so infatuated with death?

Geotagged Twitter References to Cake or Death in the USA, July 2012-March 2014

If we zoom into the world's most death-loving country, death is, well, pretty much everywhere around you. Death to everyone, indeed. In absolute terms, there are a total of 162,205 mentions of death in the US, as opposed to 845,923 mentions of cake, but the geographic distribution of these references is all the more stark and, dare we say it, troubling. If you happen to live in or, god forbid, be passing through the post-industrial towns of Michigan, Ohio or Pennsylvania, or the BosWash megalopolis, death is really everywhere around you. From the frozen tundra of the north to the sunny retirement hotspots of southern California, Arizona and Florida, you can't really escape it.

That is, unless you live in one of a handful of cities or towns scattered throughout the South and Great Plains. If, by choice or extreme luck, you happen to live in Atlanta or in one of several Texas cities -- from Dallas to Waco, down to Houston and all the way to Brownsville in the southern portion of the state -- you may be able to revel in the joy of boundless cake. Given the widespread dominance of death in other places, it is only natural to assume that cake will essentially become so abundant as to be given away for free at all restaurants and grocery stores. May we all be so lucky! [2]

[1] Yes, this is another missed opportunity from IronSheep 2014!
[2] This, of course, doesn't account for the fact that too much cake consumption will likely lead to obesity and then, yes, death.

June 03, 2014

Mapping the Seven Dirty Words

As many of our regular readers know, each year at the Annual Meetings of the Association of American Geographers we hold a map hacking event that we call IronSheep. Modeled after the Iron Chef television show, we provide the 'secret sauce' of a dataset to the teams of contestants who must then concoct a 'tasty map' for the crowd to consume. When putting together the dataset for this year, we consciously embedded the potential for a few different concepts to be explored, but without telling the contestants about these possibilities.

One of these unrealized possibilities, which we bring you today, was a comparison of George Carlin's infamous "seven dirty words". For those of you who are unacquainted, the genesis of Carlin's bit was that saying these words in any context could get one in trouble with the law -- especially if uttered on a television or radio broadcast. But since we're talking about the internet here, pretty much anything goes, as can be seen in the sheer numbers of times these words are referenced in geotagged tweets around the United States. And while we could technically get away with saying these words on this medium, we like to run a family-friendly website and so we'll be using euphemisms for each. Our apologies if you're offended by these words, but this is, after all, science. And for those who absolutely have to see the terms that Carlin referred to as bad, dirty, filthy, foul, vile, vulgar, coarse, in poor taste and unseemly (among many other things), we have included them in the footnotes, with a few selectively redacted letters to lessen the shock [1].

Like the rest of this year's IronSheep dataset, this data is culled from our database of all geotagged tweets from July 2012 through March 2014. In order to stay as true as possible to Carlin's seven dirty words, we didn't include references to derivative words outside the original seven [2]. Even with this restriction we ended up with a total of 43,086,300 references to the seven dirty words, which shows how Twitter users are just a *un** of foul-mouthed, **x****, ***q**, *****ff** flaming ***d*!!! The list below shows the true magnitude of foul, unholy geotagged tweets (or FUGTs) generated in the United States, with an average of:
  • 2,051,728.6 FUGTs per month
  • 67,533.4 FUGTs per day
  • 2,813.9 FUGTs per hour
  • 46.9 FUGTs per minute
  • 0.78 FUGTs per second  
One of the seven dirty words gets tweeted out nearly every second? We truly are number one [3]! But in order to get a better sense of the spatial distribution of this collection of twisted bilge masquerading as discourse and social commentary, we aggregated this complete pile of **** to the county level and normalized it by the total number of tweets in each county. And yes, there were indeed some non-profane tweets so this normalization exercise actually means something.
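For anyone who wants to check our math, the averages above and the county-level normalization can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not our actual processing code: we assume the period runs from July 1, 2012 to March 31, 2014, the function names are made up for this example, and the county counts are placeholders.

```python
from datetime import date

TOTAL_FUGTS = 43_086_300  # total references reported in the post
MONTHS = 21  # July 2012 through March 2014, inclusive
# Assumed endpoints; the date difference works out to 638 days.
DAYS = (date(2014, 3, 31) - date(2012, 7, 1)).days


def average_rates(total, months, days):
    """Average counts per month, day, hour, minute and second."""
    per_day = total / days
    return {
        "month": total / months,
        "day": per_day,
        "hour": per_day / 24,
        "minute": per_day / (24 * 60),
        "second": per_day / (24 * 60 * 60),
    }


def normalize_by_county(profane_counts, total_counts):
    """Share of each county's geotagged tweets that use one of the words.

    Counties with no tweets at all are skipped to avoid dividing by zero.
    """
    return {
        county: profane_counts.get(county, 0) / total
        for county, total in total_counts.items()
        if total > 0
    }


rates = average_rates(TOTAL_FUGTS, MONTHS, DAYS)
for unit, value in rates.items():
    print(f"{value:,.2f} FUGTs per {unit}")

# Placeholder county counts for illustration only; NOT the post's data.
profane = {"County A": 5, "County B": 12}
totals = {"County A": 100, "County B": 400, "County C": 50, "County D": 0}
print(normalize_by_county(profane, totals))
```

Dividing by each county's total tweets is what keeps the maps from simply showing where the most people (and the most tweets) are.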

Bodily Waste (solid); the Act of Evacuation; Pretense/Lies; Expressing Amazement, Incredulity or Annoyance; Something Inferior; Something Superior (the ____)
(vulgar, noun, verb, interjection, n=22,630,879)

The first word in Carlin's sequence, another word for excrement, is by far the most popular of the seven. It accounts for over half of the total number of references in our dataset, with more than 22 million tweets. This word also presents arguably the most interesting finding of our study, in that references to this word are overwhelmingly concentrated in the American South. While our previous research has shown the South to be unique in its interest in church, racial issues and referring to groups of two or more people as "y'all", it is apparently also unique in its unabashed love for excremental exclamations [4].

Bodily Waste (liquid); the Act of Evacuation; Drunk (____ed); Angry (____ed); Request to leave (____ off)
(vulgar, noun, verb, interjection, n=645,100)

...perhaps we should qualify that last statement, based on our map of the second of Carlin's dirty words, as the geography of liquid excrement seems to be somewhat reversed from our previous map. While much of the South falls back into the lower values, one can also observe a greater concentration of references in central Appalachia and throughout the Rust Belt to its north. Even much of the west coast seems averse to the word, seemingly showing that it is largely the Midwest that is awash in this term.

To Engage in Carnal Congress; To mistreat (____ over) or meddle (____ with); Expressing Disgust, Anger or Rejection (____ you or ____ off); To ruin (____ up); To be concerned, usually negated (give a ____)
(vulgar, verb, noun, interjection, n=19,125,640)

Hopefully it is little more than a coincidence that another word used to refer to carnal congress -- itself the second most popular of the seven dirty words, with nearly half of the total number of hits in our dataset -- in many ways mimics the geography of the most popular of the seven mentioned above, albeit with a less pronounced concentration in the American South. Instead, this word seems to have solid clusters in the northeast and west coast, though the counties with the highest relative values seem more scattered throughout the mountain west and Great Plains while much of the rest of the country doesn't appear to give a ____.

Lady Bits; Pejorative Characterization of Individual (generally women)
(vulgar, noun, n=263,959)

Arguably the most derogatory word of the seven given that it's commonly used as a tool of misogyny, this term has no real significant clustering anywhere in the continental US. It is interesting, however, that as much as we've found Southerners to love certain four-letter words (see Map #1), there is a distinctly below-average frequency of references to this decidedly uncouth term there.

A purveyor of oral invigoration towards a male recipient; An offensive individual
(vulgar, noun, n=6,625)

We're almost heartened by the fact that another word used to refer to a purveyor of oral invigoration towards a male recipient has by far the fewest mentions of any of the seven dirty words, perhaps due to a declining societal acceptance of homophobia, given that a homophobic insult is arguably the most common use of this particular term. References here are scattered at best, though most seem to be in the Midwest and Great Plains, with some lesser concentrations in the northeast, and with rural areas tending towards a higher relative frequency of use.

An individual engaged in carnal congress with another who has the status, function or authority associated with female parenting derived via biological reproduction, adoption or legal guardianship; a despicable person 
(vulgar, noun, adjective, n=159,786)

This twelve-letter word, which might be used literally to refer to someone who has engaged in carnal congress with another who has the status, function or authority associated with female parenting derived via biological reproduction, adoption or legal guardianship, is definitely the most universal of the seven dirty words, with near-uniform usage across the United States. While parts of the northeast and rural Great Plains have higher concentrations, this is pretty much the word you can be sure to hear no matter where you are in the good ole US of A.

Paired glands secreting matter (which is neither gaseous nor solid) for nourishment for progeniture; Given in retaliation (singular form, ___ for tat)
(vulgar, noun (plural), n=254,311)

The last of the seven dirty words, another word referring to paired glands secreting matter (which is neither gaseous nor solid) for nourishment for progeniture, has relatively few references within the South, while a handful of counties in the Great Plains states seem to have a fairly significant number of mentions relative to their overall tweeting.

In conclusion, it is evident that while Carlin saw these words as being united around their prohibition, they remain divided in both their general levels of use and acceptability, as well as in their spatial distribution. While the first and third dirty words in the sequence are much more prevalent than, say, the fifth, their spatial distributions are remarkably different, as we have shown with this series of maps. So even if we've all got stuff to be *i*s*ed off about, we all express it in our special ways. Now **c* off.
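To put numbers on those "general levels of use", here is a small sketch that tallies each word's share of the total, using the n values from the map captions above. The euphemistic labels are shorthand for this example, and the code is an illustration rather than anything from our actual pipeline.

```python
# Counts taken from the n= values in the map captions, in Carlin's order.
counts = {
    "1. bodily waste (solid)": 22_630_879,
    "2. bodily waste (liquid)": 645_100,
    "3. carnal congress": 19_125_640,
    "4. lady bits": 263_959,
    "5. purveyor of oral invigoration": 6_625,
    "6. twelve-letter word": 159_786,
    "7. paired glands": 254_311,
}

total = sum(counts.values())
shares = {word: n / total for word, n in counts.items()}

print(f"total references: {total:,}")
# List the words from most to least used.
for word, n in sorted(counts.items(), key=lambda item: item[1], reverse=True):
    print(f"{word:<34} {n:>12,}  {shares[word]:6.2%}")
```

The seven counts sum to the 43,086,300 total reported earlier, and the first and third words together account for the overwhelming majority of it.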

[1]  *h**, **s*, ****, **n*, ************, **********e*, and **t*. 
[2] Imagine adding -ing or -ty, among other things, to the end of some of these words. 
[3] Yay?
[4] Band name.