Showing posts with label louisville. Show all posts

March 11, 2015

New paper accepted in Landscape and Urban Planning!

We're very happy to announce that three-fifths of the collective have recently had a new paper accepted for publication in the journal Landscape and Urban Planning, as part of a special issue on critical visualization edited by Annette Kim, Katherine Foo, Emily Gallagher and Ian Bishop. Taylor, Ate and Matt's paper, "Social media and the city: rethinking urban socio-spatial inequality using user-generated geographic information", builds on our earlier calls to go 'beyond the geotag' in order to develop alternative conceptual and methodological approaches for the use of geotagged social media data, drawing attention to the variety and complexity of socio-spatial processes embedded in such data.

Using Louisville, Kentucky as a case study, our paper examines the socio-spatial imaginary of the '9th Street Divide', separating the city's largely poor and African-American West End from its more affluent and predominantly white areas to the east.


While a more conventional analysis of these inequalities as reflected in geotagged tweets might look a bit like the map above, we argue that such maps of isolated, atomistic dots do little to reveal the nature of inequality between places, and do a disservice to the data itself by stripping it of much of its context. So, rather than just arguing that the West End seems to have a relative lack of tweeting activity compared to other parts of the city -- and thus deducing that the digital divide is persistently reflected in this data -- we put these different areas of the city in comparison to one another in order to understand how both individuals and groups move through the city and (re)produce landscapes of segregation and inequality through their everyday practices and mobilities.


Using a novel method for analyzing this data, we attempt to demonstrate how the idea of the West End as separate and apart from the rest of the city is challenged by the realities of people's everyday movements. Rather than being isolated, West End residents are actually much more spatially mobile within the city, while East End residents tend to be much more confined to their own neighborhoods.

So while the 9th Street Divide remains a key way of understanding and highlighting the spatial dimension of urban inequality in Louisville, we tend to think that this framing actually reinforces the understanding of the West End as a kind of 'problem area'. And while only a partial contribution to this argument, we hope that understanding the West End through its relations with, and connections to, other spaces and places ameliorates the vilification and pathologizing that is so common in discussion of racial and socio-economic inequality in highly segregated cities.

Ultimately, we hope this paper can allow for an alternative conceptualization of urban inequality in Louisville and the West End, while also demonstrating the utility of a situated and contextualized, mixed methods approach to the study of geotagged social media data, emphasizing the full range of socio-spatial processes embedded in this data that can't be captured in just a single point on a map.

The full citation for our paper is below:
Shelton, Taylor, Ate Poorthuis, and Matthew Zook. (Forthcoming) Social media and the city: rethinking urban socio-spatial inequality using user-generated geographic information. Landscape and Urban Planning.

December 18, 2014

Deconstructing the (most detailed tweet) map (ever)

If you’re the kind of person who visits our blog with any regularity, you’re almost certainly also the kind of person who would have seen some version of the map below in the last couple of weeks. Created by Eric Fischer of Mapbox, this map was released along with a blogpost entitled “Making the most detailed tweet map ever”, discussing some of the data cleaning and visualization methods necessary to produce such a striking map. The map is undoubtedly interesting and has sparked a great deal of interest from all corners of the internet, but there’s just something about the framing that rubs us the wrong way. While Eric’s post emphasizes the making part of the equation, the internet hype cycle around it has caused us to read the title a bit more along the lines of:

"Making THE MOST DETAILED tweet map EVARRRR!!!!"

That is to say, for all of the admittedly really great detail about what went into making this map, the framing of this map as not only a detailed map of six billion or so geotagged tweets, but as the most detailed tweet map ever, raises more questions than it answers. For example, what constitutes ‘detail’ in tweet maps? What do competing definitions of ‘detail’ reveal about what we value in this kind of analysis? What do these particular ideas of ‘detail’ foreclose in terms of other possibilities for analysis?

These are important questions, regardless of whether they’re applied to this particular map or any other one. The issue in this case, however, seems to be that the answers to some of these questions conflict with one another, or with the ways the project is itself described. The detail that seems to be valued here is of the “every tweet ever” variety, or, put simply “more = better”, the fetish for bigger data at the expense of all else.

But more data isn’t necessarily better, and it certainly doesn’t mean that there’s more detail, especially when the only bit of detail you're concerned with in each of these six billion points is the latitude and longitude coordinates. Each of these individual tweets contains a wealth of other interesting information, from information about the user and the way they describe themselves, to the time the tweet was created, to the text of the tweet itself, which might contain hashtags that link up with bigger conversations, or @-mentions to other Twitter users that might be used to understand social networks and interactions. All of these bits of information represent a kind of detail that is not included in this, the most detailed tweet map ever.

As we’ve been arguing for the past two years or so, there are a range of social and spatial processes represented in geotagged tweets that we can’t get at if all we’re concerned with is the latitude and longitude coordinates. So to say that this represents the most detailed tweet map ever serves to reify what we see as two of the most problematic assumptions of contemporary big data/social media research: (1) that more data is equivalent to better data, and (2) that the only important aspect of the data is the geographic coordinates attached to it. There's lots of interesting stuff that can be done with this kind of data, and we can do better than simply plotting points on a map and calling it a day [1].

Even if one were inclined to accept the argument that more tweets equals more detail, how should we interpret the fact that this map only visualizes about 9% of all geotagged tweets, due to the design decisions necessary in order to make the map nice and pretty [2]? Due to the existence of exact or near-duplicate coordinates that would make points indistinguishable from one another, this, the most detailed tweet map ever, actually eliminates about 91% of the detail that it seems to value most (i.e., the presence or absence of points on the map). The Gizmodo headline about the map reads, “The Most Detailed Tweet Map Ever Includes 6,341,973,478 Tweets”... except that, you know, it doesn’t [3].

Of course, there’s also a good bit of imprecision in the locational accuracy associated with geotagged tweeting; our iPhones don’t come with military grade GPS units installed in them. So while Mapbox CEO Eric Gunderson was marveling at the detailed micro-geographies of an airport gate seen in the map, he was ignoring both the fact that all of those folks on the jet bridge could just as well have been 40 feet away, and that a number of tweets might have been eliminated from the initial dataset due to a lack of precision in the geotagging process. Take all of that together and a lot of the detail that’s being celebrated here starts to give way to fuzziness. This map is more art than science, though the striking visuals and discursive framing give the illusion of precision and absolute insight.

To be clear, there’s no problem with fuzziness. It’s something we all live with every day, and it’s something we academics may embrace from time to time through the use of overly obtuse language. But taking all of this fuzziness and then repackaging it as the most detailed tweet map ever comes off a bit wrong to us. These initial misgivings were only amplified when brought down to a more local level, when we saw a post from a local urbanist blog in Louisville wondering “What we can learn from where people in Louisville are using Twitter”. While relatively mundane, and certainly not nearly as celebratory, the blog’s ultimate conclusion was that “These locations [with the highest concentrations of tweets] make sense as they are places where people gather and are often held captive by events.”


This, in general, is true, but also a bit… how do we put it? Meh. More fundamentally, people tweet where people are. It comes as no surprise to anyone with even the vaguest familiarity with Louisville that people tweet in larger numbers from downtown (including 4th Street Live!), the University of Louisville campus, Bardstown Road and the St. Matthews / Oxmoor Mall area than anywhere else in the city. These are (some of) the primary gathering points on a day-to-day basis within the city.

But just identifying these locations doesn’t really help us to ‘learn’ anything beyond the fact that those are, indeed, the places with the highest concentrations of geotagged tweets in Louisville [4]. In fact, the map doesn’t even really show us actual concentrations of tweeting activity, but rather concentrations of unique tweeting locations. Take, say, two hypothetical city squares, one of just 50 x 50 meters, and another much larger one of 500 x 500 meters, both the originating point of one million geotagged tweets spread randomly over the squares. In Fischer’s method, these two squares would not 'glow' in equal amounts, but rather the larger square would show up as much more visually prominent because it has many more unique tweeting locations while many of the tweets from the smaller square would be filtered out due to a duplication of coordinates.
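A quick simulation makes the two-squares thought experiment concrete. This is only a sketch: it assumes coordinates are snapped to a 1-meter grid and that every repeated grid cell is dropped, which is a stand-in for, not a reproduction of, the duplicate filtering described in Fischer's post. The function name `unique_locations` and the scaled-down count of 100,000 tweets per square are likewise our own choices for illustration.

```python
import random


def unique_locations(side_m, n_tweets, rng):
    """Count distinct 1 m grid cells hit by n_tweets random points in a square.

    Assumption: snapping to a 1 m grid stands in for duplicate-coordinate
    filtering; the real map's filtering rules are not reproduced here.
    """
    cells = set()
    for _ in range(n_tweets):
        x = rng.randrange(side_m)  # whole metres east of the square's corner
        y = rng.randrange(side_m)  # whole metres north of the square's corner
        cells.add((x, y))
    return len(cells)


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 100_000              # scaled down from the one million in the text
small = unique_locations(50, n, rng)
large = unique_locations(500, n, rng)
print(f"50 m square:  {small} unique locations out of {n} tweets")
print(f"500 m square: {large} unique locations out of {n} tweets")
```

Under these assumptions the small square can never contribute more than 2,500 distinct locations, however many tweets it produces, while the large square keeps many times that number, so it would look far brighter despite identical tweeting activity.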

Further, from a data collection standpoint, all of these tweets in Louisville reveal little that isn't revealed by mapping a random sample of tweets (say 1% of tweets from 2013, see map below). If all we’re really concerned about is the question of where people are tweeting from, there isn’t much that looking at all the tweets reveals that couldn't also be found from a smaller subset, and it’s much easier to collect or analyze a few hundred thousand tweets than it is to collect 6,341,973,478 of them. But even still, all we can ‘learn’ from these kinds of maps is where people have created geotagged tweets and, to some extent, where they have not [5].
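To illustrate why a sample reveals much the same geography, the sketch below draws a 1% random sample from a synthetic set of tweets and compares each place's share of tweets in the full set and in the sample. The place names come from the discussion above, but the weights and counts are invented for illustration and are not real tweet data.

```python
import random
from collections import Counter

# Hypothetical share of tweets per place (made-up weights, not real counts).
PLACES = ["downtown", "UofL campus", "Bardstown Road", "St. Matthews"]
WEIGHTS = [0.50, 0.30, 0.15, 0.05]

rng = random.Random(2013)
full = rng.choices(PLACES, weights=WEIGHTS, k=200_000)  # synthetic "all tweets"
sample = rng.sample(full, k=len(full) // 100)           # 1% random sample

full_counts = Counter(full)
sample_counts = Counter(sample)

for place in PLACES:
    full_share = full_counts[place] / len(full)
    sample_share = sample_counts[place] / len(sample)
    print(f"{place:15s} full {full_share:6.1%}   sample {sample_share:6.1%}")
```

If the only question is where tweets concentrate, the shares in the sample should track the shares in the full set closely, which is the point made above about smaller subsets.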


But if that’s all we can learn from this map, again, why call it the most detailed tweet map ever? Again, there are any number of details that are excluded from analysis by only looking at the locations of geotagged tweets. What if we instead took a different approach to this data, such as examining the use history of individual Twitter users, or even collectives of Twitter users based on some kind of shared experience or identity, such as association with particular neighborhoods or other places?

OK, you're right. This particular question is a bit self-serving, as this is precisely the kind of thing we've been working on for some time now. And so rather than just offering a critique of someone else's work, we really want to see if we can push this kind of analysis in more productive directions. So we offer up the map below, which comes from a paper we currently have under review, that attempts to demonstrate how geotagged tweets can help us to better understand urban socio-spatial inequality beyond simply identifying the presence or absence of tweets in a given area, as is so often done.


Using Louisville and the now-common ‘9th Street Divide’ trope as a starting point, we set out to understand how people from different parts of the city used and moved around the city in different ways. So in a manner not uncommon to some other things Eric Fischer has done previously, we identified a number of Twitter users as belonging to one of two groups, those with close ties to either the West End (traditionally a poorer and predominantly African-American part of the city) or the East End (a more affluent and largely white part of the city), and collected all of the geotagged tweets from those users [6]. We then compared the spatial footprint of these groups' tweeting activity via an odds-ratio measure. On the map, areas in purple represent places with greater-than-usual levels of West End user tweeting activity, while orange hexagons represent places where East End users were relatively more dominant than expected. Those places which demonstrate roughly equivalent or expected levels of tweeting are signified by those hexagons with hashes.
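For readers curious about the arithmetic, here is one way an odds ratio per hexagon could be computed. This is a minimal sketch, not the exact formula from our paper: the function name `odds_ratio`, the 0.5 continuity correction for empty cells, and the counts in the usage example are all assumptions made for illustration.

```python
def odds_ratio(west_in_cell, west_total, east_in_cell, east_total, correction=0.5):
    """Odds of a West End user's tweet falling in a cell, relative to the
    odds of an East End user's tweet falling in the same cell.

    Values above 1 mean West End users are over-represented in the cell,
    values below 1 mean East End users are, and values near 1 mean a mix.
    The correction (an assumption here) keeps empty cells from dividing by zero.
    """
    west_odds = (west_in_cell + correction) / (west_total - west_in_cell + correction)
    east_odds = (east_in_cell + correction) / (east_total - east_in_cell + correction)
    return west_odds / east_odds


# Hypothetical counts per hexagon: (West End tweets, East End tweets).
cells = {"hex_a": (30, 10), "hex_b": (20, 20), "hex_c": (0, 25)}
west_total = sum(w for w, _ in cells.values())
east_total = sum(e for _, e in cells.values())

for name, (w, e) in cells.items():
    print(name, round(odds_ratio(w, west_total, e, east_total), 2))
```

In a map like ours, cells with a ratio well above 1 would be shaded for West End users, cells well below 1 for East End users, and cells near 1 would get the hashed fill; where exactly to draw those cut-offs is a design decision this sketch leaves open.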

This map, in short, represents those places in the city of Louisville which are more socially homogeneous, dominated either by West End or East End residents, and those which are more heterogeneous, characterized by a relative mix of people from different parts of the city. Though it’s evident that there is indeed a kind of divide between the West End and the rest of the city, this map also shows that West End residents are incredibly spatially mobile within the city, while East End residents tend to be much more spatially constrained, sticking to their own parts of town.

While there are certainly a lot of underlying factors driving this process, suffice it to say that this map provides an alternative way of understanding socio-spatial inequality than simply identifying those places that do or do not have significant concentrations of geotagged tweets [7]. Through our analysis, we also learned that contrary to the kind of assumptions often made about this kind of informational inequality, West End users actually produce a significantly greater number of geotagged tweets than their East End counterparts; it's just that many of these tweets are created in other parts of the city. This is, of course, an important kind of detail that we can draw from the mapping and analysis of geotagged tweets and one that, in many ways, is more detailed than the most detailed tweet map ever.

There is, of course, a whole lot more detail in the paper that this one map and blog post can’t capture, just as is the case with Eric Fischer’s map. Just to be clear, we think Eric Fischer does some fantastic and beautiful work with geotagged social media data, and commend him for openly discussing and sharing his methods. And yet, we can’t help but feel like the characterization of his map as being the most detailed tweet map ever is at best a half-truth, and helps to reproduce some of the most common problems with the analysis of geotagged social media data. But the more we think about it, we’re not so sure that a single most detailed tweet map could exist, or that it’s even desirable to have such a thing. Instead, we should be striving to create any number of highly-detailed, geographically-situated tweet maps, that collectively contribute to better understandings of the complex social and spatial processes that are represented and reproduced through this kind of data. 

----------------
[1] That’s the royal we. 
[2] Which it most certainly is.
[3] As Fischer notes, there are actually no more than about 590 million dots on the map due to his filtering process. When one zooms all the way out on the map so that the entire globe is represented in a single map tile, there are only 1,586 visible tweets, a far cry from the 6 billion number that seems so, well… big.
[4] #tautology
[5] This is qualified in this way because, as Kenneth Field pointed out in a Twitter exchange with Eric Fischer about these maps, geotagged tweets that he has consciously created from his house do not appear on the map. So while we know that all of the tweets on the map were created in that place, we can't say definitively that tweets were not also created in places where they do not appear on the map.
[6] In order to do this classification, we collected all geotagged tweets created within the defined boundaries of these two areas, and then identified those users with more than 40 tweets within either area, where those 40+ tweets represented greater than 50% of their overall geotagged tweeting activity. This concentration of activity indicates that users had a strong association with, and presence within, either area, while also making sure that no users were identified as belonging to both areas.
[7] We also see this map as complicating the conventional narrative in Louisville of 9th Street as representing a kind of impenetrable barrier within the city. But since this is less directly relevant to our argument here, we'll make you wait to hear more about that particular line of reasoning.
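The classification rule in note [6] can be written out as a short function. This is a sketch of the rule as described in that note, with hypothetical names (`classify_user`, the area labels) and made-up counts; it is not the code used for the paper.

```python
MIN_TWEETS = 40   # a user needs more than 40 geotagged tweets in an area
MIN_SHARE = 0.5   # and those must be more than 50% of all their geotagged tweets


def classify_user(tweets_by_area, total_tweets):
    """Return the area a user is tied to, or None if they meet neither test.

    tweets_by_area maps an area label to the user's geotagged tweet count there;
    total_tweets is the user's overall number of geotagged tweets.
    """
    for area, count in tweets_by_area.items():
        if count > MIN_TWEETS and count / total_tweets > MIN_SHARE:
            return area  # more than half can only hold for one area
    return None


# Made-up users for illustration.
print(classify_user({"West End": 120, "East End": 15}, total_tweets=200))
print(classify_user({"West End": 45, "East End": 41}, total_tweets=300))
print(classify_user({"West End": 30, "East End": 5}, total_tweets=40))
```

Because the share test requires more than half of a user's tweets, no user can qualify for both areas, which is the property note [6] relies on.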

August 17, 2014

Mapping the #LouisvillePurge

The only way to introduce this post is to say that yes, a bunch of really naive and/or, in the case of the local television news media, willfully idiotic, people thought that there was going to be a 'purge' -- a 12 hour period where all crime is made legal -- in Louisville, Kentucky on the night of Friday, August 15th, 2014. Starting with a single tweet from a local high school student, things quickly grew out of control, with #LouisvillePurge becoming a trending topic nationally by the time things were all said and done. While the best tweets referencing the purge made light of the phenomenon, there were many, many more expressing confusion, fear, bewilderment and a desire to save the poor souls who might have been convinced to participate in such an event. But for all the attention given to the role of social media in spreading the hysteria [1], there's been no attempt to look at where some of these tweets were coming from, and how the news spread over space and time.

While the tweet that kicked the whole ordeal off was created at 8:32pm on Sunday, August 10th, the first geotagged tweet with the #LouisvillePurge hashtag didn't show up for another couple of days, at 11:33pm on Wednesday, August 13th. Beginning with that tweet, we collected all geotagged tweets with the hashtag through noon on Saturday, August 16th, at which point things were dying down.

The map below shows the overall distribution of these 4,351 geotagged tweets, aggregated to hexagonal cells across the continental United States. While Louisville and the surrounding areas clearly have the highest concentrations, the discussion of the Louisville Purge was truly trans-local, with less than 25% of the total number of geotagged tweets coming from the Louisville Metro area. Of areas further away from Louisville in absolute distance, Houston, Dallas and Los Angeles represent some of the highest concentrations of tweeting about the (non-)event.

All #LouisvillePurge Tweets thru August 16th at 12pm EDT
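Aggregating points to hexagonal cells can be sketched in a few lines. The version below assumes the tweet coordinates have already been projected to planar x/y units and uses the standard axial-coordinate rounding for pointy-top hexagons; the cell size, function names, and sample points are illustrative choices, not the settings used for our maps.

```python
import math
from collections import Counter


def hex_cell(x, y, size):
    """Return the axial (q, r) index of the pointy-top hexagon containing (x, y).

    size is the distance from a hexagon's centre to any of its corners, in the
    same planar units as x and y.
    """
    # Fractional axial coordinates of the point.
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    s = -q - r  # third cube coordinate; q + r + s is always 0

    # Round each coordinate, then repair the one with the largest rounding error.
    rq, rr, rs = round(q), round(r), round(s)
    dq, dr, ds = abs(rq - q), abs(rr - r), abs(rs - s)
    if dq > dr and dq > ds:
        rq = -rr - rs
    elif dr > ds:
        rr = -rq - rs
    return rq, rr


def count_by_hex(points, size):
    """Count how many (x, y) points fall in each hexagonal cell."""
    return Counter(hex_cell(x, y, size) for x, y in points)


# Made-up projected coordinates, in kilometres.
points = [(0.1, 0.2), (-0.3, 0.1), (17.0, 0.4), (17.5, -0.2), (17.3, 0.1)]
for cell, n in sorted(count_by_hex(points, size=10).items()):
    print(cell, n)
```

The per-cell counts are what a choropleth of hexagons would then shade; the choice of cell size changes how local or regional the resulting pattern looks.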

But perhaps more interesting than just the overall spatial distribution is how this distribution evolved over time, from the first geotagged tweet all the way through the cycle of hype and hysteria that led the Louisville Purge to be featured on any number of national news websites. In the series of maps below, we have divided all of the tweets in our dataset into a series of (more-or-less arbitrary) time frames that give a good idea of when and where the news spread to other parts of the country [2].

The lead up to the purge demonstrates a relatively localized phenomenon within Louisville, though it's interesting that there is some extra-local tweeting from the very beginning, with a very small number of tweets coming from outside the state in West Virginia, Kansas, Texas and Florida. There were only a total of 182 geotagged tweets referencing #LouisvillePurge in this 44-hour aggregate time span, with tweets originating in Metro Louisville representing 55%, 66% and 60% of the total number of tweets with the hashtag during the three periods, respectively. In other words, talk of the purge spread quite slowly over the course of the week.

Time #1: 42 tweets
From August 13th at 11:30pm to August 15th at 6am

Time #2: 36 tweets
From August 15th at 6am to 4pm

Time #3: 104 tweets
From August 15th at 4pm to 8pm 

The number of tweets with the hashtag exploded right around 8pm on Friday night, the 'official' start time of the purge. This four hour time period represents the peak of tweeting activity around #LouisvillePurge, attributed largely to the fact that this is when the event started to diffuse outward beyond the city's boundaries to places both near and far. One can see a significant increase in the number of tweets both across Kentucky and in far-off cities like Los Angeles, Milwaukee, D.C., Philadelphia and New York City. From 8pm to 12am, the 757 tweets from Metro Louisville represent only 30% of the 2,533 tweets across the country, further highlighting the spatial diffusion of information about, and interest in, the purge. In fact, this measure of locally-concentrated tweeting drops even lower to less than 10% from the hours of midnight to 6am (when most Louisvillians would be asleep), though it again rebounds a bit higher to 23% during our final time span of 6am to noon on Saturday the 16th, after the purge has 'officially' ended.

Time #4: 2,533 tweets
August 15th at 8pm to August 16th at 12am

Time #5: 1,420 tweets
From August 16th at 12am to 6am

Time #6: 216 tweets
From August 16th at 6am to 12pm
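The time windows used for these maps can be expressed as a simple binning step. The sketch below assigns a timestamp to one of the six windows and then recomputes the totals quoted in this post; the function name `window_index` is hypothetical, timestamps are treated as naive local (EDT) times, and only the counts stated in the text are used.

```python
from bisect import bisect_right
from datetime import datetime

# Window boundaries from the post, as naive local (EDT) times.
BOUNDARIES = [
    datetime(2014, 8, 13, 23, 30),  # start of Time #1
    datetime(2014, 8, 15, 6, 0),    # start of Time #2
    datetime(2014, 8, 15, 16, 0),   # start of Time #3
    datetime(2014, 8, 15, 20, 0),   # start of Time #4
    datetime(2014, 8, 16, 0, 0),    # start of Time #5
    datetime(2014, 8, 16, 6, 0),    # start of Time #6
    datetime(2014, 8, 16, 12, 0),   # end of Time #6
]

# Geotagged tweet counts per window, as reported in the post.
WINDOW_COUNTS = [42, 36, 104, 2533, 1420, 216]


def window_index(timestamp):
    """Return the 0-based window a timestamp falls in, or None if outside all."""
    i = bisect_right(BOUNDARIES, timestamp) - 1
    return i if 0 <= i < len(WINDOW_COUNTS) else None


total = sum(WINDOW_COUNTS)
lead_up = sum(WINDOW_COUNTS[:3])
peak_local_share = 757 / WINDOW_COUNTS[3]  # Metro Louisville tweets, 8pm to 12am

print(total, lead_up, f"{peak_local_share:.1%}")
print(window_index(datetime(2014, 8, 15, 20, 30)))  # a tweet from Friday 8:30pm
```

Each window includes its start time and excludes its end time, so a tweet stamped exactly 8:00pm on Friday lands in Time #4 rather than Time #3.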

Like our earlier research on #LexingtonPoliceScanner in the wake of the 2012 Kentucky Wildcats basketball championship, we can clearly see an ebb and flow in the way the event originates in a fairly localized area before gaining a larger following and eventually slowing down and becoming more localized again as many users reflect upon the aftermath. But unlike the attention paid to the #LexingtonPoliceScanner in large cities around the country, and especially the South, the interest in the #LouisvillePurge tended to be somewhat more diffuse, without any single location outside of the city or state paying a disproportionate amount of attention to the events.

In the end, we're happy to report that all of the Floatingsheep emerged from the purge unscathed and thoroughly amused, and we hope the same can be said for all of you and your loved ones. And do remember, don't trust everything you read on Twitter [3, 4]!

-------
[1] Again, it's probably worth noting -- somewhat ironically, I suppose -- that despite the rumor originating and being passed around via social media, it was the traditional local television news networks whose willingness to believe and highlight the rumor drove further attention to the situation, which was almost obviously a farce from the very beginning.
[2] You can also access an animated GIF version of this time series map here.
[3] Especially if you are supposed to be a "real journalist"!
[4] For that matter, don't trust everything you see on the television news, either!

December 27, 2012

The Bluegrass Basketball Battle

In Kentucky, basketball means everything -- especially college basketball, and especially the intrastate rivalry between the Kentucky Wildcats and the Louisville Cardinals, one of the greatest in all of college sports. Growing up in Louisville, one can't help but choose sides and develop one's debating skills, arguing with classmates, family and friends over whether Patrick Sparks traveled in 2004 or whether Rick Pitino is the modern-day basketball equivalent of Benedict Arnold. But given our connections to the University of Kentucky (and Taylor's fandom), the upcoming game and the tools at our disposal, we thought it might be time to wade in on the age-old debate between the two sides.

A recent public opinion poll of Kentucky by Public Policy Polling piqued our interest, as it found that Kentucky fans outnumber Louisville fans in the state by an overwhelming 66% to 17% margin. But how do the two fanbases stack up on Twitter?

We took to DOLLY to collect references to the two general-purpose hashtags used by fans of each team and promoted by the respective athletics departments -- #BBN (for Big Blue Nation) and #L1C4 (for Louisville First, Cards Forever) -- in geotagged tweets created between June 21, 2012 and December 20, 2012, in order to measure both the absolute numbers and geographic distribution of UK and UL fans at the national, statewide and local scales as reflected by Twitter.

Number of Tweets referencing #BBN or #L1C4
According to the aforementioned poll's 66-to-17 margin, there are ~3.9x more UK fans than UL fans in Kentucky. This finding is mirrored almost exactly by our measures of tweeting, where the 6,371 geotagged references to #BBN in the state are also 3.9x greater than the 1,628 references to #L1C4. And while the numbers of tweets for each team are essentially equal within the city of Louisville, UK fandom becomes even more dominant once one moves outside of the Commonwealth, with there being over 10.5x more #BBN tweets than #L1C4 tweets in the US outside of Kentucky, for a total of 4.9x more UK tweets than UL tweets nationwide. So not only does UK hold an ever-so-slight advantage within Louisville's home base, it shows increasing popularity as one moves to the larger scales of the state and nation.
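The in-state ratios quoted above are easy to check. The snippet below uses only the figures stated in this post; the variable names are ours.

```python
# Poll shares (percent of Kentuckians) from the Public Policy Polling survey.
poll_uk, poll_ul = 66, 17

# Geotagged tweets within Kentucky referencing each hashtag.
bbn_ky, l1c4_ky = 6371, 1628

poll_ratio = poll_uk / poll_ul
tweet_ratio = bbn_ky / l1c4_ky

print(f"Poll:   {poll_ratio:.1f}x more UK fans than UL fans")
print(f"Tweets: {tweet_ratio:.1f}x more #BBN than #L1C4 references")
```

Both ratios round to 3.9, which is the match between polling and tweeting described above.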

#BBN vs. #L1C4 Nationwide
But when we visualize these tweets, we get a better idea for just how geographically concentrated these patterns of fandom are. For instance, 599 of the 3,141 US counties had references to either #BBN or #L1C4. But of these, only 35 counties had a greater number of references to #L1C4, with Butler County, KY holding the dubious honor of being the only county in the Commonwealth with more references to #L1C4. Of the remaining counties, 554 had more references to #BBN, and only 10 counties in the country had an equal number of tweets referencing #BBN and #L1C4.
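As a quick consistency check on the county figures, the three groups should add up to the 599 counties with any reference. The snippet below uses only the numbers given in this post and also expresses them as shares; the variable names are ours.

```python
US_COUNTIES = 3141
more_l1c4, more_bbn, tied = 35, 554, 10

with_any_reference = more_l1c4 + more_bbn + tied
print(with_any_reference)
print(f"{with_any_reference / US_COUNTIES:.1%} of US counties had a reference")
print(f"{more_bbn / with_any_reference:.1%} of those leaned #BBN")
```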

#BBN and #L1C4 in the Commonwealth of Kentucky
Also interesting is that no county in the US apart from Jefferson County, KY (Louisville and Jefferson County have a merged government, and so are coterminous) has more than 100 tweets with references to #L1C4, highlighting the essentially limited spatial distribution of UL fans. And though Jefferson County does have a few more UK tweets than UL tweets, one doesn't have to go far to find the county with the largest margin of UL-related tweets over UK-related tweets; right across the river from Louisville in Clark County, Indiana there are 20 more #L1C4 than #BBN tweets.

Meanwhile, Kentucky holds a decisive advantage in its hometown of Lexington-Fayette County, with 1,588 more #BBN tweets than #L1C4 tweets. But the county with the second-highest margin favoring UK is all the way south in Broward County, Florida (Ft. Lauderdale) with a +299 margin favoring UK.

#BBN vs. #L1C4 in Louisville
Within Louisville, the absolute numbers of tweets are almost equal, as mentioned previously; but, interestingly enough, the geographies of UK and UL tweeting are quite different. The clustering of #L1C4 tweets tends to be around the UL campus and downtown areas, while UK tweeting tends to be more spatially distributed, with many tweets coming from more suburban, residential areas in the city. So while the vast majority of UL tweets across the country are located in Louisville, a still significant number come from within just a handful of square miles surrounding the UL campus in downtown Louisville, perhaps indicating the limited appeal of a team that's lost four-straight games to the defending-champion Wildcats.

#L1C4? More like #L1C4.9xLessPopularThanUK.

UPDATE: See today's article over at ESPN.com, "The Commonwealth's great divide", which discusses some of the same geographic dimensions of UK and UL fandom we are showing here. It includes this interesting passage:
In 2005, the Courier-Journal polled fans on their sports loyalties and 53.7 percent within the city counted themselves as UL fans compared to just 33 percent who identified themselves as Cats fans. And according to the two schools' alumni associations, Louisville understandably has a far greater base in Jefferson County (54,872 living alumni) than Lexington (16,112). 

But here's the catch: There are just 22,160 living Louisville alumni in the rest of the state and other than Fayette County (where Lexington sits), none of Kentucky's 120 counties boasts more UK grads than Jefferson.
While we weren't aware of these figures at the time of our initial post, they not only tend to confirm some of our findings, but indeed only lend even more credence to our assertion that UK fans seem to be more voracious tweeters than their UL counterparts, as the roughly 50-50 split in tweeting in Louisville is significantly askew from the 54-33 numbers from the Courier-Journal's 2005 survey.