Note: as of 11:00 EST 11/10/2012, we have disabled commenting on this post.
Note: at 10:00 am EST 11/12/2012 we posted an analysis using the same methodology as this post to locate the epicenter of the earthquake in Eastern Kentucky over the weekend.
During the day after the 2012 presidential election we took note of a spike in hate speech on Twitter referring to President Obama's re-election, as chronicled by Jezebel (thanks to Chris Van Dyke for bringing this to our attention). It is a useful reminder that technology reflects the society in which it is based, both the good and the bad. Information space is not divorced from everyday life and racism extends into the geoweb and helps shape its contours; and in turn, data from the geoweb can be used to reflect the geographies of racist practice back onto the places from which they emerged.
Using DOLLY we collected all the geocoded tweets from the last week (beginning November 1) with racist terms that also reference the election in order to understand how these everyday acts of explicit racism are spatially distributed. Given the nature of these search terms, we've buried the details at the bottom of this post in a footnote.
Given our interest in the geography of information we wanted to see how this type of hate speech overlaid on physical space. To do this we aggregated the 395 hate tweets to the state level and then normalized them by comparing them to the total number of geocoded tweets coming out of that state in the same time period. We used a location quotient inspired measure (LQ) that indicates each state's share of election hate speech tweets relative to its share of all tweets. A score of 1.0 indicates that a state's share of hate speech tweets matches its share of all tweets. Scores above 1.0 indicate that hate speech is more prevalent than the state's overall tweeting would predict, suggesting that the state's "twitterspace" contains more racist post-election tweets than the norm.
So, are these tweets relatively evenly distributed? Or do some states have higher specializations in racist tweets? The answer is shown in the map below (also available here in an interactive version) in which the locations of individual tweets (indicated by red dots) are overlaid on color-coded states. Yellow shading indicates states that have a relatively lower amount of post-election hate tweets (compared to their overall tweeting patterns) and all states shaded in green have a higher amount. The darker the green color, the higher the location quotient measure for hate tweets.
Click here to access an interactive version of the map at GeoCommons
- Mississippi and Alabama have the highest LQ measures with scores of 7.4 and 8.1, respectively.
- Other southern states (Georgia, Louisiana, Tennessee) surrounding these two core states also have very high LQ scores and form a fairly distinctive cluster in the southeast.
- The prevalence of post-election racist tweets is not strictly a southern phenomenon as North Dakota (3.5), Utah (3.5) and Missouri (3) have very high LQs. Other states such as West Virginia, Oregon and Minnesota don't score as high but have a relatively higher number of hate tweets than their overall twitter usage would suggest.
- The Northeast and West coast (with the exception of Oregon) have a relatively lower number of hate tweets.
- States shaded in grey had no geocoded hate tweets within our database. Many of these states (Montana, Idaho, Wyoming and South Dakota) have relatively low levels of Twitter use as well. Rhode Island has much higher numbers of geocoded tweets but had no hate tweets that we could identify.
But lest anyone elsewhere become too complacent, the unfortunate fact is that most states are not immune from this kind of activity. Racist behavior, particularly directed at African Americans in the U.S., is all too easy to find both offline and in information space.
--------------------- State Level Data ---------------------
The table below outlines the values for the location quotients for post-election hate tweets.
|State||LQ of Racist Tweets||Notes|
|District of Columbia||1.5|
|Alaska||-||see note 1|
|Idaho||-||see note 1|
|South Dakota||-||see note 1|
|Wyoming||-||see note 1|
|Montana||-||see note 1|
|Hawaii||-||see note 1|
|Vermont||-||see note 1|
|Rhode Island||-||see note 2|
Note 1: no racist tweets, SMALL number of total geocoded tweets
Note 2: no racist tweets, LARGE number of total geocoded tweets
 Using the examples of tweets chronicled in the Jezebel blog post, we collected tweets that contained the text "monkey" or "nigger" AND also contained the text "Obama" OR "reelected" OR "won". A quick, and very unsettling, examination of the search results revealed that this indeed was a good match for our target of election-related hate speech. We ended up with a total of 395 of some of the nastiest tweets you might possibly imagine. And given that we're talking about the Internet, that is really saying something.
 To be precise, we took a 0.05% sample of all geocoded tweets in November 2012 aggregated to the state level.
 The formula for this location quotient is
LQ = (# of Hate Tweets in State / # of Hate Tweets in USA) / (# of ALL Tweets in State / # of ALL Tweets in USA)
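As a rough illustration of the formula above, here is a minimal Python sketch. The function name `location_quotients`, the state labels and all of the counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not the data behind the map. States with no hate tweets get `None`, mirroring the "-" entries in the table.

```python
def location_quotients(hate_by_state, all_by_state):
    """Return {state: LQ}, or None for states with no hate tweets (or no tweets at all)."""
    total_hate = sum(hate_by_state.values())
    total_all = sum(all_by_state.values())
    lqs = {}
    for state, n_all in all_by_state.items():
        n_hate = hate_by_state.get(state, 0)
        if n_hate == 0 or n_all == 0:
            lqs[state] = None  # no score, like the "-" rows in the table
            continue
        # (state's share of hate tweets) / (state's share of all tweets)
        lqs[state] = (n_hate / total_hate) / (n_all / total_all)
    return lqs


# Hypothetical counts for three made-up states
hate_counts = {"A": 30, "B": 10, "C": 0}
all_counts = {"A": 1000, "B": 2000, "C": 1000}

lqs = location_quotients(hate_counts, all_counts)
for state, lq in lqs.items():
    print(state, "-" if lq is None else round(lq, 2))
```

In this made-up case state A holds 75% of the hate tweets but only 25% of all tweets, so its LQ is 3.0, while state B holds 25% of the hate tweets and 50% of all tweets, giving 0.5.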
 We should also note that the precision of the individual tweet locations is variable. Often the specific location shown in a map is the centroid of an area that is several tens or hundreds of meters across, so while the tweet came from near the point location shown, it did not necessarily come from that exact spot on the map.
Note, though, the strong autocorrelation with pro-Romney voting (the top 9 racist-tweet states all were Romney states). This could indicate that people in high-LQ states are racist and that led them to vote for Romney (and, perhaps, that same element of racism makes it acceptable for the most extreme among them to do racist tweets). However, it also could be that in ALL states a certain subpopulation of Romney supporters are racist tweeters. Since these "red" states tend to have more Romney supporters a higher percentage of their tweeters will be racist. It would be interesting to see the data normalized by Romney vote.
Regardless of how you figure it, I am so disappointed that, once again, Alabama's the worst. The thing is, there are MANY of us down here that are not racists. As a matter of fact, I think the percentage of votes for Barack was higher this time than his previous run. Just not high enough.
I love my home; but I hate its reputation... and the fact that it's deserved.
I noticed a trend while studying the map. Most of the locations indicated on the map in Alabama are either from the Black Belt or from cities with a majority minority population. I too hate to see this correlation with my beloved state. I have worked in education from the days of segregation and well after the schools were integrated. I have witnessed great strides in racial equality in our state. I would hope that the world would just come and visit Alabama instead of forming opinions based on the news media and reports such as this that do not reflect the conditions here.
This study is fraudulent, inaccurate, and racist in itself. For one, it is not a random sample. You SEARCHED for tweets with those specific words rather than reflecting the entirety of tweets sent in the state. As a Tennessee resident, I saw many more than 395 tweets come in after the election. I also saw zero that said nigger or monkey in them. Quit spreading propaganda against other states, because all you're doing is adding to the hate yourself.
So conducting research instead of relying on vague anecdotal observation is racist now? Of course he's going to SEARCH for tweets with racist words in them, how else is one supposed to FIND tweets with racist words in them?
Taylor, you need to read the research methods. The number of racist tweets was normalized to the total number of tweets for that state. It's the same as asking "what proportion of kindergartners wear glasses in each state?" To determine this, you count the number of kindergarten kids with glasses, and the total number of kindergarten kids, and divide the first by the second, for each state. Thus (for ex), in New York State, 200 thousand out of a total of 1 million kindergartners wear glasses, or 20%. Alabama, 2%. And so forth. (no, those aren't real numbers. I have no clue what percentage of kindergartners wear glasses, but it is a good estimate of the number of kids that age in NYS). Point is, 1. the method is simple and obvious; my 2nd grader does math problems like this all the time ("there are 20 kids in your class. 8 of them are girls..." etc.); and 2. you can't figure out the percentage of kindergartners who wear glasses without counting the number of kindergartners who wear glasses, as well as the total number of kindergartners. And no one person is going to see all tweets from a state; if you aren't seeing those words, it probably means the people you are connected with on Twitter don't use those words, huh?
So, one last thought: Racism exists everywhere in the Human population and I believe it's based on fear and ignorance. It's not just a localized phenomenon. Humans have a tendency to forget most people have good intentions - especially when others are different races, religions, etc. One thing I have always loved about our President is that he recognizes the fact that we're more alike than we are different. If we will focus on our similarities, rather than our differences and respect each other, there'll be no stopping us.
Thanks unknown for the explanation.
Taylor, you mentioned you "saw many more than 395 tweets come in after the election". The reference to the number 395 in the article was to the total number of hate tweets found using their stated methods. Obviously there were millions of tweets from all over the country after the election - of all those millions of tweets, these researchers took 0.05% of them, searched that random sample for the aforementioned words, and found 395 hateful tweets that used those words. They then counted the number of hateful tweets that came from each state and compared it to the total number of tweets from that state in their 0.05% sample to create the location quotient to rank the results. The 0.05% is a random sample, the 395 is the number of results found within that random sample. The study is neither fraudulent nor racist, but your charge of inaccurate might hold water as this is an especially small sample size.
I am curious to know if the study allowed for the common use of the word "nigger" as it is used in casual conversation among African American tweeters or were those incidents included in the study and assumed to be racist hate speech.
@pwillis: Per above: "we collected tweets that contained the text "monkey" or "nigger" AND also contain the text "Obama" OR "reelected" OR "won""
So yes, unless you count a tweet from a black person to his community saying "woohoo, our won" or some such thing.
Twitter is a public platform. As presented, anyone can listen to what people tweet. I would think casual use of words like that are much less present in twitter than in personal correspondence, so I would imagine that the majority of these tweets are properly ascribed to racism.
Did their search include terms like "whitey"? I saw a number of tweets with phrases like, "take that, whitey!" or worse.
I'm not defending any racism, just noting that sometimes we seem to forget that you don't have to be white to be racist.
Yeah, no $#!+. In the same breath, people thinking everyone who voted Romney is racist will immediately make fun of his whiteness and Mormonness. Anybody see the Tumblr for White People Mourning Romney? What makes that more okay than a hypothetical Tumblr of "Black People Acting Like Fools Because Obama Won"?
See the FAQ for our response to some of these issues.
Phil, Interesting point....there does seem to be about a 50 percent correlation at the state level between the number of racist tweets and the number of votes for Romney. But of course the twitter data doesn't contain a measurable/testable linkage between the two.
395 total hate tweets is a *really* small sample size considering that you're doing a 50-state analysis.
Does this mean the "more racist" states had 20 racist tweets while the "less racist" states had 5?
Don't get me wrong - I find your conclusions plausible, but the small sample size makes it hard to draw conclusions at the state level.
Interesting that you don't bother to track all the hate-filled tweets in the days leading up to the election issued by people threatening to do Mitt Romney harm should he win the presidency. It cuts both ways.
Hmm, I've never seen a single mention of that, care to share a link?
Not the best websites but still have some examples.
Sorry, I don't find those two sources to be credible. Can you find a more unslanted link?
lol you find infowars incredible but you have just read an article about racist tweets that didn't give any criteria of what the racist words were? did you know that basketball and golf are racist words when applied to Obama? Obamacare opposition is racist....Hell if you voted for Romney you are a racist
theBEC88 regardless of whether the sources are credible, they reference actual tweets. Is Twitter not a credible source for tweets?
Here's an example from another source, one with clear right bias, but one which links actual tweets -
Vincent Abate, theBEC88, do you now have proof that there were threatening and racist tweets directed at the Right?
I noticed that most of these tweets come from teens that don't even look old enough to vote.
Was age mentioned in the formula criteria?
See our response to these issues here... http://www.floatingsheep.org/2012/11/faq-mapping-racist-tweets-in-response.html
Collin- age was not included; there is no good data on the age demographics of Twitter users
Could the high LQ in La. be from a hedonic calculation and due to the contiguity to one of the core states?
JK, I'm from La. and I'd like to think we're better than Ms and Al.
@x004Ronin - good point, there is a sample size issue here which we tried to convey with the individual points in the map. Plus if we had mapped this at larger spatial units, e.g., census regions, we would have ended up showing the same specialization in the southeast, but given that elections are about electoral votes, the state level was more interesting, especially since the number of tweets in the states with the highest LQs was around 15+. Also we tried to highlight the relatively small samples for Utah, Minnesota and North Dakota. Still a good point and one of which we're aware and working with.
@Fifi Belle, if you can give us some examples of racist tweets directed at Romney we'll take a look
Matthew, what is the point you are trying to prove? Obviously you didn't try to find racist tweets aimed at Romney, a simple search of terms such as "white devil," "honkey," "cracker," "black-power," shows plenty when paired with Romney, or Mitt, or the election. Now I have no idea how the statistics compare on twitter, but it looks to me like you are just trying to cause a partisan stir, without even attempting to be fair-minded about this.
@CCC: First of all, there is a complete lack of equivalence between the terms that you list being used to describe Romney and the terms used to describe Obama. Second of all, combinations of the terms you list yield, in the aggregate, a few dozen hits in the same time as our analysis above. So no, these terms don't show "plenty" of instances, nor would it be remotely the same even if they did.
Really, which term do you think is more offensive than "white devil?" Also what term combinations were you searching for specifically? Did you filter out tweets which were using certain terms in a positive light, did you filter out reposts where people were criticizing prior racist tweets?
To answer your first question with another question, has the term "white devil" been used for centuries to systematically subjugate an entire race? When it has, we can have this conversation again.
Well, here we go. You write a hit piece where you only run the numbers from one side, AND you start with the assumption that only white people can be racist, AND you don't take the effort to filter out "racist" tweets from black people about Obama which aren't actually negative towards Obama.
Perhaps you did go through your data and make sure the tweets actually are what you think they are, racist tweets against Obama. I'd love to see the 345 you included in your analysis.
Ignoring that "white devil" has been used to denigrate Caucasians for centuries, Taylor, am I to understand that racism is not a problem when it's only been around a short time? Or that a word's use to denigrate (which is all a word can do) a person is not racist when that person is not part of a race that is subjugating another race? That last removes ANY word from consideration unless you believe that the culture in the US suffers from more than just prejudices (which are not only directed from white to black) but from a societal norm where subjugation and slavery are accepted as normal by most.
A censored example is below: not only might your survey rate this as a racist tweet against Obama, but it was actually an offensive tweet against Romney - though I can't be certain because you don't include your selection criteria.
Tevita Maliu @TPuss
Obama is my n*****. Fuck that fa****t Romney a*s bi**ch. P**sy hating, no orgasm having motherfucker.
9:28 PM - 6 Nov 12 · Details
@CCC read footnote one for all the details on our methods. I dare say it is more offensive than "white devil". Plus keep in mind we are NOT looking at all tweets just geocoded tweets so your example likely did not make it in.
And as Taylor noted there are roughly 10 times as many tweets directed at Obama as Romney and using much stronger terms than you suggest.
btw, we're happy to engage but will turn off comments if it turns into a flame fest.
Seriously, we can agree to disagree on the offensiveness of certain terms, but let's stick with the numbers.
Did you filter out rebuttals of prior offensive tweets, or tweets that used terms like "Obama is my n*****?"
@CCC - I was simply responding to your previous argument about what should be considered a racially charged tweet. While reasonable people can disagree on some specifics, it is also clear that certain terms (see footnote 1) have a disproportional association with racism and the history of the U.S. Which is what this post is about.
But I'm happy to return to numbers. Using DOLLY (which tracks geocoded tweets, not all tweets), the results of the following searches are:
- white devil AND Romney = 10 hits
- honkey AND Romney = 3 hits
- cracker AND Romney = 42 hits
- black power AND Romney = 7 hits
Which is about 15 percent of the number of tweets we turned up for this post. And since we didn't filter out the Romney posts or the Obama posts (more on that in a bit) we can directly compare the size of these two results. Simply put, there are significantly more hate tweets directed at Obama than Romney.
We didn't filter the results for this post but did inspect the tweets and it was evident that most were good matches for what most people would consider racist sentiments. Also any filtering effect is likely to be independent of geography and so we would end up with a slightly smaller sample but with the same general distribution of tweets and LQs. Which would result in essentially the same map.
In short, we stand by our results.
This is very interesting and I don't envy you going over the horrible tweets. I would like to point out what may be a case of internal bias, however. While the article points out that North Dakota's shade is higher due to the number of racist tweets against the low number of tweets overall, there seems to be an internal bias, perhaps East Coast in origin?, that seems to assume between the lines that states such as Montana, South Dakota, Idaho, and Wyoming *would* be racist if they had more tweets. If there had been even one tweet that was racist they would be shaded at least yellow. It might be a good idea to accept the possibility that these states contain at least tweeters who aren't displaying racism, or at the very least do not use the terminology in question. Hawaii, which also fell into the fewer tweets no racist ones category, was somewhat pointedly left out of mention in the paragraph discussing this and appears only in the list, perhaps because it is widely acknowledged as being a very multi-ethnically complex state, and thus would not fit assumptions likely made about the other grouping. Worth thinking about.
@unknown Agreed. The lack of tweets in these states likely is tied to the relatively small number of tweets in general. That is why they are not shaded to indicate "no data" which is different than low levels of hate tweets.
Hmmm...not quite what I was saying. I was saying that there were no hate tweets, so shouldn't that be an "I'm happy to report" as in Vermont, rather than the backhanded implication, reiterated in your reply, that these states would have hate speech if they had more tweets. Hawaii is neglected in the grouping because the cultural assumption is that they would not have hate tweets and their lack thereof is a sign of lack of hate, even though they are under the same low tweet level as Wyoming. I'm stating that the presumption of hate from that area is a cultural bias in and of itself. :) But you seem to be repeatedly being put on the defensive here for rather silly reasons (saying bad things about whitey is not equal to institutionalized racist response) rather than congratulated for the hard work, which I feel is more in keeping with your study and its goals.
Unknown 9, Thanks for the congratulations and yes, the response should have been similar; it has been a rather long day.
Not to worry, I understand. I like being Unknown 9, but that's just Unknown plus the first part of the date. It was a long day, wasn't it? I think the idea and the map are really cool and just FYI, I'm Stana. :D
Nice piece of research, looks like a lot of people on Facebook are liking it: http://politics.trendolizer.com/2012/11/floatingsheep-mapping-racist-tweets-in-response-to-president-obamas-re-election.html (look at the graph)
If you look at a map of pro slavery states and free states, this also correlates. http://www.sonofthesouth.net/slavery/slave-maps/map-free-slave-states.htm The past is the past, but not when hate is taught in the home.
Does the map look very different for other presidential elections when both candidates were white? Of course not. Some areas of the country are historically republican while others are historically democrat.
Actually, importantly, the south is not historically Republican. For many years, the South voted solidly democrat. The shift from support for Democrats vs. Republicans happened around civil rights issues in the middle of the 20th century. In other words, the shift from supporting the Democratic party to the Republican party in the south happened around racial issues. I'm a southerner and while I would like to be able to argue that there is not more overt racism between African Americans and Caucasians in the south than in the rest of the country, history suggests otherwise. States which owned slaves tended to support the Democratic party in the years leading up to and following the end of slavery, because the Democratic party supported the continuation of slavery, and continued segregation after the end of slavery. Once party politics began to shift in the middle of the 20th century, and the Democratic party began to more actively support civil rights platforms, southern support for the Democratic party began to weaken, and there was even a split in the party over racial issues. The real completion of the shift towards a Republican south happened during the 60s and 70s at the height of the civil rights movement, and in response to continued battles for integration.
In other words, the south is only "historically republican" for about 50 years, and that history is deeply rooted in racism.
No "State Level Data" listed for Vermont
Oops....Vermont was missed because I'm happy to report that there were no hate tweets found in Vermont
Probably because Monica moved away.
I find that there are several scientific reasons to question your results, including bias and regionalism. And I have one subjective one: I know many fine people in Alabama of many different races and cultures, and I am one. Beauty can only be seen when we don't focus on the negative. The researcher seems to be missing a lot.
I concur..Get a Life...this stuff reminds me of the stuff people with Aspergers Syndrome dwell upon
Would anyone care to respond to this?
Hi all, read my response (see above) to earlier comments raising the same issue.
Sorry, should have read the comment section. What makes a tweet geocoded? Because there were tons of "f white people" to make it trend in places. Just wondering why few of these were geocoded.
Geocoded tweets are tweets that have some location information attached...it is about 2 to 5 percent of all tweets.
I've not seen any of these local trends so can't speak to that directly. But when I search for your suggested term in our database we only end up with 30 results compared to the 395 racist tweets directed towards Obama.
Thus, we conclude that the phenomenon we are mapping is an order of magnitude larger than what you are suggesting we search on. That's a big difference.
Thanks for the info.
The problem is, without knowing what kinds of people allow their tweets to be geocoded you have no idea how representative the tweets are of the general population. Sure, it seems logical that Alabama would be among the most racist states. But if the people tweeting with geocode aren't representative, it could be more or less racist than what you see.
Did you correct for population in your analysis? For example because America is 72% white and 12% black, and if 15% of blacks were racist against whites and 15% of whites were racist against blacks, and they all tweeted racist rants, you would see 6 times more racist tweets from whites based on population alone.
Also, because whites tend to be higher on the socioeconomic ladder, you would have to correlate for the fact that more whites would be on Twitter in the first place, thereby raising their tweet counts.
@Boofoo. There aren't good demographics on Twitter users and I'm a bit dubious of the goal here.
Great stuff - love the analytical mind behind it regardless of the results...you don't accept data that I've seen that African Americans are over-represented on twitter?
Contrary to the OP's assumptions, twitter is accessible to many people via their smartphones/cellphones. In many low-income areas, low-cost providers for cell phone service offer those lower on the socioeconomic ladder access to the internets.
Why don't you also include the racist terms "Whitey", "Cracker" and "Honky" as part of this analysis?
Read the entire comment thread. Your question is answered in detail, with data, above.
I am just curious, when you looked into the racist Romney comments, did you use the same time frame? If so, why, people who are racist towards Romney had much less cause, even if unjustified, post election as opposed to pre election.
Exactly what I was thinking, you can bet if Romney would've won the racist Romney tweets would have been through the roof!! Funny no response to this!!
Yes, same time frame, from November 1 to November 7.
My feeling as well. If the outcome of the election had been different so would the intensity of the tweets against Romney. Just like my favorite Michael Moore ad about "burning this Mfer roof down"...and she was an old white woman lol.
To the Author/s of this post:
You work so hard to collect data to only then make a correlation/causation flaw in logic? A racist tweet means the person is racist, and because more of those tweets happened in a general area, then all the people in that area are racist... Oh, there are a majority of Republican voters in or around there? Well then surely all the Republicans must be racist... This is an exercise in stupidity.
Racism exists; it's a very real tragedy. Some people are close-minded and choose not to stretch their mental capacity to accept people who are different from them. Think about it for a second and it starts to look like you are not too far off from this exercise yourself.
I wonder which way you voted? If you want to look for some reason to dismiss an entire side of the political spectrum, turn on the "Daily Show" and live in that world. Don't go looking for a repulsive characteristic and use correlative data to paint the entire right side of the political spectrum with it. Democratic voters may celebrate this with you, but know that you are choosing to backhand the other half of the people in this country with your actions. It honestly saddens me to think that people will go to this length to justify their personal world view in their own minds.
@David -read the post. We were extremely careful in our discussion to say exactly what you are suggesting.
Fascinating idea. I was disappointed, though, that you didn't correct for multiple racist tweets by the same individual. That could easily skew results for an entire state. Until you filter for that, I don't find much value here.
Good point Clare- I wondered the same thing. Having a few people who tweet racist things all day could skew the numbers considerably (especially since they only found 395 racist tweets). I'm not sure what the point of keeping multiples in except to possibly keep the numbers high or just laziness.
There were more than 395 racist tweets. They found that many geocoded tweets. Explanation of that upthread. And their racist remarks were very specific flags/word triggers. I'm sure there were more offensive racist remarks to be found, but that they were not on the filter's radar.
Unknown- yes they found that many geocoded tweets which is what this small study was about. It doesn't make sense to not adjust for multiple racist tweets by a single user. If a state has 20 racist tweets and 18 of them are from 1 user and the other 2 are from another it doesn't seem fair to compare that to another state who had 20 racist tweets by 20 users. The whole "study" is flawed and doesn't really tell us much if they are not sorting out multiples. These numbers could be completely different if they did it and I'm not sure why they wouldn't take the small effort of sorting them out.
Rob, you do raise an interesting point about what is the more useful metric to use tweets or users but it does mean that the study is flawed.
Also, given the discussion upstream, I should note that shifting to a user approach would also reduce the number of Romney tweets and so the ratio between Obama and Romney tweets would remain the same, with there being a lot more hate tweets directed at Obama.
Matt if you don't sort for multiples what does the study really tell us? A few overly ambitious tweeters could skew the results for a whole state especially for the number of tweets you used. If this is the case what does the study say? That a few people in said states are comfortable saying racist things on the internet many times?
"so the ratio between Obama and Romney tweeters would remain the same" You actually don't know this since you haven't yet separated multiples. This was not my contention anyway.
Rob, this is really a question of what the different measures -- users or tweets -- mean (which btw we do discuss in the post). A lot of tweets sent by a single individual could be interpreted as a much stronger feeling which surely could be more worrisome in some contexts. Or it could be as you posit with a few users "skewing" the results. There's value in both ideas.
But returning to the data, when I aggregated up to the user level we went from 395 tweets to 349 users (a small reduction of about 12 percent) and recalculated the LQs, not a whole lot changed. While Mississippi's LQ dropped somewhat it was still very high and Alabama's LQ for users was actually higher than for tweets. There was some movement in other state scores as well but nothing that changed the overall pattern we posted here.
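The tweets-versus-users distinction debated in this thread can be made concrete with a minimal sketch. It is not the authors' code: the record layout `(state, user_id)` and the function name `counts_by_state` are assumptions made for illustration, and the sample data simply mirrors the hypothetical "18 tweets from one user, 2 from another" case raised above.

```python
from collections import defaultdict

def counts_by_state(tweets):
    """Return ({state: tweet count}, {state: distinct user count}).

    `tweets` is an iterable of (state, user_id) pairs (hypothetical layout).
    """
    tweet_counts = defaultdict(int)
    users = defaultdict(set)
    for state, user_id in tweets:
        tweet_counts[state] += 1      # every tweet counts once
        users[state].add(user_id)     # each user counts once per state
    user_counts = {state: len(ids) for state, ids in users.items()}
    return dict(tweet_counts), user_counts

# Hypothetical data: state A has 20 tweets from 2 users,
# state B has 20 tweets from 20 different users.
sample = (
    [("A", "u1")] * 18
    + [("A", "u2")] * 2
    + [("B", f"v{i}") for i in range(20)]
)
tweet_counts, user_counts = counts_by_state(sample)
print(tweet_counts)  # {'A': 20, 'B': 20}
print(user_counts)   # {'A': 2, 'B': 20}
```

Either set of counts can then be normalized in the same way; which one is the more meaningful measure is the open question discussed above.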
Just noticed a typo in my 6:12 response above in this thread. It should be
"Rob, you do raise an interesting point about what is the more useful metric to use tweets or users but it does NOT mean that the study is flawed."
That's what I get for posting pre-coffee
I understand. I have trouble remembering who I am, pre-coffee.
"But returning to the data, when I aggregated up to the user level ... There was some movement in other state scores as well but nothing that changed the overall pattern we posted here."
Thanks, Matt. I really appreciate you taking another look at that. Do you think you might be posting that map too somewhere? I'd find it interesting. But it's good to know nothing much moved around.
Yes, as people are pointing out, a few racist tweets don't define a state or region--of course! Still, all intelligent, curious people can do is try to ask interesting questions of the world around them and do their best to use facts to frame the answers. Kudos to you for asking an interesting question and doing your best to frame an intelligent answer with the material you have.
Thanks Matt for looking into it. Glad to know the numbers didn't change that much. I was expecting a drop of 1/3 or something close. I guess I was wrong.
All of these so-called racists who tweeted their stupid ideas are not shocking at all. Ol Miss...what a joke. In time they will realize that racism is played out. The new racism, which they should be really worried about, is not of color but of money. Besides, graduating from a school such as Ol Miss won't get you anywhere in life already. It's an old non productive school in an old productive state. Racism is the only thing they have left down there. Ohh and Alabama....what a joke. So in short we are talking about two stupid colleges with no hope in the future to produce or maintain anything productive.
Let's stay focused on the study rather than decrying individuals or states.
"A score of 1.0 indicates that a state has relatively the same number of hate speech tweets as its total number of tweets. Scores above 1.0 indicate that hate speech is more prevalent than all tweets"
This should be reworded. It makes it sound like a score of 1.0 means the state had the same number of racist tweets as non racist tweets and anything above means the state had more racist tweets than non racist tweets.
Something better might be: "A score of 1.0 indicates the state's proportion of racist tweets to non racist tweets is the same as the overall national proportion. Scores above 1.0 indicate that the proportion of racist tweets to non racist tweets is higher in that particular state than nationwide."
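To make the rewording concrete, here is a minimal sketch of a location-quotient style measure as the post describes it: a state's share of hate tweets divided by its share of all tweets. The function name and the two-state counts are hypothetical, and this is one reading of the post's description rather than the authors' actual code.

```python
def location_quotient(hate_by_state, all_by_state):
    """LQ per state: (state share of hate tweets) / (state share of all tweets)."""
    hate_total = sum(hate_by_state.values())
    all_total = sum(all_by_state.values())
    lq = {}
    for state, all_count in all_by_state.items():
        hate_share = hate_by_state.get(state, 0) / hate_total
        all_share = all_count / all_total
        lq[state] = hate_share / all_share
    return lq

# Hypothetical counts: both states produce the same overall tweet volume,
# but X produces three times as many flagged tweets as Y.
hate = {"X": 30, "Y": 10}
all_tweets = {"X": 500, "Y": 500}
lq = location_quotient(hate, all_tweets)
print(lq)  # X: (30/40) / (500/1000) = 1.5, Y: (10/40) / (500/1000) = 0.5
```

A value of 1.0 therefore means the state contributes hate tweets in the same proportion as it contributes tweets overall; it does not mean that half of the state's tweets are hateful.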
Question, when looking for the N word, did you look to see the context in which the word was used or just if the word was involved?
See the reply upstream
It would be interesting to see the other side, such as tweets with racism towards whites with terms like 'cracker', 'egg', etc.
Agree. I saw many tweets that night in the 'f--- white people' category.
See my reply upstream
Another factor might be the fact that the states whose candidate lost were the angrier states. Also since Romney was white the blue states had no target for racism. I'm not saying that is all of it at all but that has to be another factor to consider.
There is definitely an undercurrent down here in Alabama unfortunately, but I wonder what the numbers would look like by Twitter account by race and then by tweet. Many of my black followers did indeed tweet out language that would have been included in the data. That's where I saw it most often, though I know that people tend to live in their little bubbles largely within Twitter, so maybe I'm missing something.
What did they say exactly? Were they happy he was elected?
I'm surprised racists are smart enough to use twitter. I guess our education in Georgia is better than I thought....
Oh, Brian, I love you.
Just a reminder, we're happy to respond to specific questions/objections to this post; rants/flames will be deleted. If things get too bad, we'll suspend comments temporarily.
I'm from Alabama, and I did not vote for Obama, but that does not mean that I am a racist. I'm sure that this is not what the article and the comment-leavers intend to imply, but there have been a few comments that carry the tune of "sad that more people didn't vote for Obama, hate to live in such a racist state." If I voted for someone else because I don't agree with some of his plans and viewpoints, does this make me racist? Because I've been told multiple times by multiple people that it does, even though I've never once brought up the issue of race. It's really not something I notice. I look at a person's character, not their biological composition, to decide whether or not I agree with them.
@Liv: In no place did our blog imply that simple disagreement with President Obama makes one racist (full disclosure: I didn't vote for Obama). But we are unequivocal about the fact that using the "N word" or other language used to systematically undermine the entirety of a race is, by definition, racist, especially when put in the context of attempting to delegitimate the election of a black president.
Although I find this topic important, I fail to see how you can make very strong conclusions from such a small amount of data. The statements "we took a 0.05% sample of all geocoded tweets in November 2012" and "we are measuring tweets rather than users and so one individual could be responsible for many tweets" make almost any meaningful conclusions difficult to support. I won't deny that there is racism here in AL; but I think we need an actual comprehensive study to describe it properly.
Edward, the 0.05 percent sample is for ALL tweets (i.e., not the racist ones) because otherwise it would be millions of tweets and our machines would crash. We're using this sample (n =~ 10000) to normalize the racist tweets.
The racist tweets are NOT a sample but the full population of tweets.
Just a curiosity question. Since you're only looking at geocoded tweets, does that mean the majority of them are coming from mobile devices? If so, could that be skewing the input data by favoring the more affluent, or is it your belief that the distribution of content (including racist remarks) is consistent between both geocoded and non-geocoded tweets?
Pretty sure the ~3 other tweets near Memphis should be assigned to TN instead of MO and MS.
Given the relatively small number of total tweets (395) and the fairly large areas you are aggregating to, as well as using counts regardless of user id, I'm kind of worried about MAUP with regards to this visualization.
I'm sure you can set my mind at ease, though. :)
What's your worry exactly? Happy to respond to anyone who can actually use the term MAUP.
Basically, a couple concerns.
First, MAUP. You're aggregating to the state level, but you're using your point placement as the centroid of whatever area range the geocoded tweet gave you. Now, obviously accuracy varies depending on a huge number of factors, and with large data sets this would probably even itself out (I like to hope, anyway). But with 395 total tweets, any border cases could have a significant impact on the results. For example, that cluster right around the Tenn/Miss border.
Second, and not MAUP-related, is the lack of control for user id. Again, with a larger dataset I don't think this would matter as much, but in this case it could. Over the course of a week one very vocal, very racist user in any given state location could account for upwards of 1% of your total tweet database (4ish tweets a day). I don't know if this is the case, and hopefully it isn't, but without actually seeing the raw data, I can't know.
Anyway, those are my concerns. I was monitoring election fraud vs. voter id during the election, I'll have those maps up later today or tomorrow. :)
(I think my name is displaying correctly now? OhNo is a spam account I use.)
1) our polygons from which the centroids are derived are generally fairly small so I'm not particularly worried about border issues but you raise a good point. We'll take a closer look. Even if there was some switching the LQs wouldn't change all that much. Still worth checking.
2) Take a look upstream re: user vs. tweets. Switching to users only drops the sample size about 12 percent and doesn't change the LQs that much. In fact it increases it in Alabama.
Great response, missed the switch to users.
For a second I thought the map legend said "IQ" =)
We should perhaps study that correlation.
Interesting study but curious. Were the geocode percentages compared on a state by state basis?
For example, if 60% of Alabama tweets are geocoded but only 20% of New York tweets are, you should think about accounting for that.
Useful point but unfortunately we don't know the total number of ALL tweets by geography...at present time that's really unknowable.
First, guys, this is awesome. Thanks for the work.
Second, your study does not come out of nowhere in noting the racist face of this campaign nor in singling out Romney-heavy states as prime offenders. The choice by the Republican party to play the anti-Black card has been widely known since John Kennedy kicked Southern Dems who opposed equal rights out of the party. There has also been much comment that this election may have been the racists' last hurrah. With the continuing shift in attitudes among younger Americans, one hopes so.
All y'all, thank you for this useful study.
This is an interesting study. Even if I think it's flawed. :^)
I'm afraid that anyone thinking racists are having a last hurrah anytime soon is dreaming, though. Racism won't go away until you can get rid of prejudice, fear of the unknown, and fear of that which is different. It can, however, be mitigated and marginalized. Then people will decide it's gone and it will be set free to rear its ugly head again.
This is a cool study.
That said, I'm curious as to why you used only geocoded tweets as opposed to just taking location info from user profiles. I realize that not all users provide this information but it has to be a higher proportion than those that geocode.
mcbot - our DOLLY project only collects geocoded tweets so that's the main reason. We've looked at using user supplied locations for other work and there are a number of issues. Yes, you end up with a larger sample but with less confidence in the location.
I'm new to the site so I'm not really aware of what the "DOLLY" project is, obviously something used for grabbing a stream of geocoded twitter data. Probably something that will improve over time as more and more tweets are geocoded.
That said, I don't think a search algorithm with some kind of heuristics designed to determine state would be very difficult. The simplest would just search for state and take it as true and ignore any users with no state information. More complicated ones would look for airport codes within the profile, then city name and scan the feed to try and tie the city name to the appropriate city of that name. I will note that 1. US users seem to be less likely to include location and 2. troll accounts probably tend to lie about location but spurious data points like that just have to be dealt with as spurious data points. Never 100% accurate but even with that caveat, having a sample size of 10,000 is nicer than 390.
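The "simplest" heuristic described in the comment above (look for a state in the profile text and ignore users with none) might be sketched as follows. Everything here is an assumption for illustration: the function name `guess_state`, the four-state lookup table, and the "City, ST" abbreviation rule are hypothetical and are not part of DOLLY or any Twitter API.

```python
import re

# Small illustrative subset; a real lookup table would cover every state.
STATES = {"alabama": "AL", "mississippi": "MS", "utah": "UT", "nevada": "NV"}
ABBREVS = set(STATES.values())

def guess_state(profile_location):
    """Return a two-letter state code guessed from free-text profile location,
    or None when no state can be identified."""
    if not profile_location:
        return None
    text = profile_location.lower()
    # 1. Full state name anywhere in the text.
    for name, abbr in STATES.items():
        if re.search(r"\b" + re.escape(name) + r"\b", text):
            return abbr
    # 2. Upper-case abbreviation after a comma, e.g. "Birmingham, AL".
    match = re.search(r",\s*([A-Z]{2})\b", profile_location)
    if match and match.group(1) in ABBREVS:
        return match.group(1)
    return None  # users with no recognizable state are ignored

print(guess_state("Birmingham, AL"))         # AL
print(guess_state("Oxford, Mississippi"))    # MS
print(guess_state("everywhere and nowhere")) # None
```

As the reply above notes, this trades a larger sample for less confidence in each location: the profile text is self-reported and may be stale, vague, or deliberately false.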
Ever see http://www.nohomophobes.com/ I'd like to see it tie tweet location together so stats can be collected in real-time.
I notice Utah (1 tweet) is ranked 2 gradations higher for racism than Arizona (3 or 4 tweets) even though the population of AZ is only a little more than twice that of UT. This can only be because twitter traffic in Utah was lower that night compared to AZ, which is fine, but I wonder what you've done to eliminate the possibility that, on election night, losers might tend to be less enthusiastic about tweeting their loser emotions than winners are about tweeting their winner emotions? In other words, there are racist tweets in both blue and red states, sometimes in and sometimes out of proportion with population, and I think it's possible that red states come out looking more racist mainly because there were fewer exuberant pro-democrat tweets that night to mask them. I think you can only eliminate the possible influence of winner vs loser tweet enthusiasm by adjusting somehow for the election night twitter volume in a state vs the normal twitter volume in that state, and you haven't tried to do anything like that. As a result, I don't think this is well controlled, and so it's unscientific, and I'd say an abuse of data, although I'd be interested in reading your explanation.
Ned, the data draws from Nov 1 to 7 not just the election night
Thanks for the clarification. Of course there's this: "Mapping Racist Tweets in Response to President Obama's Re-election" and this "Map of the Location Quotients for Post Election Racist Tweets" and this "During the day after the 2012 presidential election we took note of a spike in hate speech on Twitter referring to President Obama's re-election" so you can see how I might have been confused.
The clarification might kind of put to rest some of my concerns, but it still doesn't explain some of the oddness in your data, such as why you were able to capture about 22 times more tweets per capita in both Nevada and AZ than in Utah, for example. Possibly that's just really the difference in twitter usage between those states, but without some raw data to back it up, it smells fishy. Then too is the sort of quirky characterization of tweets that happened maybe as much as 5 days prior to election day as being "post-election", and as "referring to Obama's re-election". I don't know that a "quirk" like that is completely relevant to your main point, but it might sort of lead one to wonder just how often similar "quirks" might appear in your work.
Re: my last post, Utah vs Nevada might be a better example. Nevada had twice the racist hate tweets, the two states have nearly identical populations (actually Nevada is a bit lower), and yet, like AZ, Nevada ranks 2 gradations lower on your racism scale than Utah does. I think there's a flaw in your methodology when I see something like that. The only things I can think of to explain it are 1) the average person in Nevada tweets 4 or 5 times more often than the average person in Utah. Possible but I think unlikely. 2) the sample size is ridiculously low. Definitely true. 3) your method failed to normalize for large volumes of pro Obama tweets and as a result it systematically biased red states more racist than blue states. Sure enough, Nevada is a blue state, Utah is red.
Interesting study. I know that your intention was not to imply that those who voted for Romney are racists. It's unfortunate that a lot of left-leaning publications will point to this and say that everyone who voted for Romney is racist. On a side note, I only noticed 0.5 in Oklahoma, however the entire state (Romney won a majority in all 77 counties).
On the map, you should do an overlay to show where the anti-Romney tweets were coming from.
My other question is this, what percentage of geocoded tweets during this period were racist? Maybe that's there and I'm missing the forest for the trees...
Clarification: The sentence should read: On a side note, I only noticed 0.5 in Oklahoma, however the entire state voted for Romney (Romney won a majority in all 77 counties).
The numbers would have been higher had the survey included the other obvious racist keywords such as 'birth certificate' and 'Kenya'
I have to say, the map, the process and the reactions for this have all been fascinating. As a Vermonter however, I am curious what happened with our state?
Syna, Oops...it had zero hate tweets and so it inadvertently didn't make it to the list. Now it is fixed.
I would like to know ALL of the words used to define "hate" in this term. I could only find a few listed. It is well known, among those of us who study this, that the term "hate" is thrown around very loosely, just as "xxxphobe" is as well (replace xxx with whatever term is your favorite). Just b/c someone doesn't agree with your position (or candidate) doesn't mean they hate them, or are afraid of them. Maybe their policies and beliefs, but not them.
for my app, slurtracker.com, i'm using slurs from the slur database, with some primitive validation of context.
I made something similar and it’s still up: http://slurtracker.com
It maps tweets with slurs in them in real time, plotting them on a map.
I’m not using a data-analysis provider or anything, so my data is just real-time results. I never really tried to capture long-term results.
I'd be interested to see how this map compares to one that tracks anti-Mormon sentiment in tweets about Romney.
We've prepared an FAQ for many of the more common critiques/questions based on this post. See
I saw a tweet listed from my hometown, and all full of righteous indignation decided to look it up. It was an African American student at Illinois State University taking issue with the @MoriahRae1 tweet listed first by the Jezebel article, retweeting for context: https://twitter.com/Mr_LQ/status/266046992956944386 . So not all of these results are necessarily indicative of original expressions of racist thought; perhaps filtering for retweets would avoid this.
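The retweet filter suggested above could be approximated with a simple text check. This is only a sketch under a stated assumption: that retweets are recognizable by a leading "RT @" in the tweet text. It would miss quoted or hand-edited retweets, and the function names are hypothetical, not taken from the authors' pipeline.

```python
def is_retweet(text):
    """Heuristic: treat a tweet as a retweet if its text starts with 'RT @'."""
    return text.lstrip().startswith("RT @")

def drop_retweets(texts):
    """Keep only tweets that do not look like retweets."""
    return [t for t in texts if not is_retweet(t)]

# Hypothetical tweet texts.
tweets = [
    "RT @someone: original offensive text",
    "my own comment about the election",
    "  RT @another: more quoted text",
]
kept = drop_retweets(tweets)
print(kept)  # ['my own comment about the election']
```

Note that dropping retweets removes both endorsements and rebuttals, so it narrows the data to original posts rather than settling what any given retweet meant.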
I don't believe that most people that opposed Obama are racist. I do, however, think that a great many people are prejudiced. Many think because they work with a person of color, treat them equally, that that indicates they have no prejudicial feelings. But, when they go home, the only people they interact with look like themselves. They teach their children that all people are equal, but their children do not see them interacting or associating with other races. They have never been to a Quinceañera, a Bar Mitzvah, celebrated Juneteenth, or been to a Pow-Wow. In fact, they do not venture beyond their own cul-de-sac, to go to a hip hop concert or a country western dance. This sends a confusing message to children, and until we, as parents, start practicing what we preach, racism will never be obliterated. I recall a mother at Toys-r-Us, having a hissy fit when her daughter wanted a black Disney Princess doll, and she kept trying to coerce her into getting the white one. That little girl was being sent a coded message, no matter what her mother's good intentions were. My home, my daughter's home, is more like the UN. And I am so proud my mother gave me the freedom and the mind set, to pass those same freedoms to my daughter, which she has passed on to her children. Great article! Where do we go from here? And note, socioeconomic status does not necessarily mean one does not have the ability to tweet.
This is such BS. Not because White racists use Twitter, but because the number of racist Tweets from Obama supporters is not given equal scrutiny. From my experience, the Black racism on Twitter far exceeds the White racism. Racists will be racists, and The Left will be the Left, and engage in pretenses such as this that exhibit no interest in objectivity.
By all means, feel free to select an actual logical argument that refutes the statistical evidence. "Well, this is what I see on Twitter from my experience" is not an objective argument... it's conjecture.
Why is it assumed that racism only goes one way? Did you do any type of search with keywords like "white", "Romney", "Mittens", "Willard", "rich", or "lost" - all keywords in talking points used against the Republican candidate? That would be an interesting map, as well.
LMT makes a good point. I understood your survey to be anti-white in its perspective. There is another side, so why did you not present it also? I do not consider you any type of "news" organization when you present statistics this way. I have lived and toured in other countries in my life and I now live in Alabama. I have met whites who sound like they will go to their graves believing in segregation. But I have also met blacks who have little or no education, are drug abusers yet expect that society owes them more BECAUSE they are black. How can a reasonable person tolerate either mindset? Both sides have to mature in their way of thinking before racism begins to heal.
Quick question: "0.05% sample of all geocoded tweets in November 2012" equals how many tweets overall in the sample?
What is the national rate of hate tweet to tweet without these terms? Are you using only election-flavored tweets or any tweets in that time period? The reason I ask is that there were 32 million tweets on the election on election day, let alone Nov 1-6. And the average seems to be about 400 million tweets per day. So are you taking 0.05% of 2.4 billion tweets (400 million tweets per day) or are you taking 0.05% of 32 million + 5 days worth of election tweets (say 100 million or more, guesstimate).
In short, what is the number referenced here: "# of ALL Tweets in USA"?
To the previous three comments...please see the FAQ
I read your FAQ and your attitude indicated in "we focus on racist language directed at President Obama because racism directed at black Americans is not only historically more significant, but because it also highlights the persistence of explicitly racist attitudes in what some have (fallaciously) termed ‘post-racial America’" is that racism is caused by whites. It is not historically more significant. That's a cop-out, spin statement. I could believe that a black person could harbor some ill will just because of past practices alone, not to mention any negative lifetime experiences. I had a little more respect for your article before reading the FAQ. Now you've disrespected me as a white person. You need to get out of YOUR bubble and improve your perspective. BTW, I campaigned and voted for Obama in this Republican state.
The FAQ does not answer my question. It does not provide a number for how many tweets were in your national sample.
My reading of your blog post is that you found 395 racist geocoded tweets out of roughly 1-5% of roughly 2.4 billion tweets over 6 days, or 24-120 million geocoded tweets. The 2.4 billion is a guess based on 2012 average of 400 million tweets a day. But I'd like to NOT guess and have a figure for what your sample is, which is not 395.
The authors found 395 racist geocoded tweets about Obama from November 1-6. By racist, the authors mean white on black racism, as they followed the tweets’ content. (Obama is biracial, of course, but that seems not to matter to those posting on Twitter.)
400 million tweets per day is the 2012 tweet average; making total 2.4 billion, which is perhaps an understatement, but it’s close enough.
The authors estimate that geocoded tweets are 1% to 5% of total tweets.
That makes the pool anywhere from 24 million geocoded tweets to 120 million geocoded tweets.
They sampled 0.05%, so that’s a range of 12,000 to 60,000 tweets.
That makes the percentage of tech-literate racist tweets on election night 3.3% (roughly 1 racist geocoded tweet in 30 geocoded tweets) to 0.66% (or 1 racist geocoded tweet in 151.9 geocoded tweets).
The national average could be 36,000 tweets, 1 racist geocoded tweet per 91.13 geocoded tweets or 1.09%.
However the number of individuals involved must be slightly smaller than that range, because they’re counting tweets, not individuals, so as they state, 395 tweets come from 349 twitter accounts.
So that would give us anywhere from 2.9% to 0.58% of twitterers are racist; alternatively it ranges from 1 racist per 34.4 twitterers to 1 racist in 171.9 twitterers. The average of that range would be 1 in 103 or so.
So, another way of putting this is that maybe 1 in 103 or so tech-literate people are willing to anonymously use hate speech in an uncensored forum. Perhaps the reason this is trending is that people seem to have higher expectations of the tech-literate.
Is 1 in 103 hateful posters in an uncensored forum average, above average, or below average? (1 per 34.4? 1 in 171.9?) Do the authors of this study have any sources to help provide context for this? Academic studies on race online? On trolling rate in communities without moderators?
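The back-of-the-envelope estimate in the comment above can be reproduced with a short sketch. All inputs are the commenter's own guesses (400 million tweets a day, 6 days, 1% to 5% geocoded, a 0.05% sample), not figures confirmed by the authors. The sketch also prints the rate against the unsampled geocoded pool, because an earlier author reply in this thread states that the 395 racist tweets are the full population while the 0.05% sample applies only to all tweets; under that reading, dividing 395 by the sample size would overstate the rate considerably.

```python
TWEETS_PER_DAY = 400_000_000   # commenter's 2012 average (a guess)
DAYS = 6                       # November 1-6, per the comment
SAMPLE_RATE = 0.0005           # 0.05% sample of all geocoded tweets
HATE_TWEETS = 395
HATE_USERS = 349

TOTAL = TWEETS_PER_DAY * DAYS  # 2.4 billion tweets

def thread_estimate(geocoded_share):
    """Reproduce the comment's arithmetic for one assumed geocoded share."""
    geocoded = round(TOTAL * geocoded_share)
    sample = round(geocoded * SAMPLE_RATE)
    return {
        "geocoded": geocoded,
        "sample": sample,
        # The commenter's figures: hate tweets or users divided by the sample.
        "tweets_vs_sample_pct": 100 * HATE_TWEETS / sample,
        "users_vs_sample_pct": 100 * HATE_USERS / sample,
        # Alternative reading: full population of hate tweets vs. full pool.
        "tweets_vs_geocoded_pct": 100 * HATE_TWEETS / geocoded,
    }

low = thread_estimate(0.01)    # 1% of tweets geocoded
high = thread_estimate(0.05)   # 5% of tweets geocoded
for label, est in (("1% geocoded", low), ("5% geocoded", high)):
    print(label, est)
```

With these inputs the sample sizes come out at 12,000 and 60,000, matching the comment, while the rate against the full geocoded pool is a small fraction of one percent in both cases.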
I would like some clarification. Is this map intended to show racism against Obama as a result of his reelection, or racism in general relating to the election?
If it's the former, I think you should lead with that point as your argument against including relevant Romney tweets.
If it's the latter, then I make the following argument for including the Romney tweets:
1) While racism directed towards blacks is a historically significant trend, racism is racism and the only way to eliminate it is to recognize it directed towards all ethnicities. Similarly, racism is not defined as derogatory comments originating from the dominant group, but as the act of disparaging a person based on their ethnicity. Including racist comments directed towards Romney highlights the continuing issues of racial tensions the same as racist comments towards Obama. Residual anger towards an ethnic group for past slights is just as much an indicator of racism.
2) The implied weighting of slurs is concerning. Focusing only on slurs and racism tied to historical events implies that less prevalent forms of racism are acceptable. i.e. racism against Blacks and Jews is unacceptable, but it's okay to direct slurs at Gypsies or the Irish because the history isn't as visible.
3) It seems disingenuous to dismiss the Romney tweets because their total number is only a fraction of the tweets directed against Obama using the logic from my first point. Especially when the FAQ points out that positive tweets and users sending out multiple tweets with slurs were not eliminated for the sake of neutrality. If the goal is a complete picture of racist responses to the election without bias, shouldn't we include all relevant tweets?
Although I have a little concern over the sample size (something completely out of your control due to the necessary criteria), I think this is a great map that shows racism directed against a man by a minority of the opposition. If anything, it shows that racism exists even in areas traditionally considered progressive in American politics, albeit in much smaller quantities than areas traditionally associated with racism.
I just think you need to clarify the goal of the map a bit more. I think that will help quiet the rumblings about Romney tweets and the like. It would be interesting to see supplemental maps with Romney+Obama tweets to see how that affects the results and Romney only tweets, if only to see where they originated (I doubt there's enough data to make statistically relevant assertions).
Great map, and great work!
Thanks for all the comments and critiques. They have been much appreciated and have helped push our thinking on this post and future projects. At this point we're seeing more repetition on the same themes and because we've prepared the FAQ that answers these points we're going to turn off commenting. http://www.floatingsheep.org/2012/11/faq-mapping-racist-tweets-in-response.html
Besides it is the weekend and absolutely gorgeous outside so we're unplugging.