Showing posts with label user-generated content. Show all posts

March 25, 2013

What percentage of edits to English-language Wikipedia articles are from local people?

As part of our on-going efforts to explore the geographies of participation in Wikipedia, we have calculated the percentage of local edits to articles about places. In other words, this map illustrates the percentage of edits about any country that come from people with strong associations to that country.

For more on the method that we employed, have a read through the post on "who edits Wikipedia", in which the data collection efforts are explained in detail. The data are undoubtedly somewhat imprecise, but we are confident that they offer us the best overview of the geography of authorship that can be obtained with publicly-available data.

So what do these results tell us?

Unsurprisingly, they show that in predominantly English-speaking countries most edits tend to be local. That is, most edits to Wikipedia articles about the US (85%) come from the US, and most edits to articles about the UK likewise come from the UK (78%). The Philippines (68%) and India (65%) score well in this regard, likely because of the role that English plays as an official language in both countries. But why then do we see relatively low numbers in other countries that also have English as an official language, such as Nigeria (16%) or Kenya (9%)?

We also, interestingly, see relatively high local edit percentages from a handful of countries that don't count English as an official language: Finland (50%), Norway (56%), Romania (54%), and Bulgaria (53%).

Then we also observe large parts of the world in which very few English-language descriptions of local places are created by local people. Almost all of Sub-Saharan Africa falls into this category. The key question is whether these data actually tell us anything meaningful. For instance, just because most edits about the United States likely come from the United States does not necessarily mean that those articles are representative, include a diversity of viewpoints, or avoid excluding people, places, and processes.

But the data nonetheless, in a very broad way, do tell a story about voice and representation. Some parts of the world are represented on one of the world's most-used websites predominantly by local people, while others are almost exclusively created by foreigners, something to bear in mind next time you read a Wikipedia article.

October 17, 2011

Call for Participation: Survey on VGI

Antonella Rondinone and Monica have developed a survey to understand the discrepancies in who contributes user-generated geographic information. We have blogged about this before, but determined that there is no real answer as to why demographic differences exist in volunteered geographic information. So, please help us out and tell us what content you help generate.

Survey in English: https://www.surveymonkey.com/s/GeowebSurvey
Survey in Italian: https://www.surveymonkey.com/s/SondaggioGeoweb

Please pass this along to your friends, parents, neighbors, teachers, students, facebook-friends, twitter-followers, acquaintances, enemies, frenemies and everybody else that uses the internet.

August 30, 2011

Data Shadows of an Underground Economy

Following on from our "Price of Weed" maps featured in the September issue of Wired, we would like to make available the draft report that the maps came from. The full title of the paper is "Data Shadows of an Underground Economy: Volunteered Geographic Information and the Economic Geographies of Marijuana."

Please note that we are still working on the paper (so excuse any lack of polish), but would certainly appreciate any comments and critiques on the draft before we submit it for peer-review.


August 29, 2011

The Price of Weed

We’re very happy to report that a new FloatingSheep map is featured in the September issue of WIRED magazine under the title of "Infoporn: O Say, Can You THC?" [1]. Our map shows the differences in the retail price of marijuana based on user-generated reports from the PriceofWeed website. According to WIRED, it offers "a look at the sprawling gray market that gets some high and others heated."

Retail Price of Marijuana Cost Surface
Interpolated from points where n > 2
Green = lower prices; Yellow = higher prices.

One of the things that jumps out clearly is the low prices around the marijuana production sites in Mendocino, Trinity and Humboldt Counties in California, as well as in Kentucky and Tennessee. See this National Drug Intelligence Center report on the distribution of marijuana production by state.

The map featured in WIRED is taken from a much more detailed research paper focused on the potential for user-generated data to shed light on underground economies such as marijuana use. The map relies upon thousands of user reports on marijuana purchases referenced to city locations from the PriceofWeed website (see our earlier posting). After cleaning the data to get rid of outliers, we created a continuous surface using kriging, a statistical interpolation technique, with a spherical semivariogram model describing how the variance of price differences changes with the distance between points. To obtain a price for each location shown in the map above, an interpolated value was estimated as a weighted average of prices from its twelve neighboring points.
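For readers curious about the mechanics, here is a rough sketch of that interpolation step: ordinary kriging with a spherical semivariogram, using the twelve nearest points. The function names, the nugget, sill and range defaults, and the use of flat (planar) distances are all assumptions made for illustration; they are not the parameters fitted for our maps.

```python
import numpy as np

def spherical_semivariogram(h, nugget, sill, rng):
    """Spherical model: climbs from the nugget to the sill, flat beyond rng."""
    h = np.asarray(h, dtype=float)
    ratio = np.minimum(h / rng, 1.0)
    gamma = nugget + (sill - nugget) * (1.5 * ratio - 0.5 * ratio ** 3)
    # By definition the semivariance at zero separation is zero.
    return np.where(h == 0.0, 0.0, gamma)

def krige_point(xy, prices, target, k=12, nugget=0.0, sill=1.0, rng=1.0):
    """Ordinary kriging estimate at `target` from its k nearest points."""
    xy = np.asarray(xy, dtype=float)
    prices = np.asarray(prices, dtype=float)
    target = np.asarray(target, dtype=float)

    # Pick the k observation points closest to the target location.
    dist = np.linalg.norm(xy - target, axis=1)
    nearest = np.argsort(dist)[:k]
    pts, vals = xy[nearest], prices[nearest]
    n = len(nearest)

    # Semivariances between every pair of neighbours, plus a Lagrange
    # row and column that force the weights to sum to one.
    pair_dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = spherical_semivariogram(pair_dist, nugget, sill, rng)
    A[n, n] = 0.0

    # Semivariances between each neighbour and the target.
    b = np.ones(n + 1)
    b[:n] = spherical_semivariogram(dist[nearest], nugget, sill, rng)

    weights = np.linalg.solve(A, b)[:n]
    return float(weights @ vals), weights
```

The estimate is just the weighted average `weights @ vals`; the semivariogram only determines how much weight each of the twelve neighbours receives.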

One of the issues in generating these maps is how many observations we would require at each point (or city) before including it in the interpolation. Increasing the number of observations (e.g., n > 10) helps control error in the average price at each point but limits the number of points. Lowering the sample size requirement (e.g., n > 2) results in more points upon which to base the interpolation but increases price variance. In order to visualize these differences, compare the map above (n > 2) with the map below (n > 10). While the first map shows a finer resolution of price variation (albeit with a decrease in the accuracy of the pricing data), it is consistent with the patterns resulting from the rougher resolution in the second map.
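The trade-off between the two thresholds can be sketched in a few lines of Python. The function name and the reports below are invented for illustration; they are not our data or our actual cleaning code.

```python
from collections import defaultdict
from statistics import mean

def city_average_prices(reports, min_reports):
    """Mean price per city, keeping cities with more than min_reports reports."""
    by_city = defaultdict(list)
    for city, price in reports:
        by_city[city].append(price)
    return {
        city: mean(city_prices)
        for city, city_prices in by_city.items()
        if len(city_prices) > min_reports
    }

# Invented (city, price) reports: A has 3, B has 11, C has 2.
reports = (
    [("A", 300), ("A", 320), ("A", 310)]
    + [("B", 250)] * 11
    + [("C", 400), ("C", 420)]
)

loose = city_average_prices(reports, min_reports=2)    # keeps A and B
strict = city_average_prices(reports, min_reports=10)  # keeps only B
```

The looser rule leaves more points to interpolate from, but some of those averages rest on only a handful of reports.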

Retail Price of Marijuana Cost Surface
Interpolated from points where n > 10

We’re in the process of finalizing a paper analyzing these data, including a state- and city-level multivariate analysis of price. Key explanatory variables in the models include the legality of medical marijuana, the level of production and an intriguing distance decay effect as one moves away from Northern California. You can download the draft paper at this link. As always we welcome questions and critiques.

--------
[1] You know that someone had a lot of fun coming up with this title.

May 02, 2011

GoogleMapMaker and Osama Bin Laden

As I was preparing another posting about Google Maps MapMaker (coming in a few days) I noticed that there is a lot of user editing activity around Abbottābad, Pakistan, the site where Osama bin Laden lived and was killed. There are two sites that have been "selected" as likely locations for the compound, although it is not clear if either is the actual location. More generally, the town of Abbottābad has received many tags and reviews via Google Maps in reaction. Click on some of the placemarks below to get a sense of these.



The embedded map below shows one of the locations (see here for a review of the other location), which is actively being tagged. Some of the tags are relatively terse and factual ("SEAL TM 6 OBL Take-Down Site"), while other postings are celebrating the event. And since it is the Internet, there is plenty of random flaming and spamming going on as well.

The current state of the site can be accessed below (be sure to click on "What's around here" to see all the tags, which will likely be cleaned up over time). Below that are a couple of screen captures of some tags that I saw.




Screen Capture, Google Maps, May 2, 2011 @ 3:56 pm EST
Tag labeled "Bin Laden Pot Farm"



Screen Capture, Google Map Maker, May 2, 2011 @ 3:42 pm EST
Large Arrow Polygon labeled "He is Here" and tag labeled "Future Taco Bell Location"


Screen Capture, Google Map Maker, May 2, 2011 @ 3:44 pm EST
Tag labeled "Osama International Heliport"

November 30, 2010

Geographies of Wikipedia in the UK

After a lot of data cleaning and number crunching, we are able to present the following three maps of the geographies of Wikipedia in the UK using brand new November 2010 data. Looking at the first map (total number of articles in each district), we see some interesting patterns. With a few exceptions, it is rural districts in Scotland, Wales and the North of England that are characterised by the highest number of articles.

What we're likely picking up on is the fact that large districts simply have more potential stuff to write about. If we normalise the map by area we see an entirely different pattern. The map below displays the number of articles per square km.

We see that most of the large urban conurbations in the UK are covered by a dense layer of articles. Most sparsely populated areas, in contrast, have a much thinner layer of virtual representation in Wikipedia. There are, however, some notable exceptions. Parts of Cornwall, Somerset and the Isle of Wight all have a denser layer of content than might be expected for such relatively rural parts of the country. On the other hand, one might expect a higher density in the districts surrounding Belfast (in fact almost all of Northern Ireland is characterised by very low levels of content per square km).

Finally, we can look at the number of articles per person in each district:

Here some more surprising results are visible. All major urban areas have relatively low counts of articles per person (with the exception of central London). In contrast, many rural areas (particularly areas containing national parks) have high counts per person.

There is obviously a range of ways to measure the geographies of Wikipedia in the UK. We see that some areas are blanketed by a highly dense layer of virtual content (e.g. central London and many of the UK's other major conurbations). These maps also highlight the fact that some parts of the UK are characterised by a paucity of content irrespective of the ways in which the data are normalised. Northern Ireland in particular stands out in this respect.

We'll attempt to upload similar analyses of other countries in the next few months. In the meantime, however, we would welcome any thoughts on the uneven amount of virtual representation that blankets the UK.



p.s. many thanks to Adham Tamer for his help with the data extraction.

October 15, 2010

More Flickr Mapping

Building on our visualisation of 34 million geotagged Flickr images, we have decided to map the data normalised by population and area. In doing so, some quite interesting patterns are evident.

Flickr Images per 100,000 people
Flickr Images per 100 square km


Predictably, we see some of the same core-periphery patterns that are observable in other types of user-generated content (e.g. Wikipedia). More surprising is the fact that, unlike the geography of Wikipedia content, there are a significant number of low-income countries with relatively large amounts of content (i.e. images) per 100,000 people and per 100 square km. Cambodia, Oman, Namibia, South Africa, Nepal and a host of other countries all score highly using these normalised measures.

I would hypothesise that two factors are at play here. First, there are lower barriers to entry on Flickr versus Wikipedia. In other words, despite the openness of Wikipedia, it is still easier to upload geotagged photos to Flickr than to create a new article and defend its existence against nominations for deletion and overzealous editors. Moreover, the binary developed vs. developing country division has always masked the range of differences between and within countries (the comparison between Oman and Yemen is an interesting example).

Second, it is also probable that much of the content in low-income countries is created by visitors and tourists. For instance, a significant number of photos geotagged to Cambodia are likely tourist shots of the Angkor Wat temple complex rather than locally created scenes of more everyday events.

Whatever the reasons are, more research is clearly needed on the topic to uncover what the specific biases in authorship are. Furthermore, irrespective of the specific reasons, it remains that these maps continue to show significant unevenness in user-generated content around the world.

For further reading see:

Graham, M. 2010. Neogeography and the Palimpsests of Place. Tijdschrift voor Economische en Sociale Geografie 101(4): 422-436.

Zook, M. and M. Graham. 2007. The Creative Reconstruction of the Internet: Google and the privatization of cyberspace and DigiPlace. GeoForum 38(6): 1322-1343.

July 26, 2010

Mapping Flickr

Today's map is a visualisation of all 34 million geotagged Flickr images. The data were kindly collected by Eric Fischer and then aggregated to the country-level (an operation that took our computer about three weeks to process!).

You can see that user-generated images in Flickr display similar geographies to other types of user-generated content (e.g. Wikipedia). In the next few weeks, we'll upload some variations of this map. These results aren't necessarily surprising, but they do reinforce findings that the subjects of user-generated content are highly concentrated in only a few parts of the world.

July 05, 2010

Wikipedia and Internet Use

The following map displays the total number of Wikipedia articles normalised by the number of internet users at the country level. The countries with the highest number of articles per 100,000 internet users are Nauru (4667), the Central African Republic (1253) and Myanmar (824). In fact most of the places that score highly by this measure, like the countries listed above, have extremely low levels of internet use per capita.

In contrast, countries with higher levels of per-capita internet usage tend to have far lower rates of Wikipedia articles per 100,000 internet users (e.g. the United Kingdom (70) and France (67)). While it is entirely possible that the high rates of articles per internet user in some countries are an indication of dedicated Wikipedia editors, it seems more likely that Myanmar, the Central African Republic and most other nations with low levels of internet penetration are being represented by editors from outside of their boundaries.
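To see why a small online population inflates this measure, consider a quick sketch of the normalisation. The function name and all of the figures below are invented for illustration and are not taken from our data.

```python
def articles_per_100k_users(articles, internet_users):
    """Articles about a country per 100,000 of its internet users."""
    return articles * 100_000 / internet_users

# Two hypothetical countries with the same number of articles but very
# different numbers of internet users (all figures invented).
well_connected = articles_per_100k_users(articles=5_000, internet_users=10_000_000)
poorly_connected = articles_per_100k_users(articles=5_000, internet_users=100_000)
```

With identical article counts, the poorly connected country scores one hundred times higher simply because the denominator is one hundred times smaller.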

June 21, 2010

Sheep Happens: Finding the "Big Six" of the Farmyard

It should come as no surprise (given the name of this blog) that we're a bit fond of sheep (hey, but not in that way). In that vein, we thought it would be interesting to see if the rest of the world shares this predilection.

So, harkening back to the range wars of the American west we searched for the terms chicken, cow, goat, horse, pig and sheep. These animals were selected primarily by what showed up in my daughter's Old McDonald Had a Farm book (this is known more formally as consulting an indigenous source). Although not quite as charismatic as the "big six" of safaris (e.g., elephants, rhinos, lions, cheetahs, hippos and giraffes), the "big six" of livestock makes up for it with our ability to imitate the animal sounds. I challenge anyone to do a giraffe call right now…anyone? I thought as much.

In any case, the distribution of the "big six" at the global level is shown below. Right away we can see that "horse" (in yellow dots) cuts a wide swath through the world; a powerfully pedantic plethora of plentiful placemarks ponyness! Although I'm not quite sure what that last phrase means. Sometimes alliteration wins out over sensibility (I'd apologize but you knew what you were getting into by reading this blog).


Chickens (green dots) seem to be doing OK at the global level, but we fear for the sheep. At a whole range of levels. After all, they were the unwitting (albeit idiot) chorus that drowned out any rational conversation in Orwell's Animal Farm. Hmmm…some interesting parallels with modern politics.

Luckily, the expected center of sheep, the veritable stronghold of storied sheepiness - New Zealand - is well represented with a wooly covering of orange dots. Australia (at least when you get away from the beaches) is not doing too bad either; apparently wool is not what the Aussies wear at the ocean.


The United States replicates the global pattern of horses and chickens. Since the U.S. headquarters of Floatingsheep is surrounded by thoroughbred farms and a mere hour north of the birthplace of Kentucky Fried Chicken, we interpret this as a sign that Kentucky's plans for world domination are well in hand. Just you wait.


We are also relieved to see that the U.S. has a few pockets of sheep out west, but clearly the cows and pigs never had a chance. And the less said about goats, the better.

The most interesting distribution, however, is within Europe which, despite being a very horsey place, still represents a fine figure of fascinating farmyard frontiers. Firstly, we must note the goodly concentration of sheep in Wales and Scotland; no surprise there, but heartening nonetheless. More startling is the popularity of sheep in France (including the island of Corsica). Who knew that amidst the foie gras, frog legs and escargot that such love of sheep was buried?


Alas, the news is not all good, for the pigs have secured a beachhead on Brittany with a thin powerful column heading directly towards the heart of France. What's more, the well established German pig passage (perambulating from Hamburg to Dusseldorf) appears poised to pierce the protective pockets of French Sheepdom!

So sadly, while we cannot foresee (ahem) flights of sheep everywhere, the pigs have not gained control as of yet! Sheep of the world, Unite! You have nothing to lose but your fleece!

Beasts of England, Beasts of France-land,
Beasts of every land and clime,
Hearken to my joyful tidings
Of the Golden future time.

Soon or late the day is coming,
Tyrant Pig shall be o'er thrown,
And the fruitful fields of our lands
Shall be trod by sheep alone.*


* Apologies to George Orwell.

June 01, 2010

The Geotaggers' World Atlas (and cyberscapes, too!)

Having just stumbled across another amazing visualization of geotagged photographs, we figured we'd go ahead and share more of the stuff we've been looking at these days. The following map comes from Eric Fischer's The Geotaggers' World Atlas on Flickr, which, you guessed it, maps geotagged Flickr photos. What's so unique about Fischer's series of maps is that he focuses on how fast the photographer was moving when they took the picture by comparing time and date stamps on geotagged photos.

Geotagged Flickr Photos in San Francisco by Eric Fischer

In his maps, black lines indicate walking speed (less than 7mph), while red lines approximate bicycling speed (less than 19mph), blue is for motor vehicles on normal roads (less than 43mph) and green indicates freeways or rapid transit. Based on the repetitive tracing, it's possible to see the places within each city that have been photographed and geotagged most frequently. So how might these concentrations of geotagged Flickr photos compare to our maps of urban cyberscapes around the world?
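As a rough sketch of how such speed bands could be derived from pairs of geotagged photos, the following Python computes an implied speed and maps it to the colours described above. The great-circle (haversine) distance, the function names and the (lat, lon, datetime) tuple layout are our own assumptions; we don't know the details of Fischer's actual code.

```python
import math
from datetime import datetime

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def speed_mph(photo_a, photo_b):
    """Implied speed between two (lat, lon, datetime) photos.

    Returns None when both photos carry the same timestamp.
    """
    hours = abs((photo_b[2] - photo_a[2]).total_seconds()) / 3600
    if hours == 0:
        return None
    miles = haversine_miles(photo_a[0], photo_a[1], photo_b[0], photo_b[1])
    return miles / hours

def line_color(mph):
    """Map a speed to the colour bands described in the post."""
    if mph < 7:
        return "black"   # walking
    if mph < 19:
        return "red"     # bicycling
    if mph < 43:
        return "blue"    # motor vehicles on normal roads
    return "green"       # freeways or rapid transit

# Two hypothetical photos taken ten minutes apart in San Francisco.
first = (37.7749, -122.4194, datetime(2010, 6, 1, 12, 0))
second = (37.7849, -122.4194, datetime(2010, 6, 1, 12, 10))
color = line_color(speed_mph(first, second))
```

Here the two photos are about 0.7 miles apart, so the implied speed of roughly 4 mph falls in the walking band.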

All User-Generated Google Maps Content in San Francisco

Although the purpose and scale of these two visualizations are different, they both show a roughly similar concentration of user-generated content (in either Flickr or Google Maps) around Market Street in San Francisco. Since Fischer did this exercise for 50 different cities around the world, some of which we've already mapped using our own method, the comparisons between the two can go on and on.

Let us know if you find anything else interesting!

May 27, 2010

World touristiness map

Ahti Heinla has produced a fascinating mashup that allows people to view the density of Panoramio images across the planet. The patterns in his visualisations look remarkably similar to the geographies of content in Google and Wikipedia that we have previously mapped out.

Head over to Ahti's site for the interactive map, or play with the KML file that he has also usefully made available.

May 18, 2010

Mapping the Bluegrass cyberscape

Although it's been quite a while since we last posted our metro-level cyberscape visualizations, we figured that now was as good a time as any to bring them back. In some of our previous posts, we mapped the total number of user-generated Google Maps placemarks in our sample cities, along with some Crescent City culture-specific maps of New Orleans for Mardi Gras and other interesting examples from around the world.

Below you'll find maps depicting the location of all user-generated placemarks (using the keyword "1") and placemarks referencing "crime" in Lexington, Kentucky. Although Lexington doesn't hold much, if any, significance for most of our readership, it presents an excellent opportunity to ground truth these virtual references by comparing them to our collective experiences as current and past residents of "the Horse Capital of the World".

All User-Generated Content in Lexington KY

User-Generated References to "Crime" in Lexington KY

In the first map, the highest concentration of placemarks exists in downtown Lexington. More specifically, the points with the most placemarks (shown in red) are at the intersection of Limestone and Main Streets, a primary intersection in the city and the site of Phoenix Park (formerly the Phoenix Hotel) and the city's courthouses.

While the spatial pattern of all user-generated content is not surprising in the least, and largely mirrors what has been seen in other urban areas, the concentration of placemarks referencing "crime" is significantly more interesting. Rather than being a mirror of the more general pattern focused on the city center, placemarks referencing crime are focused on the Kirwan-Blanding residential complex on the University of Kentucky's South Campus.

Although this concentration isn't necessarily surprising, given the fact that the Kirwan-Blanding complex has been the site of some significant violent crimes, along with almost innumerable incidents of public intoxication and drug possession, this does represent an important deviation from common patterns of concentration within city centers, as was evidenced by the map of all placemarks in Lexington.

March 19, 2010

How does the density of placemarks vary across space?

One of the most fundamental questions in our research is also one of the most basic. How does the density of placemarks vary over place? Back in June 2009, we took an initial look at information inequalities but had to rely on keyword searches for "0" and "1" (based on the assumption that there would be no particular spatial bias to these terms) as proxies for the total amount of content produced about a place. It worked fairly well but was less ideal than we hoped.

Recently it became possible to conduct wildcard searches (using the "*" operator) and this post revisits the same question: how does the density of cyberscape vary across locations? We conducted a wildcard search at approximately 260,000 points on the Earth's surface and collected the total number of placemarks indexed there. As always, a direct observation is preferable to a proxy measure so we're quite excited by these maps.

One sees that the United States contains the most placemarks (77 million) with almost twice as many as China which has 43 million. The only other countries that also have over ten million placemarks are the usual suspects when it comes to technology use: Germany, Japan, the UK, France and Italy. However, looking at the raw number of placemarks per country only tells part of the story. So, we decided to normalize these data by population and area. In doing so, some interesting patterns emerge.

Most countries in western Europe have extremely high levels of user-generated content per person despite having fewer placemarks than countries like China or the US. Denmark in particular stands out as having the world's highest ratio of placemarks per person. We're not sure why the Danes are so well represented in cyberscapes. Perhaps Danes have the perfect combination of high levels of disposable time and income to allow them to engage in the construction of user-generated content (the country has the world's highest level of income equality, a large welfare state and one of the highest levels of internet access). An alternate theory (which we're not putting a lot of store in) rests on the well established fact that all things internet-related can usually be explained by pornography. Denmark was the world's first country to legalize pornography and, as such, it stands to reason that they have a head start when it comes to producing content for the internet. We should point out that we haven't yet had a chance to explore the actual content that the Danes are producing.

Moving swiftly on, it is remarkable that China, despite being home to 1.3 billion people, continues to have a relatively high ranking when the data are normalized by population. The finding is a testament to the enormous amount of content being created about China. Interestingly, in many of our maps so far, China has not shown up very strongly, but this is likely connected to our focus on English search terms. For instance, we're currently searching using the Chinese characters for temple, which is producing some interesting patterns that are also much denser than the searches on the English word temple.

Finally, we decided to normalize the data by area. Here, very different patterns emerge. Small, densely populated countries like the Maldives and Singapore rise to the top of the list. Much of Europe as well as Japan and South Korea also stand out as having a large number of placemarks per square kilometre.

These maps show that there is no single way to represent the multiplicity of the world's cyberscapes. Depending on the particular way that these cyberscapes are measured and normalized, some quite different results can be found. And yet, irrespective of how the data are measured, a general 'digital divide' can be observed in these virtual representations of place. Western Europe, North America and parts of East Asia are represented by a significant amount of virtual content, while much of the rest of the world (in particular most of Africa and the Middle East) remains, both literally and figuratively, off the map.

March 17, 2010

Mapping Christianity

Last week's New Technologies and Interdisciplinary Research on Religion was a fascinating collection of work in this area. Historians, data visualizationists, linguists, sociologists, economists, etc. presented on a wide range of topics which really worked well together. You can find our presentation here.

So after the last week of alcohol and drug related postings I guess you can say that we've found religion! Hallelujah! And returning to our earlier analysis of the cyberscapes of religion, the following three maps take a more fine grained look at representations of Christianity on the internet.

The first map displays references to four types of Christianity (Catholic, Orthodox, Pentecostal and Protestant) at a global scale. Vivid patterns are visible on this map. References to "Catholic" dominate in many places. Of course, those who are making placemarks may be more likely to refer to a specific Protestant denomination (e.g., Methodist, Baptist, etc.) which would serve to overstate the level of Catholicism.

However, there are clear clusters of the three other types of Christianity. Most interesting is the fact that references to "Pentecostal" are more visible than references to "Catholic" in most parts of Brazil (and large parts of South America) despite the fact that almost three-quarters of Brazilians identify as Catholic. Part of the issue is likely down to the fact that we thus far have confined our searches to English-language terms and are therefore missing out on all the references to Catholicism in Portuguese and Spanish. However, it is intriguing that Pentecostalism is so visible in Brazil (perhaps because it is rapidly growing in popularity in the region).

Taking a closer look at Europe, there is a fascinating split between Orthodox Eastern Europe, Protestant Germany, and Catholic everywhere else. In places such as the UK that contain more Protestants than Catholics it is likely that people aren't using the actual term "Protestant" as a signifier of their religion.

To combat this issue of Protestantism being an overly general term that few people associate with, we also looked at a broader range of terms related to Christian denominations in the US and discovered patterns that are incredibly clear. Catholics are most visible in much of the Northeast and Canada, with Lutherans taking the Midwest, Baptists the Southeast, and Mormons unsurprisingly taking much of the mountain states. Methodists, interestingly, seem to be most visible in a thin red line between the Southern Baptists and everyone else. The obvious (and farcical) question is: against whom are they forming a defensive barrier?

Our readers might also be interested in the fact that there are parts of the country in which the Amish are most visible in religious cyberspaces: a somewhat surprising finding given the fact that they are not supposed to be using contemporary technology - let alone be annotating Google placemarks.

March 01, 2010

Rich and Poor Placemarks

So what happens when you search for user-generated placemarks containing the words rich and poor? We didn't know but now we do.

Overall the world of user-generated data seems to be a fairly rich place. Which is not altogether surprising, since the ability to even create a Google placemark (access and ability to use a computer) suggests a certain level of affluence in a world where half the population lives on less than $2.50 a day. That's one reason why much of the globe doesn't have any placemarks at all.

Global Map of Rich and Poor


So it makes most sense to examine more closely the places that do. The East Asian countries of Japan, South Korea (note the clear difference with North Korea) and Taiwan are mostly spotted with "rich" placemarks. Likewise in China (which doesn't have many placemarks in general, a topic for another posting) "rich" is associated with the wealthy coastal regions and economic powerhouses such as Shanghai, Fuzhou and Guangzhou.

East Asian Map of Rich and Poor


Moving westward one sees that Europe is much more placemarked (is that even a word?) in general than Asia. But within this, there are interesting patterns as one moves south, east and north from the historic core of Europe. France, the Benelux countries, Germany and Italy systematically have more placemarks referencing rich than poor. But as one moves into the areas of Spain/Portugal and Greece/Turkey, the pattern becomes more varied. There are fewer placemarks in general, and those that do exist are more likely to have references to poor. Perhaps the most striking example is Britain, where the core region around London is tagged as rich and, as one moves northward, there is an increasing number of placemarks referencing poor.

European Map of Rich and Poor


The pattern in the North American context is much less clear. One can see the Northeast (stretching from Massachusetts to DC) is primarily tagged as rich. This tendency toward rich is mostly maintained along the entire coastline. Moving inland, the patterns become much less clear, with the rest of the country seeming to be a nearly equal combination of rich and poor.

U.S. Map of Rich and Poor

February 10, 2010

Where Users Like to Vacation

Over the past few months, we've published a number of maps showing the automatically- and user-generated online representations of place, from the seedy to the holy to the hoppy. Perhaps you've found yourself thinking, "I'd sure like to go there!", wherever there may be. So where exactly is it that people want to go?

The following maps show the incongruities between these automatically- and user-generated representations of place when searching for "tourism" and "vacation" in Google Maps. The values in each of the four maps were normalized using the national average for each search term, and any point whose value was not at least 20% greater than that average (that is, any point with an indexed value of 1.2 or less) was excluded. These maps thus specifically show the places in which there is a higher-than-average concentration of placemarks (either user-generated or directory) mentioning the words "tourism" or "vacation".
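For readers who want to see the arithmetic, here is a minimal sketch of that kind of normalization in Python. It is not our actual processing script: the function names, the sample counts, and the choice to treat the "national average" as a simple mean over all points are assumptions made for illustration.

```python
def index_by_average(counts):
    """Divide each point's count by the mean count over all points.

    An indexed value of 1.0 means "exactly average".
    Assumption: the national average is a plain mean over the points.
    """
    average = sum(counts.values()) / len(counts)
    return {point: count / average for point, count in counts.items()}


def keep_above(indexed, threshold=1.2):
    """Keep only points whose indexed value exceeds the threshold."""
    return {point: value for point, value in indexed.items() if value > threshold}


# Hypothetical placemark counts for one search term at four points.
counts = {"A": 10, "B": 30, "C": 20, "D": 60}

indexed = index_by_average(counts)   # mean is 30, so B indexes to 1.0
mapped = keep_above(indexed)         # only points more than 20% above average
print(mapped)
```

With these made-up numbers, only point D (twice the average) survives the cutoff, which is why so much of the map ends up blank.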

Tourism: Directory

Tourism: User-Generated

Perhaps the starkest contrast between these maps of tourism is the much smaller number of user-generated placemarks as compared to the automatically-generated directory placemarks, usually drawn from pre-existing sources like the Yellow Pages. In moving from directory to user-generated representations, almost all rural locations disappear from the map, although the vast areas west of the Mississippi River with no information at all show that even some urban areas don't possess larger-than-average amounts of tourism-related information.

Vacation: Directory

Vacation: User-Generated

Shifting our attention to searches for "vacation", it is interesting that in this case, user-generated representations still have considerable coverage across the United States. Moreover, user-generated references to vacation differ from the "official" map of vacation based on Google Maps directory listings.[1] That is, "vacation" shows up most often in New York City in the Google Maps directory, but user-generated representations show that Orlando, Florida, the home of Disney World, is the place to go on your coveted break each year.

God help us all.

Take note as well that coastal areas all across the United States are prominent in the peer-produced constructions of vacation, from the coastal Carolinas and Georgia to the Gulf Coast, and even throughout California, Oregon and Washington. So perhaps there is hope of eluding our mouse overlords after all.

Most importantly, these maps call our attention to the significant variances in how place is perceived online, depending on what measures are being used to represent these constructions. Even if it's possible to dig a hole through the planet on Google Earth, the difference between, and within, places remains as important as ever.

[1] This is also one of the few cases in which the maximum value in a map deviates from one of the nation's largest urban areas.

February 03, 2010

Informal versus OFFICIAL Fun & Vacation

In some of our previous explorations of user-generated representations of place, we looked at the "funnest" places in North America. Although Toronto was by far the "funnest", when normalized by population, Cape Cod Bay looked like the best place to go for a good time (because, you know, nobody actually lives in the bay). In order to get a better grasp on where the fun is and from whence it comes, we compared our previous data on user-generated Google Maps placemarks mentioning the word "fun" with listings in the Google Maps directory that mentioned "fun". In short, how do informal notions of fun (user-generated) compare to OFFICIAL fun (directory listings)?

Locations of Informal (user-generated) and Official (directory listing) Fun

Perhaps unsurprisingly, Toronto again appears prominent because of the prevalence of informal, as opposed to official, "fun". Toronto is largely an anomaly among urban areas in North America, as most cities are decidedly tilted towards fun of the official variety. Clearly we need to do some fieldwork in Toronto!

Likewise a considerable area in the rural western U.S. also displays a favoritism towards user-generated/informal fun. Upon further examination, many of the areas displayed in orange above correlate to the locations of US National Parks and National Forests. Because there would be few, if any, directory listings in these protected areas (as opposed to urban areas, which would have a much larger directory), user-generated placemarks are more prevalent than those generated automatically using sources like the Yellow Pages.

Also of interest is the high correlation between the differences between user-generated and directory content for "fun" and the differences between user-generated and directory content for "vacation" (see below). Here again wide swaths of the western US have more user-generated vacation references than directory content, despite the general trend across the U.S. and Canada being the opposite. One site that shows up prominently as a cluster for user-generated fun and vacation is Wall, South Dakota, home to the famous Wall Drug. Many a weary traveler driving across the country on I-90 has sought a few hours of refuge/distraction here. And apparently many have chosen to document it as well.

Locations of Informal (user-generated) and Official (directory listing) Vacation

So even if only in the relative prevalence of user-generated representations of places that are both fun and good for vacationing (don't they go so well together?), rural areas have found their place in the American cyberscape.

January 24, 2010

Where do people 'Make it Rain'?

I make it rain. I make it rain on them.
-Fat Joe featuring Lil' Wayne, "Make it Rain"

No surprises here (except for FloatingSheep's mastery of slang). The folks in Las Vegas make it rain. No, not precipitation. The kind defined by the Urban Dictionary as "When you're in da club with a stack, and you throw the money up in the air at the strippers. The effect is that it seems to be raining money." Indeed.

It shouldn't startle anyone that the largest city in the only US state where prostitution is legal also has the most user-generated references to strip clubs. Contrasting its usual ranking in the urban hierarchy of user-generated geographic information (i.e., somewhere in the middle), Las Vegas is undoubtedly considered by the collective intelligence of the Internet as the place to go to see the clothes come off.

But it is also clear that this phenomenon is national, with clusters of strip club references throughout the U.S. and Canada; Florida, Chicago, Detroit, Toronto, Montreal, New York-New Jersey (Bada Bing!) and Portland stand out in particular. Does Las Vegas retain its penchant for seedy entertainment when the raw number of hits is normalized by both the average number of mentions of 'strip clubs' in user-generated placemarks and the relative specialization at each point (values divided by the number of mentions of "1")?

Even when the raw values of user-generated placemarks are normalized by these two measures (with values showing less-than-average specialization excluded), Las Vegas remains the national hotbed for strip clubs by a considerable margin. But what explains the relative prevalence of strip clubs in the area around Aiken, SC? Or most of Connecticut, for that matter?
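For the curious, the specialization measure can be sketched roughly as follows. This is an illustration under stated assumptions rather than our actual workflow: the function names and sample numbers are hypothetical, we assume that hits for "1" act as a stand-in for the overall volume of placemarks at a point, and we assume that "less-than-average" is judged against a simple mean of the specialization values.

```python
def specialization(term_hits, baseline_hits):
    """Ratio of hits for a term to hits for the baseline term at each point.

    Points with no baseline hits are skipped to avoid dividing by zero.
    """
    return {
        point: term_hits[point] / baseline_hits[point]
        for point in term_hits
        if baseline_hits.get(point, 0) > 0
    }


def keep_average_or_better(values):
    """Drop points whose value falls below the mean of all values."""
    if not values:
        return {}
    average = sum(values.values()) / len(values)
    return {point: value for point, value in values.items() if value >= average}


# Hypothetical hit counts: a keyword search and the baseline search for "1".
term_hits = {"X": 50, "Y": 50, "Z": 5}
baseline_hits = {"X": 100, "Y": 1000, "Z": 0}

ratios = specialization(term_hits, baseline_hits)
hotspots = keep_average_or_better(ratios)
print(hotspots)
```

In this made-up example X and Y have the same raw number of hits, but X is far more specialized because it has a tenth of the baseline content, so only X stays on the map.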

Clearly further research is needed but that's NOT what we mean. We're more than content to let it remain one of life's little mysteries for now.

January 13, 2010

Floatingsheep in the Lexington Herald-Leader, or Santa Likes it Hot

Sometimes analog is better than digital. This is especially true when being featured on the front page of the Lexington Herald-Leader on Christmas Eve. Belated scans of the newspaper article about our Christmassy maps are below...