I just returned from Madrid, where Jihyun Kim and I attended the Visualizar ’09 workshop at the Media Lab Prado. Jihyun and I are the authors of the FLUflux project, which was accepted for production at the workshop. The production process took place over the last two weeks, from November 14th to the 26th. Click on the image below to open the resulting visualization tool, and keep reading for the details about this visualization.
The premise of Visualizar ’09 was to use data from public administrations or scientific research with the goal of facilitating its use, comprehension and presentation. Following it, we focused on exploring the correlations between international flight travel data and the outbreaks of global diseases during the last decade: SARS, Avian Flu and the still ongoing pandemic of influenza A (H1N1). Our assumption was that worldwide pandemics such as these can affect the flow of people travelling between countries, especially when those countries are the ones affected by the disease. More generally, international flight travel could indicate the occurrence of specific historical events (related to health, politics, economics, etc.) through sudden (or more progressive) changes in the total number of passengers traveling between countries.
Since the best current public source of international travel information is the US Bureau of Transportation Statistics, this first version of the visualization only shows international flights that start or stop in the US. As we explore other sources of data (such as this or this), we will expand the visualization to include as many international routes as possible.
This visualization works as follows: after a disease (SARS, H5N1 or H1N1) is selected using the buttons in the upper right corner of the screen, a diagram is shown with a center circle surrounded by many other circles, each connected to it by a line. Each circle represents a country (the center one being the US), and the connecting line becomes shorter when more passengers travel between that country and the US. Clicking on the line or the circle selects the country, and the lower part of the screen then shows the time series of traveling passengers (US-to-country in the top bars, country-to-US in the bottom bars), on a monthly basis since January of 1990. These time series were aggregated from the “Air Carriers: T-100 International Segment” database from BTS.
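To make the aggregation and the line-length mapping more concrete, here is a minimal Python sketch. It is not the code used in FLUflux: the record field names (origin_country, dest_country, year, month, passengers), the pixel range for the line lengths and the linear mapping are all assumptions chosen for illustration.

```python
from collections import defaultdict


def aggregate_monthly(records, home="US"):
    """Sum passengers per (country, direction, year, month).

    Each record is a dict with hypothetical keys: origin_country,
    dest_country, year, month, passengers.
    """
    totals = defaultdict(int)
    for r in records:
        if r["origin_country"] == home and r["dest_country"] != home:
            key = (r["dest_country"], "outbound", r["year"], r["month"])
        elif r["dest_country"] == home and r["origin_country"] != home:
            key = (r["origin_country"], "inbound", r["year"], r["month"])
        else:
            continue  # skip segments that do not connect the US with another country
        totals[key] += r["passengers"]
    return dict(totals)


def line_length(passengers, max_passengers, min_len=40.0, max_len=300.0):
    """Map a passenger count to a line length: more passengers, shorter line.

    The linear mapping and the pixel range are assumptions.
    """
    if max_passengers <= 0:
        return max_len
    frac = min(max(passengers / max_passengers, 0.0), 1.0)
    return max_len - frac * (max_len - min_len)
```

With this mapping, a country with no traffic sits at the maximum distance from the US circle, and the busiest route is drawn with the shortest line.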
From a variety of online sources we obtained the number of monthly cases for each one of the diseases (SARS, H5N1 and H1N1). This data is displayed in the middle bar separating the incoming and outgoing passenger time series, using a color scale going from white (no cases) to red (maximum number of reported cases). This color scale is also used to paint the country circles. The size of the circles is scaled with the number of cases in the country.
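The color and size encoding can be sketched in a few lines of Python. Again, this is only an illustration under stated assumptions: the linear interpolation in RGB, the radius range, and the square-root scaling (which makes the circle area, rather than the radius, proportional to the number of cases) are choices made for this example, not necessarily what the visualization does.

```python
import math


def case_color(cases, max_cases):
    """Interpolate linearly from white (no cases) to red (max_cases)."""
    if max_cases <= 0:
        return (255, 255, 255)
    frac = min(max(cases / max_cases, 0.0), 1.0)
    fade = round(255 * (1.0 - frac))  # green and blue channels fade out
    return (255, fade, fade)


def circle_radius(cases, max_cases, min_r=4.0, max_r=30.0):
    """Scale the circle radius with the case count.

    Assumption: square-root scaling, so area grows linearly with cases.
    """
    if max_cases <= 0:
        return min_r
    frac = min(max(cases / max_cases, 0.0), 1.0)
    return min_r + math.sqrt(frac) * (max_r - min_r)
```

The same case_color function can serve both the middle bar of the time series (one color per month) and the country circles (one color per country).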
We also wanted to reflect the effects of global disease not only at the level of international travel, but also at the levels of media representation and of individual, more anecdotal responses. For this purpose we used Google News archive search and blog search to query the number of results for keywords like sars, h5n1 and h1n1 over time and for each affected country. These statistics are overlaid on the travel time series using green (news) and purple (blog) bars, which show spikes in the presence of disease-related keywords in news and blogs. The time evolution of these statistics is widely different between SARS and H5N1 (avian flu), as can be seen in these screenshots:
Note that in this version of the visualization, the blog and news data for H1N1 are not yet available. Newer versions will incorporate this data, as well as other improvements (interface, graphics, etc.).
During the first week of work at Media Lab we had the collaboration of Larissa Pschetz, who tested many different programming approaches to parse text data and incorporate it into the visualization. Rodrigo Santamaría, one of the technical assistants at Visualizar, implemented the code we used to extract H5N1 data from KML files. The process during the workshop was also documented in the forum and wiki pages.