The Visual Analytic Turn

Seventeen years ago, Usama Fayyad, Gregory Piatetsky-Shapiro and Padhraic Smyth wrote:

Across a wide variety of fields, data are being collected and accumulated at a dramatic pace. There is an urgent need for a new generation of computational theories and tools to assist humans in extracting useful information (knowledge) from the rapidly growing volumes of digital data. These theories and tools are the subject of the emerging field of knowledge discovery in databases (KDD).

I revisited their article in the AI Magazine this week after a number of finds prompted me to think about the visual analytic turn in sport.

The first visualisation that grabbed my attention was an English Premier League fixture strength table prepared by Neil Kellie (shared with me by Julian Zipparo). Neil used Tableau Public for his visualisation.


Neil developed his table by combining a static star rating with a form rating to give a score for each fixture. The table becomes dynamic as the season progresses. It has prompted me to think about how we weight the previous year’s ranking in a model.
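A scheme like Neil’s can be sketched in a few lines. Everything here is invented for illustration — the weights, the rating scales and the form window are my assumptions, not his published formula:

```python
STAR_WEIGHT = 0.6   # assumed weight on the static pre-season rating
FORM_WEIGHT = 0.4   # assumed weight on recent form

def fixture_difficulty(star_rating, recent_results, form_window=6):
    """Score an opponent's difficulty on a 1 (easy) to 5 (hard) scale.

    star_rating    -- static pre-season rating, 1-5
    recent_results -- points taken per recent match (3, 1 or 0),
                      most recent last
    """
    window = recent_results[-form_window:]
    # Map average points-per-match (0..3) onto the same 1-5 scale;
    # with no results yet, fall back on the static rating alone.
    form_rating = 1 + (sum(window) / len(window)) / 3 * 4 if window else star_rating
    return STAR_WEIGHT * star_rating + FORM_WEIGHT * form_rating
```

Early in the season the short form window keeps each score close to the static rating; as results accumulate, the form component moves the table — which is what makes the visualisation dynamic.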

The Economist added its weight to the Fantasy Football discussions with its post on 16 August. The post uses topological data-analysis software provided by Ayasdi to visualise Opta data on the different attributes of players. In an experimental interactive chart:

the data is divided into overlapping groups. These groups contain clusters of data—in this case footballers with similar attributes—which are visualised as nodes. Because the groups overlap, footballers can appear in more than one node; when they do, a branch is drawn between the nodes. Some nodes have multiple connections, whereas others have few or none.
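The construction the quotation describes is essentially the “Mapper” method from topological data analysis. A minimal one-dimensional sketch of the graph-building step — real tools such as Ayasdi’s also cluster within each group and use richer projections, both omitted here — might look like this, with toy values standing in for player attributes:

```python
from itertools import combinations

def overlap_graph(values, n_intervals=4, overlap=0.25):
    """Build a Mapper-style graph from a 1-D projection.

    values -- dict mapping item name (e.g. a player) to a projected
              attribute value. Returns (nodes, edges): each node is a
              set of items, and an edge joins any two nodes that share
              an item.
    """
    lo, hi = min(values.values()), max(values.values())
    width = (hi - lo) / n_intervals
    step = width * (1 - overlap)          # overlapping cover intervals
    nodes = []
    start = lo
    while start < hi:
        members = {k for k, v in values.items() if start <= v <= start + width}
        if members:
            nodes.append(members)
        start += step
    # "A branch is drawn between the nodes" whenever groups overlap.
    edges = [(i, j) for i, j in combinations(range(len(nodes)), 2)
             if nodes[i] & nodes[j]]
    return nodes, edges

nodes, edges = overlap_graph({'a': 0, 'b': 1, 'c': 2, 'd': 3, 'e': 4})
```

Because the intervals overlap, an item can sit in more than one node, and those shared items are what create the branches — the same mechanism the chart uses to link footballers with similar attributes.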


There is a 2m 32s introduction to the Ayasdi Viewer on YouTube. Lum et al. (2013) exemplify their discussion of topology with an analysis of NBA roles. Their insights received considerable publicity earlier this year (“this topological network suggests a much finer stratification of players into thirteen positions rather than the traditional division into five positions”).

Back at Tableau Public, I found news of a Fanalytics seminar. One of the presenters at the workshop is Adam McCann. Adam’s most recent blog post is a comparison of radar and parallel coordinate charts. Adam led me to a keynote address by Noah Iliinsky: Four Pillars of Data Visualization (46m YouTube video). Noah works in IBM’s Center for Advanced Visualization.


This snowball sample underscores for me just how many remarkable people are in the visualisation space. I am interested to learn that a number of these people are using Tableau Public … to share sport data.

In other links this week, Satyam Mukherjee shared his visualisation of Batting Partnerships in the first Ashes Test 2013:


Simon Gleave’s 26 Predictions: English Premier League forecasting laid bare reminded me of the discussions following Nate Silver’s analysis of the 2012 Presidential Elections. I enjoyed Simon’s juxtaposition of 26 pre-season Premier League predictions, “13 which are at least partially model based, and 13 from the media. The models select Manchester City as title favourites but the journalists favour Chelsea”. Simon’s post introduced me to James Grayson and his reflection on predictions about performance. I think Simon and James have a very impressive approach to data.

This week’s links have left me thinking about an idea I had back in 2005. I wondered at that time if I could become skilful enough to combine the insights offered by Edward Tufte and Usama Fayyad. More recently, I have been wondering if I could do that with the virtuosity that pervades Snow Fall.

Visualising Data 130707


Yesterday, I wrote about Gregory Crewdson.

I am not sure whether the documentary had primed me to be on the lookout for visualisations of data, but I found three interesting examples this week.

Three Visualisations

Battle of Gettysburg

The first is the Smithsonian’s Cutting-Edge Second Look at the Battle of Gettysburg. In the post that shares the second look, Anne Kelly Knowles notes that “the technological limits of surveillance during the American Civil War dictated that commanders often decided where to deploy their troops based largely on what they could see”. She asks “What more might we learn about this famous battle if we put ourselves in commanders’ shoes, using today’s digital technology to visualize the battlefield and see what they could see?”

This approach reminded me of Philippe Mongin’s paper, A Game-Theoretic Analysis of the Waterloo Campaign and Some Comments on the Analytic Narrative Project.

Anne, researcher Dan Miller and cartographer Alex Tait worked together to provide this perspective.

Alex recreated the 1863 terrain based on a superb map of the battlefield from 1874 and present-day digital data. Dan and I captured troop positions from historical maps. Our interactive map shows Union and Confederate troop movements over the course of the battle, July 1 – 3, 1863. Panoramic views from strategic viewpoints show what commanders could – and could not – see at decisive moments, and what Union soldiers faced at the beginning of Pickett’s Charge. You will also find “viewshed” maps created with GIS (Geographic Information Systems). These maps show more fully what was hidden from view at those key moments.
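The viewshed idea behind those GIS maps can be illustrated with a toy line-of-sight calculation: a terrain cell is visible only if the sight line to it clears every intervening cell. This 1-D sketch over invented elevations is a simplification of the 2-D raster analysis a GIS actually performs:

```python
def viewshed(elevations, observer_index, observer_height=2.0):
    """Return booleans marking which cells to the right of the
    observer are visible from the observer's eye level."""
    eye = elevations[observer_index] + observer_height
    visible = []
    max_slope = float("-inf")   # steepest sight line seen so far
    for i in range(observer_index + 1, len(elevations)):
        slope = (elevations[i] - eye) / (i - observer_index)
        # A cell is visible only if its sight line clears every
        # intervening crest, i.e. is at least as steep as the max.
        visible.append(slope >= max_slope)
        max_slope = max(max_slope, slope)
    return visible

# Invented profile: the ridge at index 3 hides the lower ground behind it.
profile = [10, 9, 9, 14, 8, 7, 12]
result = viewshed(profile, 0)
```

The ridge creates exactly the kind of hidden ground the Smithsonian maps reveal: everything beyond it whose sight line falls below the crest is marked invisible.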

Tour de France 2013

The second visualisation is of the 2013 Tour de France by Cycling the Alps.


I was interested to see the gamification option in this visualisation. The 3D Stage tour and the Stage game both require the Google Earth plugin to run.

AAR 214

The third visualisation took me to a new world of data. Earlier today, Charlie White posted data about the Asiana Airlines Flight 214 (AAR 214) crash landing in San Francisco. He noted Steve Baker’s use of the flight tracking website FlightAware to compare two approaches to the airport by the same flight number.


The Mashable post has prompted a robust exchange about the validity of these two data juxtapositions.

One comment noted: “The difference between Fri and Sat was wind direction. Today, the wind was extremely shifty, switching almost 180 deg, from N to S. The plane was landing with a sort of tail wind, which every pilot knows is not good. The SFO landing strip is at 135 deg, and the wind was from 210.”
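The commenter’s geometry can be checked with basic trigonometry: relative to a landing heading, the wind’s along-runway component is speed × cos(angle between wind direction and heading), with negative values meaning a tailwind. The headings below are the commenter’s; the 10-knot wind speed is my invention for illustration:

```python
import math

def wind_components(heading_deg, wind_from_deg, wind_speed):
    """Return (headwind, crosswind) in the same units as wind_speed.

    Positive headwind opposes the aircraft; negative is a tailwind.
    Positive crosswind blows from the aircraft's right.
    """
    angle = math.radians(wind_from_deg - heading_deg)
    return wind_speed * math.cos(angle), wind_speed * math.sin(angle)

# Landing heading 135 deg, wind from 210 deg, assumed 10 kt:
head, cross = wind_components(135, 210, 10)
# For the reciprocal heading (315 deg) the along-runway component
# flips sign and becomes a tailwind -- so which end of the strip is
# in use decides whether this wind helps or hinders.
tail, _ = wind_components(315, 210, 10)
```

With a 75-degree offset the wind is mostly crosswind either way, which is consistent with the “extremely shifty” conditions the commenter describes.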

There are other data available from FlightAware.

New Literacies?

Brief Encounters and these three visualisations raise some important issues about the literacies we need to use data rich visualisations. I am fascinated by the skills that deliver these visualisations.

One visualisation that has had an enormous impact on my thinking is the New York Times’ Snow Fall: The Avalanche at Tunnel Creek. I learned with interest that its author, John Branch, had won a Pulitzer Prize for Feature Writing for the story.

In its nomination submission, The New York Times wrote:

Rarely, we suspect, has there ever been a more fully realized partnership of fine writing and state of the art multimedia put before the features jury, and we encourage you to experience the story the way our online readers did: by clicking on the link submitted here.

I wrote about Snow Fall at the time of its publication. It was a transformational moment for me as I thought about how to re-present data. These three recent examples shared here add to this changing environment.

Photo Credits

Frame grab from Brief Encounter trailer

Frame grab from Smithsonian post Cutting-Edge Second Look at the Battle of Gettysburg

Frame grab from Mashable post FlightAware Shows Path of Crashing Plane original image from Steve Baker.

After Snow Fall

The title of this post reads like the first line of a Robert Frost poem.

However, it is a link to my thinking after experiencing John Branch’s New York Times story, Snow Fall: The Avalanche at Tunnel Creek.

A post by my son, Sam, moved me on. Sam used his experience to start transforming his About Me page. In it he has explored some of the tools that animated the Snow Fall story.

In his post, Sam observes:

Unsurprisingly, pretty early on I realised that engaging the user in the way the Times article engaged me, was pretty tricky. I persevered and through the use of Google Earth, HTML5 charts and a JQuery plugin called “Parralax” I was able to hopefully bring the page if not to life, then at least to semi-consciousness by amongst other things, making my photos move around, video that fades in and automatically plays when you scroll to a certain point.

In addition to the three tools listed, Sam used:

(I was particularly interested in the fourth item on this list and eventually it led me to David Walsh … and a one-hour diversion reading about a remarkable developer.)

A link to today’s Cowbird story by Gemma Weiner brought me back from the world of code to narrative structures. I thought it was a delightful, expressive story … which encouraged me to think even more about the issues Sam discussed and David has explored in his work.

A link from #etmooc sent me off to a December 2012 post by Rachel McAthy. She lists fourteen visual storytelling tools, including Timetoast; Dipity; Google Fusion Tables; Tableau; Datawrapper; Meograph; Storify; Storination; and Popcorn. I am familiar with Storify and Cowbird, but this leaves me with twelve new learning experiences.

Other links from #etmooc (@robinwb) introduced me to seven collaborative websites for storytelling, free digital storytelling tools for nonprofits, myBrainshark, ShowMe and reintroduced me to Voicethread.

Just viewing these options was a powerful experience. Before I launch off to try all these I think I will seek out some good examples of use.

I am going to track #etmooc with great interest. Day 1 has taken me on from Snow Fall to a community that will introduce me to remarkable creativity.

Photo Credit

Herbert George Ponting and cinematograph (National Library NZ, no known copyright restrictions)