The Analytic Turn

The volume and quality of sport analytics writing fills me with awe.

Each day I try to aggregate, curate and share examples from this analytic turn. My activity is partial and selective. I am mindful of a whole community of practice doing similar work. I see many parallels between this activity and walking along striding edges.

My discoveries encourage me to continue with my learning journey and give me that Leonard Cohen feeling expressed in his Preface to the Chinese edition of Beautiful Losers:

So you can understand, Dear Reader, how privileged I feel to be able to graze, even for a moment, and with such meager credentials, on the outskirts of your tradition.

Two papers today reinforced my feeling of grazing on the outskirts.

Mark Taylor wrote about the intelligent use of numbers in football analysis. In his introduction, Mark notes “The use of data in analysing football allows us to take a more nuanced view of both the events that occur during a match and also to evaluate a side’s ongoing performance to make predictions about their future results”.

Mark’s nuanced view included:

  • Awareness of over-fitting non-sustainable events and a flawed projection of a team’s underlying quality.
  • How we might value goal-scoring and develop dynamic probabilities of win outcomes.
  • The use of Poisson calculations.
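
To make the Poisson idea concrete, here is a minimal sketch in Python. It assumes each side's goals follow independent Poisson distributions, the simplest model of this kind; the scoring rates (1.5 and 1.1 goals per match) and the function names are hypothetical and are not taken from Mark's post.

```python
from math import exp, factorial


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k goals when the expected number is lam."""
    return exp(-lam) * lam ** k / factorial(k)


def match_outcome_probs(home_rate: float, away_rate: float, max_goals: int = 10):
    """Sum scoreline probabilities into home win, draw and away win.

    Assumes the two goal counts are independent, and truncates the
    scoreline grid at max_goals per side (the tail beyond that is tiny
    for typical football scoring rates).
    """
    home_win = draw = away_win = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_rate) * poisson_pmf(a, away_rate)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
    return home_win, draw, away_win


# Hypothetical expected goals for the two sides.
home_win, draw, away_win = match_outcome_probs(1.5, 1.1)
print(f"home {home_win:.3f}, draw {draw:.3f}, away {away_win:.3f}")
```

Recomputing these probabilities as a match unfolds, with the rates scaled to the time remaining and the current score added in, is one way to arrive at the dynamic win probabilities mentioned in the list.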

What struck me about Mark’s observations was the layers of expertise he used to share his story with an imagined audience. He prompted me to return to some of the early discussions about Poisson distributions (Moroney, 1951; Reep, Pollard and Benjamin, 1971; Maher, 1982).

My second paper comes from a friend who asked me to comment on a draft paper he is writing. The paper uses secondary data to explore winning performances in rugby union. What struck me about this paper was the learning journey my friend has made from full-time athlete to service provider as an analyst.

My friend’s paper included:

  • A classification model developed with the R package ‘randomForest’.
  • The use of the R package ‘rfPermute’ to estimate the significance of importance metrics for a Random Forest model by permuting the response variable.
  • Visualisation of partial dependency plots with the R package ‘pdp’.
  • Z scores for McNemar’s test.
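
The Z score for McNemar's test can be sketched in a few lines. This is an illustration in Python rather than R, and not my friend's code: it assumes two classifiers have been scored on the same set of matches, uses the form without a continuity correction, and the discordant counts (18 and 7) are hypothetical.

```python
from math import erfc, sqrt


def mcnemar_z(b: int, c: int) -> float:
    """Z score for McNemar's test, without continuity correction.

    b: cases the first model got right and the second got wrong
    c: cases the second model got right and the first got wrong
    Cases where both models agree do not enter the statistic.
    """
    if b + c == 0:
        raise ValueError("no discordant pairs, so the test is undefined")
    return (b - c) / sqrt(b + c)


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard normal z score."""
    return erfc(abs(z) / sqrt(2))


# Hypothetical discordant counts from comparing two models.
b, c = 18, 7
z = mcnemar_z(b, c)  # (18 - 7) / sqrt(25) = 2.2
p = two_sided_p(z)
print(f"z = {z:.2f}, p = {p:.4f}")
```

Squaring the Z score gives the more familiar chi-squared form of the statistic, with one degree of freedom.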

I spent a good part of the day exploring the data he shared. I was fascinated by the ease with which he uses R packages. He took me a long way from the comfort of my qualitative, ethnographic approach to performance observation and analysis.

His paper prompted me to ask him about the reproducibility of his work. I thought he might like another R opportunity … Przemysław Biecek and Marcin Kosiński’s (2017) archivist package designed to improve the management of results of data analysis.

The package enables:

  • management of local and remote repositories which contain R objects and their meta-data
  • archiving R objects to repositories
  • sharing and retrieving objects (and their pedigree) by their unique hooks
  • searching for objects with specific properties or relations to other objects
  • verification of object’s identity and context of its creation.
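
The idea behind these features can be illustrated with a small content-addressed store. The sketch below is a Python analogy only and is not the archivist API: the class `LocalRepo`, its methods and the example result are all hypothetical. It shows how a hash of an object can act as its unique hook for retrieval, searching and verification.

```python
import hashlib
import pickle


class LocalRepo:
    """A toy in-memory repository keyed by a hash of each object."""

    def __init__(self):
        self._objects = {}  # hook -> serialised object
        self._tags = {}     # hook -> set of descriptive tags

    def save(self, obj, tags=()):
        """Archive an object and return its hook (a SHA-256 hex digest)."""
        payload = pickle.dumps(obj)
        hook = hashlib.sha256(payload).hexdigest()
        self._objects[hook] = payload
        self._tags.setdefault(hook, set()).update(tags)
        return hook

    def load(self, hook):
        """Retrieve an archived object by its hook."""
        return pickle.loads(self._objects[hook])

    def search(self, tag):
        """Return the hooks of all objects carrying the given tag."""
        return sorted(h for h, t in self._tags.items() if tag in t)

    def verify(self, hook):
        """Check that the stored bytes still hash to the hook."""
        return hashlib.sha256(self._objects[hook]).hexdigest() == hook


# Hypothetical analysis result to archive and share by its hook.
repo = LocalRepo()
result = {"model": "random forest", "sport": "rugby union", "n_trees": 500}
hook = repo.save(result, tags=["rugby", "model"])
restored = repo.load(hook)
print(hook[:12], restored == result, repo.verify(hook))
```

Because the hook is derived from the content, anyone who receives it can confirm that the object they retrieve is the one that was archived.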

My friend intends to produce a number of papers about winning performances. Given his expertise in R, the archivist package would seem to be a great way to share his work openly.

Grazing at the margins of Mark’s and my friend’s work gave me great delight. Both reminded me of the kind of learning required to be a polymath at this point in the analytic turn … and the navigation of paths along striding edges.

Photo Credits

Striding edge (The Yes Man, CC BY 2.0)

Mo Fa’asavalu (Charlie, CC BY 2.0)

Reading, Life Paths and Balance

Two UK posts caught my eye over the weekend.

The Telegraph discussed the impact of reading on career path.

Jemima Kiss discussed digital habits in The Observer.


The Telegraph post looked at data from the 1970 British Cohort Study reported by Mark Taylor at the BSA’s 60th Conference. These data suggest that:

Of all the free-time activities teenagers do, such as playing computer games, cooking, playing sports, going to the cinema or theatre, visiting a museum, hanging out with their girlfriend or boyfriend, reading is the only activity that appears to help them secure a good job.

 At the age of 16, in 1986, they were asked which activities they did in their spare time for pleasure. These answers were then checked against the jobs they were doing at the age of 33, in 2003.

… there was a 39 per cent probability that girls would be in professional or managerial posts at 33 if they had read books at 16, but only a 25 per cent chance if they had not. For boys the figures rose from 48 per cent to 58 per cent if they read books.

This is the abstract of the paper presented at the Conference:

Digital Habits

My wife Sue shared Jemima Kiss’s post with me. Sue has a remarkable list of RSS feeds that she scans every morning and points me to articles I would not find alone.

Jemima wrote about How I kicked my digital habit. Her introduction is:

We were brushing through wet grass in the early morning when we saw it – a flash of white drifting behind a small patch of trees, backlit by the sun. Crouching down next to my small son, we watched the unmistakable shape of a barn owl until he disappeared into the wood. The look on my son’s face was part of a brief moment of magic, the kind of memory that we live for.

Ordinarily, my next thought would have been to pull out my phone and take a photo, send a tweet or record a video. Connecting is something I do unconsciously now. Tweeting is like breathing and photos and video have documented nearly every day of my 21-month-old son’s life. The meaningful merged with the mundane, all dutifully and habitually recorded – my enjoyment split between that technological impulse and the more delicate human need to be in the moment. This is how we live.

That weekend, however, our whole family – my partner, my son and I – were offline.

Jemima follows up on some ideas shared by William Powers in Hamlet’s Blackberry and discusses how we might exert “a little discipline to restore control over our unsettling, hyper-connected lives.” (For six other books on the Future of the Internet see this post by Maria Popova.)


The juxtaposition of the data from the 1970 Cohort Study and Jemima’s 2011 journey raises fundamental questions about long-term development. My own reading spurt occurred from the age of 17 to 22 in a delightful pre-Internet age. I read voraciously now too but do most of that reading online.

Although I have a very large number of social media accounts, I am using them less and less. I am a peripheral participant in many of the online exchanges. I am not sure whether it is country life that prompts this peripheral activity. I am certain that my wife’s support for a balanced life is the tipping factor.

A note about the 1970 British Cohort Study

From the Centre for Longitudinal Studies’ website:

BCS70 began when data were collected about the births and families of just under 17,200 babies born in England, Scotland, Wales and Northern Ireland in a particular week in April, 1970.  At this time, the study was named the British Births Survey (BBS), and it was sponsored by the National Birthday Trust Fund in association with the Royal College of Obstetricians and Gynaecologists.  (Subjects from Northern Ireland, who had been included in the birth survey, were dropped from the study in all subsequent sweeps).

Since 1970 there have been six attempts to gather information from the whole cohort, as the chart below shows. With each successive attempt, the scope of enquiry has broadened from a strictly medical focus at birth, to encompass physical and educational development at the age of five, physical, educational and social development at the ages of ten and sixteen, and then to include economic development and other wider factors at 26, 29 and 34 years.

Photo Credits

Sam reading in Badlands