Discussing data

A tilt-shift photograph of HTML code

Three posts popped up recently that explored our understanding of data.

In a recent post, Cassie Kozyrkov proposes “we need to learn to be irreverently pragmatic about data” (link).

She observes:

Take a moment to realize how glorious it is to have a universal system of writing that stores numbers better than our brains do. When we record data, we produce an unfaithful corruption of our richly perceived realities, but after that we can transfer uncorrupted copies of the result to other members of our species with perfect fidelity. Writing is amazing! Little bits of mind and memory that get to live outside our bodies.

Cassie notes that when we analyse data, we are accessing someone else’s memories. If we regard ourselves as data analysts, then we are engaged in the discipline of making data useful (and in doing so make decisions about analytics, statistics and machine learning). We can demystify data and talk simply about what we do, how we do it, and what we share.

After reading Cassie’s post, I followed up with Nick Barrowman’s (2018) Why Data Is Never Raw (link). He points out:

A curious fact about our data-obsessed era is that we’re often not entirely sure what we even mean by “data”: Elementary particles of knowledge? Digital records? Pure information? Sometimes when we refer to “the data,” we mean the results of an analysis or the evidence concerning a certain question. On other occasions we intend “data” to signify something like “reliable evidence” …

Like Cassie, Nick cautions against “the near-magical thinking about data”. He notes:

How data are construed, recorded, and collected is the result of human decisions — decisions about what exactly to measure, when and where to do so, and by what methods. Inevitably, what gets measured and recorded has an impact on the conclusions that are drawn.

He adds:

We tend to think of data as the raw material of evidence. Just as many substances, like sugar or oil, are transformed from a raw state to a processed state, data is subjected to a series of transformations before it can be put to use. Thus a distinction is sometimes made between “raw” data and processed data, with “raw data” often seen as a kind of ground truth

Nick argues that when people use the term raw data “they usually mean that for their purposes the data provides a starting point for drawing conclusions”. (Original emphasis) He adds:

the context of data — why it was collected, how it was collected, and how it was transformed — is always relevant. There is, then, no such thing as context-free data, and thus data cannot manifest the kind of perfect objectivity that is sometimes imagined

By coincidence, I was reading Will Koehrsen’s suggestions (link) for a non-technical reading list for data science that starts with this introduction:

we can never reduce the world to mere numbers and algorithms. When it comes down to it, decisions are made by humans, and being an effective data scientist means understanding both people and data

I thought all three posts were excellent nudges to enhance our reflexive practice. They reminded me also of EH Carr’s (1961) discussion of historical ‘facts’. He noted that far from being self-evident, historians give facts their significance and do so selectively. They are in effect “a selective system of cognitive orientations”.

Photo Credit

Photo by Markus Spiske on Unsplash

A Head of Football Analytics

Earlier this year, the canoe slalom program in Great Britain advertised for a performance data analyst.

I thought this marked a fascinating change in the sport. It underscored for me the opportunities now appearing in sport, which signal a fundamental shift in how learning journeys are being experienced in industry and in education systems.

This week, Leicester City are adding to this momentum with the advertisement of a Head of Football Analytics opportunity. I hoped the club would extend its expertise in an area they energised with their Tactical Insights Day in 2016.

The Role

  • Produce unique and insightful performance metrics and analysis, using data modelling.
  • Ensure that existing and new databases are maintained and updated promptly.
  • Collaborate with appropriate members of staff at the club, and develop strategies to raise the overall levels of data literacy, analysis and visualisation.
  • Develop the integrated, club-wide approach to providing data driven insights for performance evaluation, player recruitment, sports science and medical aspects of the club.
  • Pro-active in the organisation and implementation of data analysis-based CPD learning activities within the club.

The person Leicester is seeking:

  • Significant experience of working as part of a professional sports organisation, or other sports-related industries.
  • Experience of managing large datasets, and producing high-quality data insights and visualisations for end-users.
  • Experience from other areas may still be considered, based on the relevance to this role.
  • Masters or PhD in a numerate subject, this may include Statistics, Economics, Applied Mathematics, Engineering, Computer Science or related subjects.
  • Advanced coding ability (R, Python, XML/XSLT manipulations).
  • Demonstrable working knowledge of databases, SQL and database design.
  • Knowledge of using API’s to manage data sets, and experience using JSON scripts.
  • Familiarity with raw data files such as Opta F24 & TracAb.
  • Good time management & organisational skills, and ability to adhere to deadlines.
  • Excellent written and communication skills in English, with the ability to present results clearly (verbally & visually), and to develop close working relationships with existing staff members with varied levels of data analysis experience.
  • Demonstrable knowledge of football, and of how data analytics are currently being used to impact decision making processes in professional sport.

I noted Ted Knutson’s tweet about this opportunity.

Photo Credit

Leicester City ready for kick off (Ronnie Macdonald, CC BY 2.0)

Data intensive sport

Stephen Downes posted about the mutating metric machinery of higher education this morning.

His post contained links to Ben Williamson’s discussion of the mutating metric machinery and David Berry’s post on the data-intensive university.

Ben and David have insights to share with us as we deal with the use of data in sport contexts.

David starts his post with this observation about data-intensive society:

we now live within a horizon of interpretability determined in large part by the capture of data and its articulation in and through algorithms.

He defines data-intensive science as the fourth paradigm in scientific enquiry (the others are: experimental; theoretical; and computational). David suggests:

we are on the verge of a new challenge for the university under the conditions of a society that is based increasingly upon digital knowledge and its economic valorisation.

David’s conclusion led me to think about the transformation of sport and the digital skills required. He argued:

a data-intensive university supports efforts to ensure a new spirit of discovery and the promotion of research through the use of computational techniques and practices which will transform the culture of departments in a university.

Ben noted that contemporary culture is increasingly defined by metrics. He discusses the emergence of a narrative in higher education that it has “been made to resemble a market in which institutions, staff and students are all positioned competitively, with measurement techniques required to assess, compare and rank their various performances”.

Ben links his discussion to David Beer’s (2016) concept of metric power that “accounts for the long-growing intensification of measurement over the last two centuries to the current mobilization of digital or ‘big’ data across diverse domains of societies”.

Ben concludes “A form of mobile, networked fast policy is propelling metrics across the sector, and increasingly prompting changes in organizational and individual behaviours that will transform the higher education sector to see and act upon itself as a market”.

David and Ben’s observations and arguments resonate for me in the context of sport. As sport acquires more data in training and competition environments, it is a good time to reflect in a second-order way on data intensivity and behavioural change. David and Ben use their insights to investigate higher education, but as I read their posts I found myself substituting sport for higher education and thinking about performance and performativity.

Photo Credits

Photo by ev on Unsplash

Photo by Jovan on Unsplash

Postscript

This is the first time I have used Unsplash photographs in a post. The Unsplash website has this statement:

All photos published on Unsplash can be used for free. You can use them for commercial and noncommercial purposes. You do not need to ask permission from or provide credit to the photographer or Unsplash, although it is appreciated when possible.

Even though credit isn’t required, Unsplash photographers appreciate a credit as it provides exposure to their work and encourages them to continue sharing. A credit can be as simple as adding their name with a link to their profile or photo.