A Head of Football Analytics

Earlier this year, the canoe slalom program in Great Britain advertised for a performance data analyst.

I thought this marked a fascinating change in the sport. It underscored for me the opportunities now appearing in sport that signal a fundamental shift in how learning journeys are being experienced in industry and in education systems.

This week, Leicester City are adding to this momentum with the advertisement of a Head of Football Analytics opportunity. I hope the club will extend its expertise in an area it energised with its Tactical Insights Day in 2016.

The Role

  • Produce unique and insightful performance metrics and analysis, using data modelling.
  • Ensure that existing and new databases are maintained and updated promptly.
  • Collaborate with appropriate members of staff at the club, and develop strategies to raise the overall levels of data literacy, analysis and visualisation.
  • Develop the integrated, club-wide approach to providing data driven insights for performance evaluation, player recruitment, sports science and medical aspects of the club.
  • Pro-active in the organisation and implementation of data analysis-based CPD learning activities within the club.

The person Leicester is seeking:

  • Significant experience of working as part of a professional sports organisation, or other sports-related industries.
  • Experience of managing large datasets, and producing high-quality data insights and visualisations for end-users.
  • Experience from other areas may still be considered, based on the relevance to this role.
  • Masters or PhD in a numerate subject; this may include Statistics, Economics, Applied Mathematics, Engineering, Computer Science or related subjects.
  • Advanced coding ability (R, Python, XML/XSLT manipulations).
  • Demonstrable working knowledge of databases, SQL and database design.
  • Knowledge of using APIs to manage data sets, and experience using JSON scripts.
  • Familiarity with raw data files such as Opta F24 & TRACAB.
  • Good time management & organisational skills, and ability to adhere to deadlines.
  • Excellent written and communication skills in English, with the ability to present results clearly (verbally & visually), and to develop close working relationships with existing staff members with varied levels of data analysis experience.
  • Demonstrable knowledge of football, and of how data analytics are currently being used to impact decision making processes in professional sport.

I noted Ted Knutson’s tweet about this opportunity.

Photo Credit

Leicester City ready for kick off (Ronnie Macdonald, CC BY 2.0)

Data intensive sport

Stephen Downes posted about the mutating metric machinery of higher education this morning.

His post contained links to Ben Williamson’s discussion of the mutating metric machinery and David Berry’s post on the data-intensive university.

Ben and David have insights to share with us as we deal with the use of data in sport contexts.

David starts his post with this observation about data-intensive society:

we now live within a horizon of interpretability determined in large part by the capture of data and its articulation in and through algorithms.

He defines data-intensive science as the fourth paradigm in scientific enquiry (the others are: experimental; theoretical; and computational). David suggests:

we are on the verge of a new challenge for the university under the conditions of a society that is based increasingly upon digital knowledge and its economic valorisation.

David’s conclusion led me to think about the transformation of sport and the digital skills required. He argued:

a data-intensive university supports efforts to ensure a new spirit of discovery and the promotion of research through the use of computational techniques and practices which will transform the culture of departments in a university.

Ben noted that contemporary culture is increasingly defined by metrics. He discusses the emergence of a narrative in higher education that it has “been made to resemble a market in which institutions, staff and students are all positioned competitively, with measurement techniques required to assess, compare and rank their various performances”.

Ben links his discussion to David Beer’s (2016) concept of metric power that “accounts for the long-growing intensification of measurement over the last two centuries to the current mobilization of digital or ‘big’ data across diverse domains of societies”.

Ben concludes: “A form of mobile, networked fast policy is propelling metrics across the sector, and increasingly prompting changes in organizational and individual behaviours that will transform the higher education sector to see and act upon itself as a market”.

David and Ben’s observations and arguments have a resonance for me in the context of sport. As sport acquires more data in training and competition environments, it is a good time to reflect in a second order way on data intensivity and behavioural change. David and Ben use their insights to investigate higher education but my reading of their posts had me interchanging sport with their higher education contexts and thinking about performance and performativity.

Photo Credits

Photo by ev on Unsplash

Photo by Jovan on Unsplash

Postscript

This is the first time I have used Unsplash photographs in a post. The Unsplash website has this statement:

All photos published on Unsplash can be used for free. You can use them for commercial and noncommercial purposes. You do not need to ask permission from or provide credit to the photographer or Unsplash, although it is appreciated when possible.

Even though credit isn’t required, Unsplash photographers appreciate a credit as it provides exposure to their work and encourages them to continue sharing. A credit can be as simple as adding their name with a link to their profile or photo.

Donald Knuth, basketball and computers in sport

Introduction

Donald Knuth received a scholarship in 1956 to attend the Case Institute of Technology in Cleveland, Ohio. During his time at the Institute, Donald was manager of the basketball team. His interests in observation, notation and computer programming were recorded in an IBM documentary.

Synergy

At Case, Donald combined his interest in computers with his involvement in basketball. He had managed his High School basketball team and took on this role at Case during his undergraduate years (1956-1960).

The Computer History Museum’s biography of Donald includes this paragraph:

Knuth’s lifelong love affair with computers began as an undergraduate when he discovered the IBM 650 computer system at Case. He quickly mastered the inner workings of the machine and developed a novel program to automate coaching of the school’s basketball team, earning him an appearance on the CBS Evening News with Walter Cronkite.

This novel program is reported in a variety of contexts … all of which I missed in my research into computers in sport.

Donald’s Basketball System

The Internet Archive has a copy of Donald’s chapter 23 in Selected Papers on Fun and Games. In it he wrote:

In high school I’d come up with a general rule of thumb that said, “Possession of the ball is worth roughly one point, except near the end of a period.” In other words, if you enter the stadium at a time when your team is leading by a score of 50-49, the effective score is really 51-49 if your team has the ball, but the game is basically tied if the other guys have possession. A corollary of this rule is that field goals don’t really change the effective score! One team gains 2 points, but loses possession, while the opponents gain possession. The score really changes when there’s a turnover, or when a free throw is made, or at the very end.

Of course I knew that this rule of thumb was only a rough approximation; maybe possession was worth only .8 of a point, say. Even so, the person who steals the ball should be rewarded more than the person who makes baskets, contrary to the normal way that players get credit for their contributions.
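Donald's rule of thumb can be tried out in a few lines of Python. This is only my sketch of the arithmetic in the passage above, not his program: the function name `effective_margin` and the `possession_value` parameter are my own labels.

```python
def effective_margin(own_points, opp_points, own_has_ball, possession_value=1.0):
    """Lead in points after crediting whoever holds the ball.

    possession_value is the assumed worth of possession: 1.0 in the rule
    of thumb, perhaps 0.8 in the more cautious version.
    """
    bonus = possession_value if own_has_ball else -possession_value
    return own_points - opp_points + bonus


# Leading 50-49 with the ball: effectively 51-49, a two-point lead.
print(effective_margin(50, 49, own_has_ball=True))

# Leading 50-49 without the ball: effectively tied.
print(effective_margin(50, 49, own_has_ball=False))

# The corollary: a field goal adds 2 points but hands over possession,
# so with possession worth exactly one point the margin does not move.
before = effective_margin(50, 49, own_has_ball=True)
after = effective_margin(52, 49, own_has_ball=False)
print(before == after)
```

With `possession_value=0.8` the corollary holds only approximately: a field goal then shifts the effective margin by 0.4 of a point.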

At Case, “I finally had an opportunity to test these hunches in a quantitative way, and the computer program I wrote was based on those informal notions about possession. To everyone’s surprise, including my own, the system turned out to be quite successful.”

Chapter 23 provides a detailed account of Donald’s observation system. The records were kept in an “electronic computer”.

The statistics at each basketball game can be taken by two men: a recorder and a spotter. After the game it takes approximately 30 minutes to prepare the necessary totals from the game sheets and about three minutes to punch the IBM cards. One IBM card is made for each Case player who participated in the game, plus a card each for Case and the opponents. Then the machine takes 1.5 minutes to process the game: 30 seconds to take in the “program” of instructions for calculation, 30 seconds to take in the statistical data from all the previous games, and 30 seconds to take in the statistics from this game and to punch the answers. The computer punches four cards for each player, two indicating his performance in this particular game and two containing his cumulative record to date. The cards can easily be printed up for reference and can be filed neatly. Any desired set of statistics can quickly be found from them by passing them through a sorter.

Donald’s real-time notation recorded:

    • Field goals attempted and made (divided into short, medium, and long range).
    • Total free throws attempted and made.
    • Last free throws of a set, made and missed.
    • Total fouls and offensive fouls.
    • Rebounds, defensive and offensive.
    • Violations of rules causing loss of ball.
    • Assists.
    • Loss of ball by fumble, bad pass, or jump ball.
    • Gain of ball by interception or jump ball.
    • Defensive mistakes — allowing opponent to score a field goal.
    • Minutes played.

Donald recorded these data in real time on pre-printed observation sheets. He collected all eleven elements of these data for each Case player and the first six elements for the opposing team.

His program gave each Case player a personal score. Donald noted:

The computer calculates the “true point contribution” of a player by using a rather complicated formula. Using the abbreviations FGA (field goals attempted), FGM (field goals made), FTA (free throws attempted), FTM (free throws made), LFTI (last free throws made), LFTO (last free throws missed), TF (total fouls committed), OF (offensive fouls), OR (offensive rebounds), DR (defensive rebounds), VIOL (violations), AST (assists), FUM (fumbles), BP (bad passes), JL (jump balls lost), JG (jump balls gained), INT (interceptions), DM (defensive mistakes), the player’s “point contribution” rating is:

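$$
\begin{aligned}
\mathrm{PC} ={}& 2\,\mathrm{FGM} + \mathrm{FTM} + 2(\mathrm{AST} - \mathrm{DM}) \\
&- \alpha(\mathrm{VIOL} + \mathrm{FUM} + \mathrm{JL} + \mathrm{BP} + \mathrm{AST} + \mathrm{FGM} + \mathrm{LFTI}) \\
&+ \beta(\mathrm{INT} + \mathrm{JG} + \mathrm{OR} + \mathrm{DR} + \mathrm{DM} - \mathrm{OF}) \\
&- \gamma(\mathrm{TF}) - \delta(\mathrm{FGA} - \mathrm{FGM} + \mathrm{LFTO}),
\end{aligned}
$$

where α, β, γ, and δ are weighting coefficients determined by team totals. In the formulas for these coefficients, small letters indicate opponents’ totals and capital letters denote Case totals:

$$
\begin{aligned}
\alpha &= \frac{2\,\mathrm{fgm}}{\mathrm{fga} + \mathrm{viol} + \mathrm{of} - \mathrm{or} + \mathrm{INT} + \mathrm{JG} + \mathrm{TF} - \mathrm{OF}}; \\
\beta &= \frac{2\,\mathrm{FGM}}{\mathrm{FGA} + \mathrm{VIOL} + \mathrm{OF} - \mathrm{OR} + \mathrm{FUM} + \mathrm{JL} + \mathrm{BP} + \mathrm{tf} - \mathrm{of}}; \\
\gamma &= \frac{\mathrm{ftm} - \beta\left(\mathrm{lfti} + \mathrm{lfto} \times \dfrac{\mathrm{DR}}{\mathrm{or} + \mathrm{DR}}\right)}{\mathrm{TF}}; \\
\delta &= \alpha \times \frac{\mathrm{dr}}{\mathrm{OR} + \mathrm{dr}}.
\end{aligned}
$$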
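Donald's original IBM 650 program is lost, so what follows is only my reading of the quoted formula as a short Python sketch. I have written the four weighting coefficients as `alpha`, `beta`, `gamma` and `delta`. The `Totals` class, the function names and all of the numbers in the example are my own inventions for illustration. FTA appears in the list of abbreviations but not in the formula, so it is left out.

```python
from dataclasses import dataclass


@dataclass
class Totals:
    """Counts for one player or a whole team, named after the abbreviations above."""
    FGA: int = 0   # field goals attempted
    FGM: int = 0   # field goals made
    FTM: int = 0   # free throws made
    LFTI: int = 0  # last free throws made
    LFTO: int = 0  # last free throws missed
    TF: int = 0    # total fouls
    OF: int = 0    # offensive fouls
    OR: int = 0    # offensive rebounds
    DR: int = 0    # defensive rebounds
    VIOL: int = 0  # violations
    AST: int = 0   # assists
    FUM: int = 0   # fumbles
    BP: int = 0    # bad passes
    JL: int = 0    # jump balls lost
    JG: int = 0    # jump balls gained
    INT: int = 0   # interceptions
    DM: int = 0    # defensive mistakes


def coefficients(own, opp):
    """Weighting coefficients from team totals (own = Case, opp = opponents).

    No guard against zero denominators: this is a sketch, not a robust tool.
    """
    alpha = 2 * opp.FGM / (opp.FGA + opp.VIOL + opp.OF - opp.OR
                           + own.INT + own.JG + own.TF - own.OF)
    beta = 2 * own.FGM / (own.FGA + own.VIOL + own.OF - own.OR
                          + own.FUM + own.JL + own.BP + opp.TF - opp.OF)
    gamma = (opp.FTM - beta * (opp.LFTI + opp.LFTO * own.DR / (opp.OR + own.DR))) / own.TF
    delta = alpha * opp.DR / (own.OR + opp.DR)
    return alpha, beta, gamma, delta


def point_contribution(p, alpha, beta, gamma, delta):
    """A player's "point contribution" rating, term by term as quoted."""
    return (2 * p.FGM + p.FTM + 2 * (p.AST - p.DM)
            - alpha * (p.VIOL + p.FUM + p.JL + p.BP + p.AST + p.FGM + p.LFTI)
            + beta * (p.INT + p.JG + p.OR + p.DR + p.DM - p.OF)
            - gamma * p.TF
            - delta * (p.FGA - p.FGM + p.LFTO))


# Invented totals, chosen only so the arithmetic is easy to follow by hand.
case_team = Totals(FGA=20, FGM=10, TF=10, DR=10, INT=10)
opp_team = Totals(FGA=20, FGM=10, FTM=10, LFTI=4, LFTO=4, TF=20, DR=10)
player = Totals(FGA=4, FGM=2, FTM=1, TF=1, DR=2, AST=1, INT=2)

alpha, beta, gamma, delta = coefficients(case_team, opp_team)
pc = point_contribution(player, alpha, beta, gamma, delta)
print(alpha, beta, gamma, delta, pc)
```

The opponents' record only needs the first six elements of the notation, which is all that the coefficient formulas draw on for the lower-case totals.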

The Archive record includes a copy of Donald’s mimeographed form.

In an afterthought, Donald said of his program:

Alas, I have been unable to find any copies of the original program, nor do I recall who carried on with it after I left for graduate school.

My formula for PC should not be taken too seriously. I kept fiddling with it, and never really believed that it was rigorously correct. This work was done long before I had ever heard of Markov processes.

I communicated details of this work to some people at Marquette University in the early 1960s. But it has almost surely had no influence on subsequent applications of computers to sports, except perhaps to stimulate others to do better.

By 1995, professional basketball teams were using computers routinely … (My emphasis)

Analysing Performance

Donald started his basketball analysis at Case fourteen years after Lloyd Messersmith’s dissertation was submitted for examination. In the intervening years there had been significant breakthroughs in computational methods.

A decade after Donald’s experiences at Case, another computer scientist, Anatolij Zelentsov, was exploring the use of computers in the analysis of association football. This was another iteration in the computerised notation and analysis of performance, one that provided a ‘functional readiness’ measure for each player coached by Valerij Lobanovs’kyj at Dynamo Kiev.

We are fortunate that we have Donald’s account of his work. You might find Chapter 11 of Donald’s interview on the Web of Stories of particular interest. In it he describes in detail his work at Case.

I found it fascinating to listen to his reflections on his early days in computing after a lifetime of engagement in computer science as the author of The Art of Computer Programming.

I am profoundly sorry it has taken me so long to write about his work. I hope it adds to our knowledge of our origins.