One of my particular interests in monitoring European football leagues (link) is the identification of dominant, game-winning performances.
Liverpool demonstrated what I term Type A behaviour in their opening English Premier League performance against Norwich (link).
My data are:
Time of Goals
The scoring of two goals in 19 minutes meant, for me, that the probability of Liverpool not losing at that point was of the order of 0.992. In the preceding EPL season, only three teams in 380 games had given up a 2v0 lead and lost. When the third goal went in after 28 minutes, I had no example of a team trailing 0v3 and going on to win.
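The 0.992 figure can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes the three lost leads are taken as a share of all 380 games, which is the reading that matches the number quoted above. The variable names are mine.

```python
# Empirical estimate of "not losing" after going 2v0 up,
# using the counts quoted in the text.
games_in_season = 380        # EPL games in the preceding season (from the text)
lost_after_leading_2v0 = 3   # teams that led 2v0 and then lost (from the text)

# Assumption: the three losses are expressed as a share of all 380 games.
p_lose = lost_after_leading_2v0 / games_in_season
p_not_lose = 1 - p_lose

print(f"P(lose after leading 2v0) ~ {p_lose:.4f}")
print(f"P(not lose)               ~ {p_not_lose:.3f}")
```

A stricter estimate would divide by the number of games in which a team actually led 2v0. That count is not given here, so the sketch does not attempt it.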
I am delighted that the first game of the 2019-2020 season has given me a benchmark Type A example to consider. I do note that it was a Liverpool home game against a promoted team: exactly the fertile ground for Type A expression if a team is serious about its title credentials.
The first Ashes Test was completed at Edgbaston in Birmingham on 5 August (link). Australia won on the final day of the Test by 251 runs.
Australia scored 487 runs in their second innings. England required 398 runs to win the game on the final day.
Throughout this cricket summer in England, I have wondered if we can predict the outcome of games early in their play after a team has set a target in an innings.
In this Test match, I used Australia’s second innings total as a guide to what England needed to do to bat through the final day. I assumed that each England partnership needed to be worth 49 runs. I was mindful that England was unlikely to score 496 runs in the day, but I did have this linear relationship as a check:
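A linear check of this kind can be sketched in a few lines. The sketch assumes ten partnerships in a completed innings and treats the 49-run figure as a cumulative target at the fall of each wicket. The function names are hypothetical, and no scorecard data are included.

```python
# Linear partnership check: cumulative runs expected at the fall of each
# wicket if every partnership adds the same number of runs.
RUNS_PER_PARTNERSHIP = 49   # from the text
PARTNERSHIPS = 10           # assumption: ten partnerships in a completed innings


def linear_targets(runs_per_partnership=RUNS_PER_PARTNERSHIP,
                   partnerships=PARTNERSHIPS):
    """Cumulative target score at the fall of wickets 1..partnerships."""
    return [runs_per_partnership * w for w in range(1, partnerships + 1)]


def gap_to_targets(fall_of_wickets, targets):
    """Actual score minus linear target at each wicket (negative = behind)."""
    return [actual - target for actual, target in zip(fall_of_wickets, targets)]


targets = linear_targets()
for wicket, target in enumerate(targets, start=1):
    print(f"wicket {wicket:2d}: {target} runs")
```

Passing the fall-of-wicket scores from a scorecard to `gap_to_targets` shows, wicket by wicket, how far ahead of or behind the straight line an innings is running.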
The actual profile on Day 5 was:
These data left me thinking about training and competition and how both teams might prepare for the second Test at the Lord’s Cricket Ground in a week’s time.
More generally, the result encouraged me to think about the importance of winning the first game in a series or a tournament.
I read with great interest Amelia Barber’s post that combined her loves of women’s cycling and data science (link). Since reading her post and writing a brief reply (link), I have been investigating some Union Cycliste Internationale (UCI) result data.
I did find some women’s road race data available for download (link) and took the opportunity to download the results of the Liège-Bastogne-Liège race at the end of April 2019.
Like Amelia, I am interested in how I might use R with race data to make women’s road cycling more visible. For this race, a number of my visualisations report chronological age, which is one of the column headings in the data set.
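As a starting point for working with an age column, here is a minimal sketch of counting riders per age band from a CSV export. It is written in Python with the standard library only and is not the R code behind my visualisations. The column name `Age`, the five-year band width and the sample rows are all assumptions for illustration. The rows are made-up placeholders, not race results.

```python
import csv
import io
from collections import Counter


def age_band_counts(csv_text, age_column="Age", band_width=5):
    """Count riders per age band, skipping rows with no usable age."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        value = (row.get(age_column) or "").strip()
        if not value.isdigit():
            continue  # missing or non-numeric age
        lower = (int(value) // band_width) * band_width
        counts[f"{lower}-{lower + band_width - 1}"] += 1
    return dict(sorted(counts.items()))


# Illustrative rows only: made-up placeholders, not real results.
sample = "Rank,Rider,Age\n1,Rider A,24\n2,Rider B,31\n3,Rider C,27\n4,Rider D,\n"
print(age_band_counts(sample))
```

The band counts could then feed a simple bar chart. The real download may use a different column heading, in which case `age_column` would need to change.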
My first attempts include:
These are very basic visualisations of the data, but they are, for me, a start in a public conversation about how we share the available data. The UCI has five pages of women’s road race data going back to October 2018 (link). I sense that an important conversation to have about these data will be about web scraping and the possibilities it affords.
I look forward to sharing my analyses on GitHub (link).