Scoring First and Losing: European football leagues 2018-2019

Early exposure to Charles Reep’s work led to my four-decade fascination with the observation and analysis of goal-scoring in association football.

I have a particular interest in those games in which a team scores first and loses.

So far, in five of Europe’s major leagues (Bundesliga yet to start) there have been seven games of this type:

Time in minutes to equalise (TE), team that scores first (TSF), team that wins (TW).

I have explored these games with some R packages in RStudio. The packages are tidyverse (which includes ggplot2) and ggrepel.

My basic visualisations:

The Games

The scores in these games

The teams that overcame conceding the first goal:

The basic code I used to create these visualisations:

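library(tidyverse)  # loads ggplot2 and the %>% pipe
library(ggrepel)    # provides geom_label_repel()

# Load the score-first-and-lose games
df <- read.csv("SFL.csv")

# Plot time to equalise by league, labelling each point with its game
df %>%
  ggplot(aes(x = League, y = TE, label = Game)) +
  geom_point(colour = "firebrick1") +
  geom_label_repel(size = 3, colour = "black", fontface = "bold") +
  ggtitle("Games in which team scores first and loses") +
  labs(subtitle = "European Leagues 2018-2019") +
  xlab("League") + ylab("Time Taken to Equalise (Minutes)")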
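
A numerical summary can sit alongside the plot. The sketch below uses base R on a small stand-in data frame with the columns the plotting code expects (League, Game, TE). The rows are placeholder values for illustration only, not the seven games recorded in SFL.csv.

```r
# Placeholder rows only: these are NOT the games recorded in SFL.csv.
sfl <- data.frame(
  League = c("League A", "League A", "League B"),
  Game   = c("Team 1 v Team 2", "Team 3 v Team 4", "Team 5 v Team 6"),
  TE     = c(12, 30, 45)
)

# Median time to equalise (minutes) and number of games, per league
te_median <- tapply(sfl$TE, sfl$League, median)
games     <- table(sfl$League)

te_summary <- data.frame(
  League    = names(te_median),
  median_TE = as.vector(te_median),
  games     = as.vector(games)
)
te_summary
```

Swapping the placeholder data frame for read.csv("SFL.csv") should give the same summary for the real games, assuming the file has League and TE columns as the plotting code implies.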

Photo Credit

AFC Bournemouth (Twitter)

Scoring first in football project: six European leagues

I have been monitoring scoring first in association football.

In the 2017-2018 season I used the website to record goal scoring timings in six European leagues: EPL; Ligue 1; Bundesliga; Serie A; Eredivisie; and Primera.

I looked at the following events in games played in these leagues:

  • score first win (SFW)
  • score first draw (SFD)
  • score first lose (SFL)
  • 0 v 0 game (0goals)
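
As a rough sketch of how such a profile might be computed in R, the example below tabulates one outcome label per game and converts the counts to shares. The outcome vector is hypothetical placeholder data, and I am assuming each share is taken over all games played, goalless games included.

```r
# Hypothetical outcome labels, one per game (placeholder data, not league results)
outcomes <- c("SFW", "SFW", "SFD", "SFL", "0goals",
              "SFW", "SFD", "SFW", "SFW", "SFL")

# Fix the category order so a category with no games still appears as zero
outcomes <- factor(outcomes, levels = c("SFW", "SFD", "SFL", "0goals"))

# Share of all games falling into each category
profile <- prop.table(table(outcomes))
round(profile, 2)
```

Because every game falls into exactly one of the four categories, the shares for a single league sum to 1.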

I wondered if these measures could inform a naive Bayes approach to probabilistic behaviours in association football. I am thinking that for the 2018-2019 season I will use these prior probabilities to monitor teams’ progress.

For the six leagues combined, I have these median probabilities:

  • SFW 0.64
  • SFD 0.19
  • SFL 0.12
  • 0goals 0.07

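I also have a measure for scoring first and not losing (SFNL), the sum of SFW and SFD. My median SFNL for the six leagues is 0.82. This is the median of the six league SFNL values, so it need not equal the sum of the SFW and SFD medians above.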
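
One way a league-wide figure like this could act as a prior when monitoring a team is a Beta-Binomial update. This is a swapped-in illustration rather than the approach I have settled on, and it is not a naive Bayes classifier. It treats 0.82 as a rough prior for the chance that a team scoring first avoids defeat. The prior weight (prior_games) and the example team's record are hypothetical.

```r
# Prior centred on the six-league median SFNL of 0.82.
# prior_games is an assumed weight for the prior (a hypothetical choice),
# expressed as the number of "pseudo-games" the prior is worth.
prior_mean  <- 0.82
prior_games <- 10

alpha0 <- prior_mean * prior_games        # pseudo-count: scored first, did not lose
beta0  <- (1 - prior_mean) * prior_games  # pseudo-count: scored first, lost

# Posterior mean of a team's SFNL rate after observing its games
update_sfnl <- function(not_lost, scored_first) {
  stopifnot(not_lost <= scored_first)
  (alpha0 + not_lost) / (alpha0 + beta0 + scored_first)
}

# Hypothetical team: scored first in 8 games and avoided defeat in 5 of them
update_sfnl(not_lost = 5, scored_first = 8)
```

With no games observed the function returns the prior mean. As games accumulate, the estimate moves towards the team's own record, and a larger prior_games slows that movement.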

My median probability profiles for each of the six leagues are:

Ligue 1   0.64   0.
Serie A   0.65   0.

For the champions of each league, my probability profiles are:

Man City   0.76   0.

For the bottom team in each of these leagues, my probability profiles are:

W Brom
FC Metz   0.
FC Koln   0.
FC Twente   0.

I am hopeful that these data might be of interest to anyone undertaking a Bayes approach to goal scoring.

In my own work, I am keen to see how early we can confirm the likely trajectory of any team in each of these six leagues.

My next post in this series will share data from the last two FIFA World Cups.

Photo Credit

Supporters FCN (Manuel, CC BY-SA 2.0)

Six European Football Leagues Going into Christmas 2017

I have been following scoring patterns in six European football leagues (EPL, Ligue 1, Bundesliga, Serie A, Eredivisie and Primera) in the 2017-2018 season.

I have a particular interest in the outcome of scoring first and not losing in games in these leagues.

Prior to midweek games on 13 December 2017, the range of my data (n=875 games) thus far is:

In Ligue 1, the percentage of games in which the team that scored first did not lose has ranged between 86% and 89%. In Serie A, the range is 72% to 80%. The other four leagues fall between these two.

My BoxplotR visualisation of nine observations for these leagues is:

The box plot statistics are:

The EPL is about to enter an intense fixture period and I will be interested to observe any changes in pattern.

A separate project is to examine the games in which the team that scores first has lost (n=96 of the 875 games played). The Eredivisie has the largest number of these games (n=23 out of 134 games) and the Primera the smallest number (n=12 out of 150 games).
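
To put those counts on a common scale, here is a short R sketch that converts them to percentages. It uses only the figures quoted above; the variable names are mine.

```r
# Games in which the team scoring first lost, and games played
sfl_games <- c(All = 96,  Eredivisie = 23,  Primera = 12)
played    <- c(All = 875, Eredivisie = 134, Primera = 150)

# Percentage of games in which the team scoring first lost
sfl_pct <- round(100 * sfl_games / played, 1)
sfl_pct
```

This gives about 11% across all six leagues, roughly 17% in the Eredivisie and 8% in the Primera.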

Photo Credit

Marco Verratti (PSG Officiel, Twitter)

Fireworks (AjaxDaily, Twitter)