Distances traversed by the 2018 #WorldCup Finalists

I have been looking at the player tracking data shared on the official World Cup website.

With an awareness of the limitations and potential accuracy of the data, I have compiled some basic R visualisations of the data for France and Croatia.

Going into the Final, the FIFA data suggest that the French team has traversed approximately 607,462 metres in the tournament (a median distance per game of 102,193 metres) and Croatia 723,909 metres (a median distance per game of 118,220 metres).
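A minimal sketch of how a tournament total and a median distance per game might be computed in base R. The data frame, its column names (Team, Distance) and all of its values are placeholders for illustration, not the FIFA tracking data.

```r
# Placeholder per-game team totals in metres (illustrative values, NOT FIFA data)
team_games <- data.frame(
  Team     = c("A", "A", "A", "B", "B", "B"),
  Distance = c(100000, 104000, 102000, 110000, 140000, 120000)
)

# Tournament total per team
totals <- aggregate(Distance ~ Team, data = team_games, FUN = sum)

# Median distance per game per team
medians <- aggregate(Distance ~ Team, data = team_games, FUN = median)

# One row per team: Team, Distance_total, Distance_median
summary_df <- merge(totals, medians, by = "Team",
                    suffixes = c("_total", "_median"))
print(summary_df)
```

The median is less sensitive than the mean to a single long game, such as one that goes to extra time.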

All three of Croatia’s knockout games have required extra time. In the semi-final against England, Croatia traversed their greatest distance in any game (143,286 metres). In contrast, the data for France’s semi-final indicate that they traversed 101,609 metres.

I have aggregated a record of each team’s tracking data on GitHub.

Three French players (Antoine Griezmann, Ngolo Kante and Paul Pogba) have tracking data of over 10,000 metres in knockout games (Antoine and Ngolo twice each and Paul once). No French player exceeded 10,000 metres in the game against Argentina.

In contrast, nine Croatian players have traversed over 10,000 metres. Marcelo Brozovic covered 16,339 metres in the game against England.
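A count like this can be sketched in base R as below, assuming one row per player per game. The team labels, player labels and distances are hypothetical placeholders, not the recorded tracking data.

```r
# Placeholder player-level rows (illustrative labels and values, NOT FIFA data)
players <- data.frame(
  Team     = c("X", "X", "X", "Y", "Y", "Y"),
  Player   = c("P1", "P2", "P1", "P3", "P4", "P5"),
  Distance = c(10500, 9800, 11200, 12000, 10100, 9900)
)

# Keep only the player-games above the 10,000 metre threshold
over_10k <- subset(players, Distance > 10000)

# Number of distinct players per team with at least one such game
counts <- aggregate(Player ~ Team, data = over_10k,
                    FUN = function(p) length(unique(p)))
names(counts)[2] <- "PlayersOver10k"
print(counts)
```

Counting distinct players rather than rows matters here, because one player can exceed the threshold in more than one game.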

The distances traversed by players from Croatia and France are:

In terms of distance order:

Players traversing > 13000 metres per game:

Under 11,000 metres per game:

Photo Credit

Why Brozovic could be the key man for Croatia (The Times, 10 July 2018)

Postscript

My R code is very basic. I am using the World Cup data to develop my use of R.

I am using R version 3.5.1 in RStudio. For the plots, I used the ggplot2 and ggrepel packages. I followed guidelines provided by Garrick Aden-Buie (Grammar) and Kamil Slowikowski (ggrepel).

An example

# install.packages(c("ggplot2", "ggrepel"))  # run once if the packages are missing
library(ggplot2)
library(ggrepel)

# df is my source file (28 observations of 5 variables)

ggplot(df, aes(Opponent, Distance, label = Player)) +
    # label only the players above 13000 metres, nudged up towards the top of the plot
    geom_text_repel(
        data = subset(df, Distance > 13000),
        nudge_y = 17000 - subset(df, Distance > 13000)$Distance,
        segment.size = 0.2,
        segment.color = "grey50",
        direction = "x"
    ) +
    # highlight the same players in red
    geom_point(color = ifelse(df$Distance > 13000, "red", "black")) +
    scale_y_continuous(limits = c(10000, 17000)) +
    labs(
        title = "World Cup Finalists 2018",
        subtitle = "Distances Traversed in the Knockout Phase > 13000 Metres",
        x = "Opponent",
        y = "Distances Traversed Per Game (Metres)"
    )

Refereeing at the 2018 #WorldCup (23 games)

The Match Reports from the official FIFA 2018 World Cup website provide excellent sources of secondary data.

I have used these data to profile ball in play times and fouls identified by referees. I used the ggplot2 package in RStudio to do this, with geom_label_repel() from ggrepel for my text labels and a Loess regression to smooth the data series.
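A minimal sketch of Loess smoothing, using stats::loess() from base R, which is the fit that ggplot2's geom_smooth(method = "loess") relies on. The column names (Game, BallInPlay), the values and the span of 0.75 are assumptions for illustration, not the match report data or the settings used for the plots.

```r
# Placeholder ball in play minutes for a run of games (illustrative, NOT FIFA data)
games <- data.frame(
  Game       = 1:12,
  BallInPlay = c(55, 58, 52, 60, 57, 54, 59, 61, 56, 53, 58, 57)
)

# Local regression of ball in play time on game number
# span = 0.75 is assumed here; it controls how much of the data each local fit uses
fit <- loess(BallInPlay ~ Game, data = games, span = 0.75)

# Smoothed value at each observed game
games$Smoothed <- predict(fit)
print(games)
```

A smaller span follows the game-to-game variation more closely, while a larger one gives a flatter trend line.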

Ball in Play

Fouls

I have posted my code and other material in a GitHub repository RefereeingWC18.

Temperature, humidity and ball in play time at 2018 FIFA #WorldCup after 20 games

There is a rich variety of data available on the 2018 FIFA World Cup website.

The FIFA live blog for each game records temperature and humidity.

After 20 games played in the tournament, I thought I would explore these data with regard to ball in play time in each game.

The data and the R code I used are available on GitHub. This post is another exercise in learning out loud as I develop my use of R and RStudio.
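As a rough sketch of this kind of plot, the block below draws humidity against ball in play time with ggplot2. The data frame, its column names and its values are hypothetical placeholders, not the figures from the FIFA live blogs or the code in the repository.

```r
library(ggplot2)

# Placeholder values for six games (illustrative, NOT the FIFA live blog data)
weather <- data.frame(
  Temperature = c(18, 22, 25, 30, 27, 21),  # degrees Celsius
  Humidity    = c(40, 55, 62, 35, 48, 70),  # percent
  BallInPlay  = c(56, 58, 53, 51, 57, 55)   # minutes
)

# Scatter plot of humidity against ball in play time
p <- ggplot(weather, aes(x = Humidity, y = BallInPlay)) +
  geom_point() +
  labs(
    title = "Humidity and Ball in Play Time",
    x = "Humidity (%)",
    y = "Ball in Play (Minutes)"
  )
print(p)
```

Swapping Humidity for Temperature in aes() gives the equivalent temperature plot.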

Temperature and Humidity for each of the 20 games:

Humidity and Ball in Play Time:

Temperature and Ball in Play Time:

These ggplots are created with secondary data. As with all my World Cup posts, I am mindful that I have not investigated the validity and reliability of these data. I do make some basic face validity assumptions about these data.

Photo Credit

_IGP5474 (Victor, CC BY-SA 2.0)