Dwelling on dwell time

I have written two posts on dwell time this week. One on Clyde Street (link) and one for the Sports Wizard blog (link).

I have continued to research dwell time in non-sport contexts. The discovery of Herbert Levinson’s (1983) paper has been very influential in directing my literature searches.

In his paper, Herbert conducted an analysis of transit speeds, delays, and dwell times based on surveys conducted in a cross section of U.S. cities. He concluded that “reducing bus stops from eight to six per mile and dwell times from 20 to 15 sec would reduce travel times from 6 to 4.3 min/mile, a time saving greater than that which could be achieved by eliminating traffic congestion”. He added “transit performance should be improved by keeping the number of stopping places to a minimum”.

I have been thinking about how to visualise stoppages in play in sport. Two of the papers that cite Herbert’s paper offer some insights on how this might be done.

Robert Bertini and Ahmed El-Geneidy (2004) provided a case study of how this visualisation might occur with their estimation of “the values of parameters that affect the total travel time for a particular bus route in Portland, Oregon”. In doing so they shared a trip time model.
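As a back-of-the-envelope illustration (not Bertini and El-Geneidy's actual model), a trip time per mile can be sketched as running time plus a per-stop cost of dwell and acceleration/deceleration loss. The parameter values below are assumptions chosen so the sketch reproduces Levinson's figures; they are not published estimates.

```r
# Sketch of a simple trip time model (illustrative only):
# travel time per mile = running time + stops * (dwell + per-stop loss)
trip_time <- function(running_min, stops, dwell_sec, loss_sec) {
  running_min + stops * (dwell_sec + loss_sec) / 60
}

# Assumed values: 1.2 min/mile running time and 16 s lost per stop make the
# sketch reproduce Levinson's scenarios of 6 and 4.3 min/mile
before <- trip_time(running_min = 1.2, stops = 8, dwell_sec = 20, loss_sec = 16)
after  <- trip_time(running_min = 1.2, stops = 6, dwell_sec = 15, loss_sec = 16)
c(before = before, after = after)  # 6.0 and 4.3 min/mile
```

Solving Levinson's two scenarios for the two unknowns is what gives the roughly 16 s per stop and 1.2 min/mile residual running time used above.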

Their visualisation of dwell time included: (figure omitted)

Mathew Berkow and his colleagues (2007) used colour in their visualisations of transportation in Portland: (figure omitted)

Mathew and his colleagues conclude: “On the basis of an analysis of 1 year of archived bus dispatch system data for all routes and stops, the power of using visualization tools to understand the abundance of bus dispatch system data is demonstrated. In addition, several statistical models are generated to demonstrate the power of statistical analysis in conveying valuable and new transit performance measures beyond what is currently generated at TriMet or in the transit industry in general. It is envisioned that systematic use of these new methods and transit performance measures can help TriMet and other transit agencies improve the quality and reliability of their service”.

These formative discussions about dwell time have really encouraged me to think about pedagogy in sport as well as officiating. In my next dwelling on dwell post I am going to look at referee behaviour at the 2018 FIFA World Cup and the 2019 Asian Cup. It is, I hope, the kind of detailed observation of performance that Herbert, Robert, Ahmed, Mathew and their colleagues might have found interesting.

Photo Credit

Photo by Scott Walsh on Unsplash

#AFLW 2019

The 2019 AFLW season starts on Saturday with the opening game between Geelong and Collingwood (link to fixtures).

I have some data from last year’s regular season (link) curated as secondary data from the official AFLW web site (link).

Median Profiles

A Violin Plot created with BoxPlotR (link). (W1Q is the winning team, L1Q is the losing team).


These data have given me an opportunity to postulate some naive priors about when points will be scored in the 2019 season. The probabilities per quarter are based upon game outcome so that the labels ‘winning’ and ‘losing’ relate to the game not the quarter.
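A minimal sketch of how such per-quarter priors might be formed, using made-up quarter scores rather than the curated AFLW data (all names and values below are hypothetical):

```r
set.seed(42)
# Hypothetical quarter-by-quarter scores for ten games (not the AFLW data);
# win_pts / lose_pts are the eventual winning / losing team's points, so the
# 'winning' and 'losing' labels are fixed by the game outcome, not the quarter
games <- expand.grid(game = 1:10, quarter = 1:4)
games$win_pts  <- rpois(nrow(games), lambda = 12)
games$lose_pts <- rpois(nrow(games), lambda = 7)

# Naive prior: the share of a team's total points scored in each quarter
share_by_quarter <- function(points, quarter) {
  tapply(points, quarter, sum) / sum(points)
}
share_by_quarter(games$win_pts, games$quarter)
share_by_quarter(games$lose_pts, games$quarter)
```

The shares sum to one for each outcome label, so they can be read directly as naive priors over the four quarters.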

Trying visdat

I found a link to the visdat package on CRAN. In my ongoing learning journey in and with R, I am fascinated by the resources that are shared openly … in this case by Nicholas Tierney (link).

visdat “helps you visualise a dataframe and ‘get a look at the data’ by displaying the variable classes in a dataframe as a plot with vis_dat, and getting a brief look into missing data patterns using vis_miss.”

I tried it with a csv file of data from the 2019 Asian Cup football tournament. The data include cards given by referees for fouls and other behaviours (including dissent). vis_dat confirmed that the incomplete columns are those for a red card and a second yellow card: not every card is a red card or a second yellow card. In my data set I use NA to indicate that a card has not been awarded.

An example of the first card given at the tournament: (image omitted)

My data are available as a Google Sheet (link).

The image at the start of this post was produced with vis_dat. I used vis_miss() to visualise the missing data. The function “allows for missingness to be clustered and columns rearranged”.
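A sketch of the calls involved, with a small hypothetical data frame standing in for my Asian Cup cards csv (the column names here are illustrative, not the actual ones in my sheet):

```r
library(visdat)

# Hypothetical stand-in for the cards data: NA marks cards not awarded
cards <- data.frame(
  minute        = c(12, 34, 57),
  yellow_card   = c("Y", "Y", NA),
  second_yellow = c(NA, NA, NA),
  red_card      = c(NA, NA, "R")
)

vis_dat(cards)                   # variable classes and missingness at a glance
vis_miss(cards, cluster = TRUE)  # missingness patterns, with rows clustered
```

Both functions return ggplot objects, so the plots can be themed or saved with the usual ggplot2 tools.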

I am delighted I found this package. I enjoyed reading Nicholas’s thank yous. This underscored for me what a remarkable community nourishes innovation in R.

Thank you to Ivan Hanigan who first commented this suggestion after I made a blog post about an initial prototype ggplot_missing, and Jenny Bryan, whose tweet got me thinking about vis_dat, and for her code contributions that removed a lot of errors.
Thank you to Hadley Wickham for suggesting the use of the internals of readr to make vis_guess work. Thank you to Miles McBain for his suggestions on how to improve vis_guess. This resulted in making it at least 2-3 times faster. Thanks to Carson Sievert for writing the code that combined plotly with visdat, and for Noam Ross for suggesting this in the first place. Thank you also to Earo Wang and Stuart Lee for their help in getting capturing expressions in vis_expect.
Finally thank you to rOpenSci and its amazing onboarding process, this process has made visdat a much better package, thanks to the editor Noam Ross (@noamross), and the reviewers Sean Hughes (@seaaan) and Mara Averick (@batpigandme).