Trying visdat

I found a link to the visdat package on CRAN. In my ongoing learning journey in and with R, I am fascinated by the resources that are shared openly, in this case by Nicholas Tierney (link).

visdat “helps you visualise a dataframe and ‘get a look at the data’ by displaying the variable classes in a dataframe as a plot with vis_dat, and getting a brief look into missing data patterns using vis_miss.”

I tried it with a csv file of data from the 2019 Asian Cup football tournament. The data include cards given by referees for fouls and other behaviours (including dissent). vis_dat confirmed that the only incomplete columns are those for a red card and a second yellow card. This is expected: not every card is a red card or a second yellow card, and in my data set I use NA to indicate that such a card has not been awarded.

An example of the first card given at the tournament:

My data are available as a Google Sheet (link).

The image at the start of this post was produced with vis_dat. I used vis_miss() to visualise the missing data. The function “allows for missingness to be clustered and columns rearranged”.
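For readers who work outside R, the idea behind a missing-data overview can be sketched in plain Python. This is not how visdat works internally; it is a minimal stand-in that counts the share of NA values per column and orders the columns from most to least missing. The rows and column names below are made up for illustration and are not the layout of my Asian Cup sheet.

```python
import csv
import io

# Made-up rows for illustration only; the column names are assumptions,
# not the layout of the actual Asian Cup sheet.
SAMPLE_CSV = """match,player,yellow,second_yellow,red
1,Player A,1,NA,NA
1,Player B,1,1,NA
2,Player C,NA,NA,1
"""


def missing_share(rows, na_token="NA"):
    """Return {column: fraction of rows whose value equals na_token}."""
    rows = list(rows)
    if not rows:
        return {}
    counts = {name: 0 for name in rows[0]}
    for row in rows:
        for name, value in row.items():
            if value == na_token:
                counts[name] += 1
    return {name: n / len(rows) for name, n in counts.items()}


def columns_by_missingness(shares):
    """Order column names from most to least missing."""
    return sorted(shares, key=lambda name: shares[name], reverse=True)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
shares = missing_share(rows)
for name in columns_by_missingness(shares):
    print(f"{name}: {shares[name]:.0%} missing")
```

With a real file, the `io.StringIO(SAMPLE_CSV)` part would be replaced by an opened csv file, and `na_token` by whatever marker the data set uses for a missing value.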

I am delighted I found this package. I enjoyed reading Nicholas’s thank yous, which underscored for me what a remarkable community nourishes innovation in R:

Thank you to Ivan Hanigan who first commented this suggestion after I made a blog post about an initial prototype ggplot_missing, and Jenny Bryan, whose tweet got me thinking about vis_dat, and for her code contributions that removed a lot of errors.
Thank you to Hadley Wickham for suggesting the use of the internals of readr to make vis_guess work. Thank you to Miles McBain for his suggestions on how to improve vis_guess. This resulted in making it at least 2-3 times faster. Thanks to Carson Sievert for writing the code that combined plotly with visdat, and for Noam Ross for suggesting this in the first place. Thank you also to Earo Wang and Stuart Lee for their help in getting capturing expressions in vis_expect.
Finally thank you to rOpenSci and it’s amazing onboarding process, this process has made visdat a much better package, thanks to the editor Noam Ross (@noamross), and the reviewers Sean Hughes (@seaaan) and Mara Averick (@batpigandme).

Braidwood Showground Parkruns 2018

Parkruns came to Braidwood in 2018.

Thanks to the passion and dedication of a small group of parkrunners (link), the first run was held on 22 September 2018 at the Braidwood Showground (link).

Each runner receives a time for their 5 km run or walk. This provides a great resource for analysis and visualisation.

I have extracted data from the 14 parkruns held at Braidwood in 2018 and hope these might be an alternative data set to the long-established Scottish Hill Races data (link). I aim to use these data in the online Sport Informatics and Analytics open course as a supplement to the data already there (link).

  • There is a csv file of 385 runs under 40 minutes on GitHub (link).
  • There is a Google Sheet with all the data available (link).
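As a rough illustration of how a subset such as "runs under 40 minutes" could be drawn from the full data, here is a minimal Python sketch. It assumes finish times are stored as text in mm:ss or h:mm:ss form and that the column is called time; both are assumptions, and the rows are made up rather than taken from the Braidwood results.

```python
def to_seconds(time_text):
    """Convert 'mm:ss' or 'h:mm:ss' text to a number of seconds."""
    seconds = 0
    for part in time_text.strip().split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def runs_under(rows, minutes=40, time_field="time"):
    """Keep only the rows whose finish time is strictly under the limit."""
    limit = minutes * 60
    return [row for row in rows if to_seconds(row[time_field]) < limit]


# Made-up rows for illustration; "event" and "time" are hypothetical names.
sample = [
    {"event": 1, "time": "24:31"},
    {"event": 1, "time": "39:59"},
    {"event": 2, "time": "40:00"},
    {"event": 2, "time": "1:02:15"},
]

fast = runs_under(sample)
print(len(fast), "of", len(sample), "runs are under 40 minutes")
```

The comparison is strict, so a time of exactly 40:00 is left out; whether that matches the cut-off used for the csv file is a choice to check against the data.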

The data in these files are in the public domain and are made available on the local parkrun website (link) as part of the remarkable service provided for parkrunners by an organisation whose mission is “to create a healthier and happier planet”.

There is a Twitter account for parkrun (link), and experiences are shared with #loveparkrun (link).

Photo Credit

Runners (parkrun Photos)

Supporting playfulness

Yesterday was a delightful day for me. It was bounded by two great examples of playfulness.

The first was at 7.00 am on a cold and windy morning at the Braidwood swimming pool. It was my granddaughter Ivy’s first morning with the swim squad. Ten young swimmers and the coach got the pool ready for the start of the session.

It was the kind of morning when no one wants to be first in the water, so all ten jumped in together. They set up the lane ropes as a group, with older swimmers helping younger swimmers.

The session got underway with some organisational directions from the coach, who was then able to make 1:1 observations throughout. I was struck by the wonderful technical and personal observations the coach made to bring about changes in behaviour, and by the joy the eleven participants shared on what was a cold, windy morning.

The hour’s session flew by and ended with a mixed-ability relay that the coach managed to equalise perfectly through her choice of swim teams. It was Ivy’s first day, and she swam further than she had ever swum in our 18 metre pool. Her only regret was that she has to wait five days for the next squad meet-up.

The second playful jolt came from an SBS news report about a community football team in Sydney. Dunbar Rovers are a “grassroots club which pioneers fee free football for youngsters” and have a “no-pay-for-play credo despite escalating registration fees”.

One of the club members observed “we have no full time paid staff with people magically doing things. It’s about all working for the common good”. The club has 600 members who have the opportunity to play in one of the 18 senior teams or in one of the 18 junior teams.

The Braidwood swimming squad and Dunbar Rovers are 300 km apart but are very closely connected in playfulness. I think they exemplify the hopes Mark Upton expressed in a recent post. Both clubs do “co-create ways to help people be more human through sport – living and working in fellowship”.

Photo Credit

Dunbar Rovers Juniors full of smiles (Eastern Suburbs Football Association)