Using Shiny

Mitch Mooney (link) has created a Shiny application for netball. He has aggregated and curated 12,500 data points from publicly available sources.

Shiny is an R package that makes it possible to build interactive web apps straight from R.
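To give a sense of what "straight from R" means, here is a minimal sketch of a Shiny app. It is not Mitch's application: the `scores` data frame, the input and output ids, and the team names are all hypothetical and are there only to show the usual shape of a user interface paired with a server function.

```r
library(shiny)

# Hypothetical data for illustration only; not taken from the netball database
scores <- data.frame(
  team  = c("Team A", "Team B", "Team C"),
  goals = c(58, 61, 49)
)

# The user interface: a drop-down menu and a table placeholder
ui <- fluidPage(
  titlePanel("Goals by team (illustrative data)"),
  selectInput("team", "Team", choices = scores$team),
  tableOutput("team_table")
)

# The server function: re-renders the table whenever the selection changes
server <- function(input, output, session) {
  output$team_table <- renderTable({
    scores[scores$team == input$team, ]
  })
}

app <- shinyApp(ui, server)

# Launch the app only when R is being used interactively
if (interactive()) runApp(app)
```

Choosing a team in the drop-down menu changes `input$team`, and Shiny re-runs the `renderTable()` expression to update the table.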

Mitch’s Shiny application is a remarkable resource for netball and it provides us with an important example of how to collect and share data. I see it (link) as a great way to support user interaction and inquiry. It is for me a powerful exercise in reader-receptivity.

There are thirty-one teams in Mitch’s database going back to 2013.

I share Mitch’s interest in Shiny as a way of making data public and encouraging reflection on those data. Many years ago, I was introduced to Wolfgang Iser (1991) and reader-receptivity criticism. Wolfgang suggested then:

By putting the response-inviting structures of literary text under scrutiny, a theory of aesthetic response provides guidelines for elucidating the interaction between text and reader.

He adds:

If a literary text does something to its readers, it also simultaneously reveals something about them. Thus literature turns into a divining rod, locating our dispositions, desires, inclinations, and eventually our overall make up.

It is this divining rod of dispositions that attracts me to Shiny and the sharing in which Mitch has engaged.

I have looked at Shiny for some time as a way to share data. Recently, I looked at goalkeeper heights at the FIFA Women’s World Cup in France (link). I have also looked at the esquisse package to share data (link).

My interest in Shiny was stimulated by the discovery of the New Zealand Tourism Dashboard (link), “a one-stop shop for all information about tourism”. The dashboard brings together a range of tourism data produced by the Ministry of Business, Innovation and Employment and Statistics New Zealand into an easy-to-use tool. The information is presented using dynamic graphs and data tables.

New Zealand government departments maintain fifteen web applications built with RStudio’s Shiny framework. Their main purpose is to make public data more available and accessible for non-specialist users (link).

I see Mitch’s contribution to this sharing as very important and I am delighted he has shared his link to the netball data.

Exploring dplyr

I have been continuing my trial and improvement work with the tidyverse, “an opinionated collection of R packages designed for data science” (link).

Today, I have been working through a dplyr vignette (link). I have been mindful for some time that this part of my R use needed significant improvement.

The vignette is really helpful and guided me through some fundamental procedures I should have known much earlier in my tidyverse use of data frames and tibbles (link).

The vignette points out that when working with data you must:

  • Figure out what you want to do.
  • Describe those tasks in the form of a computer program.
  • Execute the program.

The dplyr package:

  • Constrains your options, which helps you think about data manipulation challenges.
  • Provides simple “verbs”, functions that correspond to the most common data manipulation tasks, to translate your thoughts into code.
  • Uses efficient backends.

I have created a GitHub repository (link) to share this example. I have attached the csv file I used for the exercise. It is a file from the 2019 FIFA Women’s World Cup in France (link).

I enjoyed working through each of the basic verbs of data manipulation:

  • filter(): select cases based on their values.
  • arrange(): reorder the cases.
  • select() and rename(): select variables based on their names.
  • mutate() and transmute(): add new variables that are functions of existing variables.
  • summarise(): condense multiple values to a single value.
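As a small illustration of these verbs, here is a sketch that uses a made-up tibble. The `games` data and its column names are hypothetical and are not the contents of the csv file in the repository.

```r
library(dplyr)

# Hypothetical rows for illustration; not the World Cup csv file
games <- tibble(
  team    = c("Team A", "Team B", "Team C", "Team D"),
  goals   = c(13, 0, 4, 3),
  minutes = c(97, 97, 96, 98)
)

# filter(): keep the cases where more than three goals were scored
high_scoring <- filter(games, goals > 3)

# arrange(): reorder the cases, highest number of goals first
ranked <- arrange(games, desc(goals))

# select() and rename(): pick or relabel variables by name
team_goals <- select(games, team, goals)
renamed    <- rename(games, scored = goals)

# mutate(): add a new variable that is a function of existing ones
with_rate <- mutate(games, goals_per_minute = goals / minutes)

# summarise(): condense many values to a single value
summary_row <- summarise(games, mean_goals = mean(goals))
```

Each call leaves `games` itself unchanged and returns a new tibble, so the intermediate results can be inspected one at a time.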

The syntax and function of all these verbs are very similar in dplyr:

  • The first argument is a data frame.
  • The subsequent arguments describe what to do with the data frame. You can refer to columns in the data frame directly without using $.
  • The result is a new data frame.
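The three points above can be seen in a short sketch. The `games` tibble is again hypothetical. The pipe (`%>%`) in the second half is one common way to chain verbs; it works because each verb takes a data frame as its first argument and returns a data frame.

```r
library(dplyr)

# Hypothetical data for illustration only
games <- tibble(
  team  = c("Team A", "Team B", "Team C"),
  goals = c(13, 0, 4)
)

# First argument is the data frame; the column is named directly,
# without writing games$goals
scored <- filter(games, goals > 0)

# Because each result is itself a data frame, verbs can be chained
scored_ranked <- games %>%
  filter(goals > 0) %>%
  arrange(goals)

# The original data frame has not been modified
nrow(games)
```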

Photo Credit

Opening Game (FIFA)

Final (FIFA)

Cédric’s introduction to R ggplot

Cédric Scherer (link) has written a delightful guide to ggplot. His post is titled A ggplot2 Tutorial for Beautiful Plotting in R (link).

I worked through his post by looking at some of the data from the FIFA Women’s World Cup in France (link) earlier this year.

My exploration of Cédric’s suggestions was definitely of the trial and improvement kind. I did find it one of the best introductory guides to ggplot I have discovered and it helped me build on my eclectic learning journey with this form of visualisation.

The csv file I used for this exploration is available on GitHub (link) and is titled RefereesWWC.csv. My brief R record is:

I looked at five examples from the official FIFA data provided in FIFA’s Match Facts (link). I was mindful that the median ball in play time during the World Cup was 55 minutes and the median total game time was 97 minutes.

1. A geom_point of the referees who officiated at the World Cup and the FIFA record of ball in play time in minutes.

2. A geom_line and geom_point development of visualisation 1 that connects referees that officiated at more than one game at the World Cup.

3. A geom_density_ridges visualisation of ball in play time and total game time.

4. A generalised additive model (GAM) fitted to fewer than 1000 data points. An outlier, USA v Thailand, is recorded with annotate.

5. An example of a developed geom_density_ridges plot (from the ggridges package) that used the theme_economist visualisation backdrop from the ggthemes package. It uses temperature data to look at goals scored in the tournament.

This visualisation provides an opportunity to annotate particular games, including two 0 v 0 games, the 13 v 0 game and two games involving six goals.
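The first four visualisations listed above can be sketched roughly as follows. This is not the code from my repository. The `matches` data frame, its column names and its values are hypothetical. The choice of axes for the smoothed plot is an assumption, and `k = 4` is set only because the made-up data set is so small.

```r
library(ggplot2)
library(ggridges)  # provides geom_density_ridges()

# Hypothetical match records; NOT the values in RefereesWWC.csv
matches <- data.frame(
  referee      = c("Referee A", "Referee A", "Referee B", "Referee B",
                   "Referee C", "Referee D", "Referee D", "Referee E"),
  ball_in_play = c(52, 57, 55, 49, 61, 54, 58, 56),
  total_time   = c(96, 98, 97, 95, 101, 96, 99, 97),
  goals        = c(1, 3, 2, 0, 13, 2, 4, 0)
)

# 1. One point per game: referee against ball in play time
p1 <- ggplot(matches, aes(x = ball_in_play, y = referee)) +
  geom_point() +
  labs(x = "Ball in play (minutes)", y = NULL)

# 2. The same plot with a line joining games officiated by the same referee
p2 <- p1 + geom_line(aes(group = referee))

# 3. Ridgeline densities need one row per measurement (long format)
long <- data.frame(
  measure = rep(c("Ball in play", "Total game time"), each = nrow(matches)),
  minutes = c(matches$ball_in_play, matches$total_time)
)
p3 <- ggplot(long, aes(x = minutes, y = measure)) +
  geom_density_ridges()

# 4. A GAM smooth with a text annotation marking the outlying game
#    (axes assumed; k kept small for this tiny hypothetical data set)
p4 <- ggplot(matches, aes(x = ball_in_play, y = goals)) +
  geom_point() +
  geom_smooth(method = "gam", formula = y ~ s(x, k = 4)) +
  annotate("text", x = 61, y = 13, label = "Outlier", hjust = 1.3)

# Use print(p1), print(p2), and so on to draw each plot
```

Building each plot as an object and adding layers to it, as in `p2`, is the step-by-step approach that made Cédric’s guide so easy to follow.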

I do recommend Cédric’s post unreservedly. It is a great way for us to develop our use of ggplot as a visualisation tool. The basic code I used for my post is available on GitHub (link).