Writing a report

Earlier this week, Avinash Kaushik wrote about Responses to Negative Data (link). Shortly after his post was published, I found a link to a Turing Institute blog post, written by Franz Kiraly, What is a data scientific report? (link).

Both posts have helped me to think about the why, what and how of sharing observations, analyses and insights.

Franz, the author of the Turing blog post, suggests that a stylised data report is characterised by:

  1. Topic. Addresses a domain question or domain challenge in an application domain specific to a data set.
  2. Aim. Data-driven answers to some domain question.
  3. Audience. Decision-makers or domain experts interested in ‘evidence’ to inform decision-making.

Franz suggests five principles that inform good reporting:

  1. Correctness and veracity
  2. Clarity in writing
  3. Reproducibility and transparency
  4. Method and process
  5. Application and context

Whilst I take issue with parts of Avinash’s and Franz’s posts, I do think they both raise some fundamental issues for us as we contemplate sharing our data-informed stories. I am particularly interested in how the curiosity and openness Avinash describes meets Franz’s five principles.

As I was concluding this post, up popped a link to Samuel Flender’s post How to be less wrong (link). This will be an excellent companion to the two posts discussed here. It also gives me an opportunity to extend my interest in Bayesian perspectives.

Photo Credit

Photo by Sandis Helvigs on Unsplash

First Steps With ExPanDaR

Each day, Mara Averick (@dataandme) (link) shares some excellent R advice and links on Twitter. For a while, I bookmarked all her suggestions, but there were so many that I did not manage to return to them. Even allocating them to bookmark folders did not improve my follow-up rate.

For the past month or so, I have been creating R scripts in RStudio each day to try out the coding of some of her suggestions. This was the case today with her link to Joachim Gassen’s ExPanDaR 0.4.0 package (link).

I have a GitHub repository for this exploration to share my CSV files and code (link). Like most of my efforts, it is just a start … and an attempt to share sport examples.

However, I am really interested in the package’s potential for me to have a first look at data and, if appropriate, to work through it with coaches to develop their data dashboard … if they think it can be of help to them.

I used ExPanDaR’s functions to create: a descriptive table (of all variables); a scatter plot; a quantile_trend_graph (distributions of one variable over time); and a list of the five most extreme observations in the data frame. I particularly liked the Shiny opportunities I had to plot variables. I am still trying to work out the tooltip functionality for my descriptive table.
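For readers who want to try something similar, here is a minimal sketch of the kind of calls involved. It is not the code in the repository. The `results` data frame and its columns (`team`, `season`, `points`, `goals`) are made up for illustration. The `prepare_*` function names and the `ExPanD()` app launcher follow the ExPanDaR documentation as I understand it; argument names may differ between package versions, so please check the help pages (for example `?prepare_ext_obs_table`) before relying on them.

```r
# Hypothetical sport panel: 4 teams observed over 5 seasons (made-up values).
set.seed(42)
results <- expand.grid(
  team = c("A", "B", "C", "D"),
  season = 2014:2018,
  stringsAsFactors = FALSE
)
results$points <- round(rnorm(nrow(results), mean = 50, sd = 10))
results$goals  <- round(results$points * 0.8 + rnorm(nrow(results), sd = 5))

# Only call ExPanDaR if it is installed, so the script still runs without it.
if (requireNamespace("ExPanDaR", quietly = TRUE)) {
  library(ExPanDaR)

  # Descriptive table of the numeric variables; the plain data frame is in $df.
  desc <- prepare_descriptive_table(results[, c("points", "goals")])
  print(desc$df)

  # Scatter plot of two variables (assumed to return a ggplot object).
  print(prepare_scatter_plot(results, x = "goals", y = "points"))

  # Quantiles of one variable over time; only the time id and that variable
  # are passed, so the default variable choice is unambiguous.
  qt <- prepare_quantile_trend_graph(results[, c("season", "points")],
                                     ts_id = "season")
  print(qt$plot)

  # The five largest and five smallest observations of one variable.
  ext <- prepare_ext_obs_table(results[, c("team", "season", "points")],
                               n = 5, cs_id = "team", ts_id = "season",
                               var = "points")
  print(ext$df)

  # The interactive Shiny app; only launched from an interactive session.
  if (interactive()) ExPanD(results, cs_id = "team", ts_id = "season")
}
```

Replacing the simulated `results` with a data frame read from a CSV file (for example with `read.csv()`) should be enough to point the same calls at real data, provided it has one column identifying the team or athlete and one identifying the time period.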

My visualisation examples are:

I am looking forward to exploring these functions and other visualisation functions available in ExPanDaR.