In plain sight

The serendipity of finding Thomas Grisold and Alexander Kaiser’s (2017) paper (link) whilst looking for recent discussions about feedforward (link) prompted me to think about personal learning journeys.

Thomas and Alexander ask ‘How can unlearning initiate a deep learning process leading to the best version of our self?’. The volume and quality of resources available to us make this a very important question.

Notwithstanding the debate about the concept of ‘unlearning’, two excellent links made me think about my ongoing quest to explore visualisation and a better version of my self as a data sharer and storyteller.

The first was Claus Wilke’s (2018) Fundamentals of Data Visualization (link). His welcome message notes that the book “is meant as a guide to making visualizations that accurately reflect the data, tell a story, and look professional”.

I found it through an alert to chapter 16 of the book, Visualizing Uncertainty (link). The alert came from Matthew Kay (link) whose own work on uncertainty visualisation has also been nudging me to a better version of my self.

The second resource was Amy Cesal’s Sunlight Foundation Data Visualization Style Guidelines (link). Amy worked with Zander Furnas to develop the guidelines. There is a copy of these guidelines on GitHub (link).

Amy reflects on her use of a style guide:

Since having a style guide, I have to do less work on the majority of data visuals, because they are already 90% done when they are handed off, if they are handed off at all. I also spend less time testing for colorblindness and text readability, because I’m using pre-tested options. This way, I have more time to focus on larger projects that push the boundaries of our style guidelines, and really make the visuals exceptional.

Amy’s mention of boundaries is where my reading of Thomas and Alexander meets Claus and Amy.

Access to such outstanding visualisation resources disturbs a learned aesthetic. Thomas and Alexander note that:

“Feedforward self-modelling involves constructing a desirable image of the self that represents achievements beyond the individual’s current capability. It yields the potential for improvement and rapid changes of behaviour”.

Just as I was drafting this post, an alert to Cole Nussbaumer Knaflic’s (2018) accessible data viz is better data viz (link) appeared in my inbox. Cole observes “Often, when we are creating charts and graphs, we think of ourselves as the ideal user. This is not only a problem because we know more about the data than the target user, but because other users might have a different set of constraints than we do.”

My hope is that from the inspiration of these great resources, I can start a process of deep learning about how to share in plain sight … not as a New Year resolution but as an everyday practice.

Photo Credit

Photo by Ilya Ilford on Unsplash

Data and coherent narratives

Peter Killeen (2018), in a paper that discusses the futures of experimental analysis of behavior, observes “we must learn that data have little value until embedded in a coherent narrative”.

The construction of this narrative has been a hot topic this week in conversations about data science activities.

One example is Evan Hansleigh’s discussion of sharing data used in Economist articles:

Releasing data can give our readers extra confidence in our work, and allows researchers and other journalists to check — and to build upon — our work. So we’re looking to change this, and publish more of our data on GitHub.

He adds:

Years ago, “data” generally meant a table in Excel, or possibly even a line or bar chart to trace in a graphics program. Today, data often take the form of large CSV files, and we frequently do analysis, transformation, and plotting in R or Python to produce our stories. We assemble more data ourselves, by compiling publicly available datasets or scraping data from websites, than we used to. We are also making more use of statistical modelling. All this means we have a lot more data that we can share — and a lot more data worth sharing.

Evan’s article concludes:

We plan to publish more of our data on GitHub in the coming months—and, where it’s appropriate, the analysis and code behind them as well. We look forward to seeing how our readers use and build upon the data reporting we do.

The availability of such shared resources, in Uzma Barlaskar’s terms, will enable us to be data-informed rather than data-driven. Uzma suggests:

In data driven decision making, data is at the center of the decision making. It’s the primary (and sometimes, the only) input. You rely on data alone to decide the best path forward. In data informed decision making, data is a key input among many other variables. You use the data to build a deeper understanding of what value you are providing to your users. (Original emphases)

Alejandro Díaz, Kayvaun Rowshankish, and Tamim Saleh share insights from McKinsey research on the use of artificial intelligence in business and note “the emergence of data analytics as an omnipresent reality of modern organizational life” and the consideration that might be given to “a healthy data culture”.

Alejandro, Kayvaun and Tamim suggest that such a culture:

  • Is a decision culture
  • Has ongoing commitment to and conversations about data initiatives
  • Stimulates bottom up demand for data
  • Manages risk as a ‘smart accelerator’ for analytics processes
  • Supports change agents
  • Balances recruitment of specialists with retention of existing staff

Chris Lindner has looked at the profiles of data scientists who become part of an organisational data culture. He reports “data scientists come from a wide variety of fields of study, levels of education, and prior jobs”. They have a range of job descriptions too: data engineer, data analyst, software engineer, machine learning engineer, and data scientist.

The combination of these posts sent me back to re-read Chris Moran’s What Makes a Good Metric? published in August. I think Chris helps us think about our data narratives in the context of “audience, metrics, culture, and journalism”. He points us to Project as an example of valuing the impact of journalism to the information ecosystem.

This leads Chris to identify the characteristics of robust metrics that help us understand quality and impact:

  • Relevant
  • Measurable
  • Actionable
  • Reliable
  • Readable

He also reminds us that we should be conscious of Goodhart’s Law: any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes.

As a result of reflecting on these aggregated ideas and discussions, I returned to Hadley Wickham and Garrett Grolemund’s diagram of the data exploration process.

I wondered how this process might change if we started, as Peter Killeen suggests, with an awareness of how we might embed our narrative for a range of audiences in data-intensive contexts.

Photo Credits

Basketball photo by William Topa on Unsplash
Person holding four photos photo by Josh Hild on Unsplash