Data and coherent narratives

Peter Killeen (2018), in a paper that discusses the futures of experimental analysis of behavior, observes “we must learn that data have little value until embedded in a coherent narrative”.

The construction of this narrative has been a hot topic this week in conversations about data science activities.

One example is Evan Hensleigh’s discussion of sharing data used in Economist articles:

Releasing data can give our readers extra confidence in our work, and allows researchers and other journalists to check — and to build upon — our work. So we’re looking to change this, and publish more of our data on GitHub.

He adds:

Years ago, “data” generally meant a table in Excel, or possibly even a line or bar chart to trace in a graphics program. Today, data often take the form of large CSV files, and we frequently do analysis, transformation, and plotting in R or Python to produce our stories. We assemble more data ourselves, by compiling publicly available datasets or scraping data from websites, than we used to. We are also making more use of statistical modelling. All this means we have a lot more data that we can share — and a lot more data worth sharing.

Evan’s article concludes:

We plan to publish more of our data on GitHub in the coming months—and, where it’s appropriate, the analysis and code behind them as well. We look forward to seeing how our readers use and build upon the data reporting we do.

The availability of such shared resources, in Uzma Barlaskar’s terms, will enable us to be data-informed rather than data-driven. Uzma suggests:

In data driven decision making, data is at the center of the decision making. It’s the primary (and sometimes, the only) input. You rely on data alone to decide the best path forward. In data informed decision making, data is a key input among many other variables. You use the data to build a deeper understanding of what value you are providing to your users. (Original emphases)

Alejandro Díaz, Kayvaun Rowshankish, and Tamim Saleh share insights from McKinsey research on the use of artificial intelligence in business and note “the emergence of data analytics as an omnipresent reality of modern organizational life” and the consideration that might be given to “a healthy data culture”.

Alejandro, Kayvaun and Tamim suggest that such a culture:

  • Is a decision culture
  • Has ongoing commitment to and conversations about data initiatives
  • Stimulates bottom up demand for data
  • Manages risk as a ‘smart accelerator’ for analytics processes
  • Supports change agents
  • Balances recruitment of specialists with retention of existing staff

Chris Lindner has looked at the profiles of data scientists who become part of an organisational data culture. He reports “data scientists come from a wide variety of fields of study, levels of education, and prior jobs”. They have a range of job descriptions too: data engineer, data analyst, software engineer, machine learning engineer, and data scientist.

The combination of these posts sent me back to re-read Chris Moran’s What Makes a Good Metric?, published in August. I think Chris helps us think about our data narratives in the context of “audience, metrics, culture, and journalism”. He points us to the Deepnews.ai project as an example of valuing the impact of journalism on the information ecosystem.

This leads Chris to identify the characteristics of robust metrics that help us understand quality and impact:

  • Relevant
  • Measurable
  • Actionable
  • Reliable
  • Readable

He also reminds us that we should be conscious of Goodhart’s Law: any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes.

As a result of reflecting on these aggregated ideas and discussions, I returned to the data exploration diagram presented by Hadley Wickham and Garrett Grolemund:

I wondered how this process might change if we start, as Peter Killeen suggested, with an awareness of how we might embed our narrative for a range of audiences in data intensive contexts.

Photo Credits

Basketball photo by William Topa on Unsplash
Person holding four photos photo by Josh Hild on Unsplash

Portals and portkeys

I sat in on a presentation yesterday.

My colleague Scott Nichols, Director of Student Connect at the University of Canberra, shared progress on a new student portal that aims to provide a single point of entry that supports choice of course, enrolment, studying, graduation and ongoing alumna/alumnus connection.

The portal will respond dynamically to each student log-in and provide an exciting approach to supporting personal learning journeys. I hope this access can be available for the lifetime of the learner.

Scott’s presentation was shared in confidence so I am unable to provide the detail of a platform that will be launched in 2019.

I was fascinated by Scott’s talk and I focused on the personal potential of the platform. It will provide a data-rich environment that, with students’ informed consent, could lead to a profoundly ethical resource to support personal learning journeys and personal learning environments.

I believe that the impact of such a portal could be amplified if we are able to appreciate the success of the national Vocational Education and Training Unique Student Identifier (USI) registration scheme.

At present, six million students who are taking or have taken nationally recognised training opportunities have a USI. This is a reference number that:

  • creates a secure online record of recognised training and qualifications gained in Australia, from all training providers
  • gives access to training records and transcripts
  • is accessed online, anytime and anywhere
  • is free and easy to create
  • stays with you for life

These ten characters become a portkey in my vision for innovations at the University of Canberra. The USI transcript service that became available in May 2017 underscores this portkey potential.

With the appropriate checks and balances in place, the USI connects school, tertiary and lifelong learning in a wonderfully transparent way.

The announcement of the USI transcript service included these observations:

  • Training participants and graduates can view, download or print their USI Transcript and share it electronically with future training providers if they wish.
  • It will help training participants and graduates when enrolling in further training or applying for jobs as well as support Australian businesses to get a better understanding of their employees’ level of training.
  • The service will enable the Federal Government and policy makers to get a clearer picture of the skills pathways that Australians pursue, and importantly, the ones that work.

In this context, the University of Canberra portal becomes part of a nationwide and global learning network. It has portkey potential (“an enchanted object that when touched will transport the one or ones who touch it to anywhere on the globe decided on by the enchanter”).

Documenting and Sharing

Signal Noise, The Economist and Siemens have worked together to visualise the fan energy in FC Bayern Munich’s Allianz Arena.

The visualisation includes: game timelines; fan energy; highlights; players; and social ripple. The visualisation provides the user with a rich array of options.

I think this is a great example of the analytic turn in sport and highlights the data expertise available to sport.

Earlier this year, Signal Noise hosted a Data Obscura exhibition that explored the relationship between data and truth. The exhibition was launched with a panel discussion that considered whether transparency and truth should be the ultimate aim online, and asked “how much is ‘true enough’?”

This interplay between practice, epistemology and ontology is fundamental to anyone contemplating a career in sport analytics at a time when:

Multiple filters are applied to the information that we see: algorithms distill a world of opinions to give us a distinct view of events, and authenticity is becoming an increasingly scarce commodity. (My emphasis) (Data Obscura, 2018)

This contemplation could lead to a consideration of epistemic cultures and the machineries of knowledge construction. Karin Knorr Cetina (1999) writes:

Everyone knows what science is about: it is about knowledge, the ‘objective’ and perhaps ‘true’ representation of the world as it really is. The problem is that no one is quite sure how scientists and other experts arrive at this knowledge. The notion of epistemic culture is designed to capture these interiorised processes of knowledge creation. It refers to those sets of practices, arrangements and mechanisms bound together by necessity, affinity and historical coincidence which, in a given area of professional expertise, make up how we know what we know. Epistemic cultures are cultures of creating and warranting knowledge.

This process involves what Maurizio Ferraris (2006) defines as ‘documentality’. For Maurizio, documents are social objects (such that they involve at least two persons) “characterised by the fact of being written: on paper, in a computer file, or simply in people’s heads”.

His theory develops in three different directions:

  • an ontology (“What is a document?”)
  • a technology (an explanation of how documents are distributed)
  • a pragmatics (an understanding of the efficient distribution of documents)

Sharing the Signal Noise, The Economist and Siemens venture into the Allianz Arena here has led me to reflect on learning journeys.

The volume and quality of data analysis opportunities position this generation of data analysts in sport in a very important ontological and pragmatic space.

There are more ways to share primary data and analysis than ever before. Each of us can make an informed and transparent decision about the machineries we choose to construct information sharing and stimulate conversations about knowledge and understanding.

In my case, I use the WordPress blog platform to connect ideas that strike me as important. I discovered news of the Signal Noise project on Twitter. The tweet came as I was re-reading Maurizio Ferraris and editing the Ethical Issues page of the WikiEducator course Sport Informatics and Analytics. In sharing this process openly, I am hopeful that readers can make informed decisions about authenticity and contemplate these issues as worthy of consideration.

Photo Credits

Frame grab Reimagine the Game

FC Bayern (Twitter)