Data Fragments

The Centre Pompidou in Paris. People crossing the street with data in the picture.

I have managed to read three of the tantalising feeds I received yesterday.

The first was by Prateek Karkare on Decision Trees (link). I found his intuitive introduction very helpful. He started with some binary decision examples then moved on to classification, regression, and learning.

The second was Scott Berinato’s Data Science and the Art of Persuasion (link). In it, Scott observes that organisations:

still expect data scientists to wrangle data, analyze it in the context of knowing the business and its strategy, make charts, and present them to a lay audience. That’s unreasonable.

He proposes “rethinking how data science teams are put together, how they’re managed, and who’s involved at every point in the process, from the first data stream to the final chart shown”.

Scott explores a last-mile problem that has existed for a century (“As the cathedral is to its foundation so is an effective presentation of facts to the data”) (link). Scott concludes that a better data science operation environment needs:

  • Define talents rather than team members (management, wrangling, analysis, domain expertise, design, storytelling)
  • Create a portfolio of talents
  • Share experiences and insights
  • Structure projects around talents

With this approach in place:

  • Assign a single, empowered stakeholder
  • Assign leading talent and support talent
  • Co-locate
  • Reuse and template

The third read was Susan Grajek’s The Student Genome Project (link). In her introduction, she observes:

In 2019, after a decade of preparing, colleges and universities stand on a threshold, eager to enter a new era of using technology to unlock our ability to apply data to advancing our missions. That threshold is similar to the one that science faced in the late 20th century: eager to begin using technology to put genetic information to use.

I thought this would resonate powerfully with sport contexts too. Note Susan’s point “We have a growing belief in the value and power of data to understand root causes and improve advice, decisions, and outcomes”.

This passage struck me in particular:

our sector faces a daunting preliminary task: we must understand the component parts (find the data, clean it, standardize it, safeguard it); integrate and manage those parts; and find the right tools for these tasks. Just as the big challenge facing genetics in the 1990s was foundational, so is the big challenge that confronts higher education and technology today. After almost a decade of attention and effort, we find ourselves still at the beginning of the data journey—needing to, in effect, “sequence” the data before we can apply it with any reliability or precision.

They are three data fragments but together they have provided me with another delightful day of exploration. I note them here as part of my learning portfolio.

Photo Credit

Photo by Curtis MacNewton on Unsplash
