Forecasting, predicting, classifying and uncertainty

This is a post to share my bumping into work by Glenn Brier, Frank Harrell and David Spiegelhalter. It coincides with an email exchange I had with Tony Corke about how to share posterior outcomes in the context of prior statements of probability.

I saw a reference to a Brier Score in David Glidden’s (2018) post on forecasting American football results (link) and followed up David’s reference to the Brier score in Wikipedia (link). This encouraged me to seek out some of Glenn Brier’s papers to find the origin of the score named after him.

An early paper was written in 1944 for the United States Weather Bureau. It was titled Verification of a forecaster’s confidence and the use of probability statements in weather forecasting (link). In the introduction to that paper, Glenn observed “one of the factors that has contributed to the difficulties and controversies of forecast verification is the failure to distinguish carefully between the scientific and practical objectives of forecasting” (1944:1). He proposed that:

  • The value of forecasts can be enhanced by increased use of probability statements.
  • The verification problem can be simplified if forecasts are stated in terms of probabilities.

He added “there is an inherent danger in any forecast if the user does not make use of (or is not provided with) the pertinent information regarding the reliability of the forecast” (1944:7). The sharing of information about the reliability of the forecast makes it possible to provide recommendations for action. Glenn concluded “the forecaster’s duty ends with providing accurate and unbiased estimates of the probabilities of different weather situations” (1944:10).

In a paper written in 1950, Verification of forecasts expressed in terms of probability, Glenn provided more detail about his work on probability statements and presented his verification formula (link). He proposed “perfect forecasting is defined as correctly forecasting the event to occur with a probability of unity”, that is, with 100 percent confidence (1950:2).
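Brier’s verification formula is the mean squared difference between the forecast probabilities and the observed outcomes (coded 0 or 1). A minimal sketch in Python, with invented forecasts and outcomes for illustration:

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between probability forecasts and 0/1 outcomes.

    Lower is better; a 'perfect' forecaster, in Brier's 1950 sense,
    assigns probability 1 to every event that occurs and scores 0.
    """
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

# Perfect forecasts score 0; hedged forecasts accumulate squared error.
print(brier_score([1.0, 0.0, 1.0], [1, 0, 1]))  # 0.0
print(brier_score([0.7, 0.2, 0.9], [1, 0, 1]))  # ~0.0467
```

Here the forecaster in the second line is never “wrong” at a 50 percent threshold, yet still pays a small penalty for not forecasting with a probability of unity.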

The 1950 paper raised the question of skill in forecasting. A decade later, Herbert Appleman (1959) initiated a discussion about how to quantify the skill of a forecaster (link). Glenn’s 1950 paper prompted Allan Murphy (1973), amongst others, to look closely at vector partitions in probability scores (link). Some time later, Tilmann Gneiting and Adrian Raftery (2007) considered scoring rules, prediction and estimation (link).

Frank Harrell fits into this kind of conversation in a thought-provoking way. Earlier this year he wrote to distinguish between classification and prediction (link). He proposes:

Whether engaging in credit risk scoring, weather forecasting, climate forecasting, marketing, diagnosing a patient’s disease, or estimating a patient’s prognosis, I do not want to use a classification method. I want risk estimates with credible intervals or confidence intervals. My opinion is that machine learning classifiers are best used in mechanistic high signal:noise ratio situations, and that probability models should be used in most other situations.

He concludes:

One of the key elements in choosing a method is having a sensitive accuracy scoring rule with the correct statistical properties. Experts in machine classification seldom have the background to understand this enormously important issue, and choosing an improper accuracy score such as proportion classified correctly will result in a bogus model.
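Frank’s warning can be illustrated with a small sketch (the outcomes and probabilities below are invented for illustration). Proportion classified correctly cannot separate a calibrated forecaster from an overconfident one when both cross the same threshold, whereas a proper score such as the Brier score can:

```python
def accuracy(forecasts, outcomes, threshold=0.5):
    """Proportion classified correctly after thresholding probabilities."""
    return sum((f >= threshold) == bool(o)
               for f, o in zip(forecasts, outcomes)) / len(outcomes)

def brier_score(forecasts, outcomes):
    """Mean squared error between probability forecasts and 0/1 outcomes."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(outcomes)

outcomes      = [1, 1, 1, 1, 0]     # the final event does not occur
calibrated    = [0.8, 0.8, 0.8, 0.8, 0.8]
overconfident = [0.99, 0.99, 0.99, 0.99, 0.99]

# Both forecasters classify identically, so accuracy cannot tell them apart...
print(accuracy(calibrated, outcomes), accuracy(overconfident, outcomes))    # 0.8 0.8
# ...but the Brier score penalises the overconfident miss more heavily.
print(brier_score(calibrated, outcomes))     # 0.16
print(brier_score(overconfident, outcomes))  # 0.1961
```

This is one concrete sense in which “proportion classified correctly” is an improper accuracy score: it rewards overconfidence rather than honest probability statements.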

There is a reference in Frank’s paper to risk (“by not thinking probabilistically, machine learning advocates frequently utilize classifiers instead of using risk prediction models”) and a link to a David Spiegelhalter paper written in 1986, Probabilistic prediction in patient management and clinical trials (link). In that paper, David argued for “the provision of accurate and useful probabilistic assessments of future events” as a fundamental task for biostatisticians when collaborating in clinical or experimental medicine. Thirty-two years later, David is Winton Professor for the Public Understanding of Risk (link).

In 2011, David and his colleagues (Mike Pearson and Ian Short) discussed visualising uncertainty about the future (link). They suggest probabilities are best treated “as reasonable betting odds constructed from available knowledge and information”. They identified three key concepts that can be used for evaluating techniques to display probabilistic predictions:

  • Common sense and accompanying biases
  • Risk perception and the role of personality and numeracy
  • Type of graphic presentation used

David and his colleagues summarise their thoughts about visualising uncertainty in this box:

David returned to the theme of uncertainty in 2014 and suggested “it will be vital to understand and promote uncertainty through the appropriate use of statistical methods rooted in probability theory” (link). Much of David’s recent work has focussed on the communication of risk. At the Winton Centre:

All too often, numbers are used to try to bolster an argument or persuade people to make a decision one way or the other. We want to ensure that both risks and benefits of any decision are presented equally and fairly, and the numbers are made clear and put in an appropriate context. We are here ‘to inform and not persuade’. (link)

All these thoughts were running through my head when I decided to contact Tony Corke. I admire his work immensely. A couple of rapid emails helped me with a Brier Score issue for priors and posteriors from my Women’s T20 cricket data. I am not blessed with mathematical intelligence and Tony was very reassuring.

I am now off to research Solomon Kullback and Richard Leibler, who were writing a year after Glenn Brier’s 1950 paper (link).
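Kullback and Leibler’s 1951 measure quantifies how one probability distribution diverges from another. A minimal sketch for discrete distributions, with two invented distributions for illustration:

```python
from math import log

def kl_divergence(p, q):
    """D(P || Q) = sum of p_i * log(p_i / q_i), in nats.

    Assumes q_i > 0 wherever p_i > 0; zero-probability terms in P
    contribute nothing (by the convention 0 * log 0 = 0).
    """
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Identical distributions diverge by zero; the divergence grows as Q
# departs from P. Note it is not symmetric: D(P||Q) != D(Q||P) in general.
print(kl_divergence([0.5, 0.5], [0.5, 0.5]))  # 0.0
print(kl_divergence([0.9, 0.1], [0.5, 0.5]))  # ~0.368
```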

Photo Credits

Foggy mountain by Eberhard Grossgasteiger on Unsplash

White clouds and blue sky by Paul Csogi on Unsplash

Sailing on the ocean by Andrew Neel on Unsplash

Making sense of data practices

Laura Ellis has been writing this week about solving business problems with data (link). The alert to her post came shortly after another link had taken me back to a presentation by Dan Weaving in 2017 on load monitoring in sport (link). A separate alert had drawn my attention to two Cassie Kozyrkov articles, one on hypotheses (link) and the second on what great data analysts do (link).

I have all these as tabs in my browser at the moment. They joined the tab holding David Snowden and Mary Boone’s (2007) discussion of a leader’s framework for decision-making (link).

These five connections make for fascinating reading. A good starting point, I think, is David and Mary’s visualisation that forms the reference point for the application of the Cynefin framework:

They observe “the Cynefin framework helps leaders determine the prevailing operative context so that they can make appropriate choices”.

The 2007 visualisation was modified in 2014 when ‘simple’ became ‘obvious’ (link). Disorder is in the centre of the diagram wherein there is no clarity about which of the other domains apply:

In a book chapter published in the year 2000 (link), David notes “the Cynefin model focuses on the location of knowledge in an organization using cultural and sense making …”. Laura, Dan and Cassie provide excellent examples of this sense-making in their own cultural contexts.

Many of my colleagues in sport will appreciate this slide from Dan’s presentation that exhorts us “to adopt a systematic process to reduce data by understanding the similarity and uniqueness of the multiple measures we collect”:

… whilst being very clear about the time constraints to share the outcomes of this process with coaches.

Photo Credit

Arboretum – Bonsai (Meg Rutherford, CC BY 2.0)

Sharing insights and decision-making experiences

Braidwood is my St Mary Mead and Lake Wobegon.

It is a place where I can ponder events way beyond this small rural New South Wales town and connect with them through events in the town.

This weekend, the Braidwood Festival has been helping me reflect on thoughts about insights and decision-making shared by Jacquie Tran.

Jacquie’s presentation, From insights to decisions: Knowledge sharing in sports analytics, has stimulated lots of interest and conversations. One of the observations Jacquie has made is:

Enter Braidwood into this conversation.

This weekend, the Festival of Braidwood has included an airing of quilts, an Art on the Farms exhibition, and open gardens. All of these have a synchronicity with Jacquie’s discussion. I have two examples from the weekend to illustrate the points Jacquie is making.

The first is from one of the exhibits, an upholstered chair by Heidi Horwood.

In the exhibition catalogue, Heidi writes:

The chair was found in a shed on a farm in Braidwood in a state of considerable disrepair. Many of the fabrics that make the patchwork in this project are very old and sourced in Braidwood. … I love the sense of history in old chairs and imagine the comfort they have brought.

The second is from the Linden Garden at Jembaicumbene. The gardeners there have transformed the garden in five years. They have planted trees and herbaceous borders, and found ways to manage limited resources in a windswept location.

The garden draws inspiration from landscape designer Nicole de Vesian, who at 70 translated her experience as a designer for Hermès into her garden, La Louve, in Provence.

I hope both examples add to the conversation Jacquie has started about insights. In both of them there is a bisociation occurring. Arthur Koestler said of bisociation “The discoveries of yesterday are the truisms of tomorrow, because we can add to our knowledge but cannot subtract from it.”

Having a sense of who we were and who we are gives us opportunities to consider how we will be. I see this as a profoundly shared experience.

I wonder what you think.

Photo Credits

Braidwood (Jack Featherstone)

Jack Bourke shearing (Katie Lyons, Art on Farms)

Chair (Heidi Horwood)

Linden Garden (Braidwood Open Gardens)

Bedervale (Keith Lyons, CC BY 4.0)