Robin Poke’s PhD Submission: A Narrative History of Australian Rowing

Robin Poke holding his PhD Submission

Robin Poke submitted his PhD thesis for examination today at the University of Canberra. Titled A Narrative History of Australian Rowing 1770-2016, it is the culmination of six years of assiduous research.

I have been fortunate to be Robin’s primary supervisor.

I believe it to be a magnum opus in the history of rowing. It extends to two volumes and shares some remarkable primary sources to build the narrative.

The abstract is:

This thesis describes in detail the beginnings, development and progress of rowing in Australia through fifteen chapters that set out chronologically how the sport transitioned from the days of settlement, the early watermen, and to the 19th century and the onset of professional sculling. Then came, in the 20th century, the era of pure amateurism before, given the massive funding in contemporary sport, it reverted at the very least to the semi-professional level.

The initial chapters describe the early use of boats by settlers and the exploits of the earliest professional scullers, who captured the imagination not just of the citizens of New South Wales but of all the colonies. Then comes the rapid expansion of rowing and sculling at all levels: club, colonial and national, and the onset of the amateur ideology. The transition from inter-colonial to inter-state competition is described, as is the emergence of women’s rowing. Then comes Australia’s growing involvement at the international level between the two world wars. The retirement of professional sculler Bobby Pearce and the eventual decline of professional sculling are discussed.

A continuing swing away from amateurism towards at least semi-professionalism is seen. Also described is the improvement in the administration of national rowing, at the hands, initially, of John Coates, assisted by John Boultbee. Australia’s first professional Director of Coaching, Reinhold Batschi is introduced.

An extraordinary decade in the history of Australian rowing arrives, during which the sport experiences hitherto unforeseen success and at the end of which hosts an Olympic Regatta. At the heart of this success are the stunning results obtained by a crew that had become known as the Oarsome Foursome.

The period between the celebrating of a successful ‘home’ Olympic Games in 2000 and the London Olympic Games in 2012 is described. In the interim were the Athens 2004 and Beijing 2008 Games. The thesis ends with a discussion about Rowing Australia’s high performance plans for the future of rowing and contemplation about the process of writing a narrative history of rowing.

Robin at the Graduate Office at UC handing in his thesis

We await with great interest the external examiners’ responses in 2019.

Photo Credits

Robin Poke (Keith Lyons, CC BY 4.0)

Forecasting, predicting, classifying and uncertainty

This post shares my encounters with work by Glenn Brier, Frank Harrell and David Spiegelhalter. It coincides with an email exchange I had with Tony Corke about how to share posterior outcomes in the context of prior statements of probability.

I saw a reference to a Brier score in David Glidden’s (2018) post on forecasting American football results (link) and followed up David’s reference to a Brier score in Wikipedia (link). This encouraged me to seek out some of Glenn Brier’s papers to find the origin of the score named after him.

An early paper was written in 1944 for the United States Weather Bureau. It was titled Verification of a forecaster’s confidence and the use of probability statements in weather forecasting (link). In the introduction to that paper, Glenn observed “one of the factors that has contributed to the difficulties and controversies of forecast verification is the failure to distinguish carefully between the scientific and practical objectives of forecasting” (1944:1). He proposed that:

  • The value of forecasts can be enhanced by increased use of probability statements.
  • The verification problem can be simplified if forecasts are stated in terms of probabilities.

He added “there is an inherent danger in any forecast if the user does not make use of (or is not provided with) the pertinent information regarding the reliability of the forecast” (1944:7). The sharing of information about the reliability of the forecast makes it possible to provide recommendations for action. Glenn concluded “the forecaster’s duty ends with providing accurate and unbiased estimates of the probabilities of different weather situations” (1944:10).

In a paper written in 1950, Verification of forecasts expressed in terms of probability, Glenn provided more detail about his work on probability statements and presented the details of his verification formula (link). He proposed that “perfect forecasting is defined as correctly forecasting the event to occur with a probability of unity”, that is, with 100 percent confidence (1950:2).
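For readers who, like me, find a worked example helpful, here is a minimal sketch of the verification score as I understand it from the 1950 paper: the mean, over all forecast occasions, of the squared differences between the forecast probability for each class and what actually happened (1 for the class that occurred, 0 otherwise). The function name brier_score and the way the data are laid out are my own assumptions for illustration, not Glenn's notation. In this multi-category form a perfect set of forecasts scores 0 and the worst possible set scores 2.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and outcomes.

    forecasts: list of lists; each inner list holds the probabilities
               assigned to every possible class on one occasion.
    outcomes:  list of integers; the index of the class that occurred
               on each occasion.
    (Both names and this layout are hypothetical, chosen for illustration.)
    """
    if len(forecasts) != len(outcomes):
        raise ValueError("need one outcome per forecast")
    total = 0.0
    for probabilities, observed in zip(forecasts, outcomes):
        for index, probability in enumerate(probabilities):
            # The event indicator is 1 for the class that occurred, else 0.
            happened = 1.0 if index == observed else 0.0
            total += (probability - happened) ** 2
    return total / len(forecasts)


# Three rain / no-rain forecasts; rain (class 0) occurred on the first two days.
example = brier_score([[0.8, 0.2], [0.6, 0.4], [0.1, 0.9]], [0, 0, 1])
print(round(example, 4))
```

Many modern descriptions use a single-probability version for yes/no events, which runs from 0 to 1 instead, so it is worth checking which form a particular author has in mind.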

The 1950 paper raised the question of skill in forecasting. A decade later, Herbert Appleman (1959) initiated a discussion about how to quantify the skill of a forecaster (link). Glenn’s 1950 paper prompted Allan Murphy (1973), amongst others, to look closely at vector partitions in probability scores (link). Some time later, Tilmann Gneiting and Adrian Raftery (2007) considered scoring rules, prediction and estimation (link).

Frank Harrell fits into this kind of conversation in a thought-provoking way. Earlier this year he wrote to distinguish between classification and prediction (link). He proposes:

Whether engaging in credit risk scoring, weather forecasting, climate forecasting, marketing, diagnosis a patient’s disease, or estimating a patient’s prognosis, I do not want to use a classification method. I want risk estimates with credible intervals or confidence intervals. My opinion is that machine learning classifiers are best used in mechanistic high signal:noise ratio situations, and that probability models should be used in most other situations.

He concludes:

One of the key elements in choosing a method is having a sensitive accuracy scoring rule with the correct statistical properties. Experts in machine classification seldom have the background to understand this enormously important issue, and choosing an improper accuracy score such as proportion classified correctly will result in a bogus model.
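A small sketch helped me see what Frank means here. The numbers are invented purely for illustration: two hypothetical forecasters give probabilities for the same four yes/no events. If each probability is turned into a classification at a 0.5 cut-off (my assumption), both forecasters have the same proportion classified correctly, but the single-probability form of the Brier score separates the hesitant forecaster from the confident one.

```python
def proportion_correct(probabilities, outcomes, threshold=0.5):
    """Share of events classified correctly after cutting at the threshold."""
    hits = 0
    for probability, outcome in zip(probabilities, outcomes):
        predicted = 1 if probability >= threshold else 0
        if predicted == outcome:
            hits += 1
    return hits / len(outcomes)


def binary_brier(probabilities, outcomes):
    """Mean squared difference between each probability and the 0/1 outcome."""
    squared = [(p - o) ** 2 for p, o in zip(probabilities, outcomes)]
    return sum(squared) / len(squared)


# Invented data: 1 means the event happened, 0 means it did not.
outcomes = [1, 1, 1, 0]
hesitant = [0.51, 0.51, 0.51, 0.49]   # barely on the right side each time
confident = [0.90, 0.90, 0.90, 0.10]  # clearly on the right side each time

for name, probabilities in (("hesitant", hesitant), ("confident", confident)):
    print(name,
          proportion_correct(probabilities, outcomes),
          round(binary_brier(probabilities, outcomes), 4))
```

Both forecasters classify every event correctly, yet the confident forecaster has the much lower (better) Brier score. The proportion classified correctly discards exactly the information about reliability that Glenn wanted forecasters to share.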

There is a reference in Frank’s paper to risk (“by not thinking probabilistically, machine learning advocates frequently utilize classifiers instead of using risk prediction models”) and a link to a David Spiegelhalter paper written in 1986, Probabilistic prediction in patient management and clinical trials (link). In that paper, David argued for “the provision of accurate and useful probabilistic assessments of future events” as a fundamental task for biostatisticians when collaborating in clinical or experimental medicine. Thirty-two years later, David is Winton Professor for the Public Understanding of Risk (link).

In 2011, David and his colleagues (Mike Pearson and Ian Short) discussed visualising uncertainty about the future (link). They suggest that probabilities are best treated “as reasonable betting odds constructed from available knowledge and information”. They identified three key concepts that can be used for evaluating techniques to display probabilistic predictions:

  • Common sense and accompanying biases
  • Risk perception and the role of personality and numeracy
  • Type of graphic presentation used

David and his colleagues summarise their thoughts about visualising uncertainty in this box:

David returned to the theme of uncertainty in 2014 and suggested “it will be vital to understand and promote uncertainty through the appropriate use of statistical methods rooted in probability theory” (link). Much of David’s recent work has focussed on the communication of risk. At the Winton Centre:

All too often, numbers are used to try to bolster an argument or persuade people to make a decision one way or the other. We want to ensure that both risks and benefits of any decision are presented equally and fairly, and the numbers are made clear and put in an appropriate context. We are here ‘to inform and not persuade’. (Link)

All these thoughts were running through my head when I decided to contact Tony Corke. I admire his work immensely. A couple of rapid emails helped me with a Brier Score issue for priors and posteriors from my Women’s T20 cricket data. I am not blessed with mathematical intelligence and Tony was very reassuring.

I am now off to research Solomon Kullback and Richard Leibler who were writing a year after Glenn Brier’s 1950 paper (link).

Photo Credits

Foggy mountain by Eberhard Grossgasteiger on Unsplash

White clouds and blue sky by Paul Csogi on Unsplash

Sailing on the ocean by Andrew Neel on Unsplash

Making sense of data practices

Laura Ellis has been writing this week about solving business problems with data (link). The alert to her post came shortly after another link had taken me back to a presentation by Dan Weaving in 2017 on load monitoring in sport (link). A separate alert had drawn my attention to two Cassie Kozyrkov articles, one on hypotheses (link) and the second on what great data analysts do (link).

I have all these as tabs in my browser at the moment. They joined the tab holding David Snowden and Mary Boone’s (2007) discussion of a leader’s framework for decision-making (link).

These five connections make for fascinating reading. A good starting point, I think, is David and Mary’s visualisation that forms the reference point for the application of the Cynefin framework:

They observe “the Cynefin framework helps leaders determine the prevailing operative context so that they can make appropriate choices”.

The 2007 visualisation was modified in 2014 when ‘simple’ became ‘obvious’ (link). Disorder is in the centre of the diagram wherein there is no clarity about which of the other domains apply:

In a book chapter published in the year 2000 (link), David notes “the Cynefin model focuses on the location of knowledge in an organization using cultural and sense making …”. Laura, Dan and Cassie provide excellent examples of this sense-making in their own cultural contexts.

Many of my colleagues in sport will appreciate this slide from Dan’s presentation that exhorts us “to adopt a systematic process to reduce data by understanding the similarity and uniqueness of the multiple measures we collect”:

… whilst being very clear about the time constraints to share the outcomes of this process with coaches.

Photo Credit

Arboretum – Bonsai (Meg Rutherford, CC BY 2.0)