Donald Knuth, basketball and computers in sport

Introduction

Donald Knuth received a scholarship in 1956 to attend the Case Institute of Technology in Cleveland, Ohio. During his time at the Institute, Donald was manager of the basketball team. His interests in observation, notation and computer programming were recorded in an IBM documentary.

Synergy

At Case, Donald combined his interest in computers with his involvement in basketball. He had managed his high school basketball team and took on the same role at Case during his undergraduate years (1956-1960).

The Computer History Museum’s biography of Donald includes this paragraph:

Knuth’s lifelong love affair with computers began as an undergraduate when he discovered the IBM 650 computer system at Case. He quickly mastered the inner workings of the machine and developed a novel program to automate coaching of the school’s basketball team, earning him an appearance on the CBS Evening News with Walter Cronkite.

This novel program is reported in a variety of contexts … all of which I missed in my research into computers in sport.

Donald’s Basketball System

The Internet Archive has a copy of Donald’s chapter 23 in Selected Papers on Fun and Games. In it he wrote:

In high school I’d come up with a general rule of thumb that said, “Possession of the ball is worth roughly one point, except near the end of a period.” In other words, if you enter the stadium at a time when your team is leading by a score of 50-49, the effective score is really 51-49 if your team has the ball, but the game is basically tied if the other guys have possession. A corollary of this rule is that field goals don’t really change the effective score! One team gains 2 points, but loses possession, while the opponents gain possession. The score really changes when there’s a turnover, or when a free throw is made, or at the very end.

Of course I knew that this rule of thumb was only a rough approximation; maybe possession was worth only .8 of a point, say. Even so, the person who steals the ball should be rewarded more than the person who makes baskets, contrary to the normal way that players get credit for their contributions.
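The rule of thumb is easy to check with a few lines of code. What follows is my own minimal sketch, not anything taken from Donald's program: the function names and the `possession_value` parameter are invented for illustration, and the rule's exception near the end of a period is ignored.

```python
def effective_scores(own, opp, own_has_ball, possession_value=1.0):
    """Credit the value of possession to whichever team holds the ball."""
    if own_has_ball:
        return own + possession_value, opp
    return own, opp + possession_value


def effective_margin(own, opp, own_has_ball, possession_value=1.0):
    """Effective lead of 'own' over 'opp' once possession is counted."""
    eff_own, eff_opp = effective_scores(own, opp, own_has_ball, possession_value)
    return eff_own - eff_opp


# The 50-49 example from the quotation, with possession worth one point.
print(effective_scores(50, 49, own_has_ball=True))
print(effective_scores(50, 49, own_has_ball=False))

# The corollary: a field goal adds 2 points but hands over the ball,
# so the effective margin before and after the basket is the same.
before = effective_margin(50, 49, own_has_ball=True)
after = effective_margin(52, 49, own_has_ball=False)
print(before, after)
```

Passing `possession_value=0.8` shows why the rule is only approximate: with that value a field goal does shift the effective margin a little.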

At Case, “I finally had an opportunity to test these hunches in a quantitative way, and the computer program I wrote was based on those informal notions about possession. To everyone’s surprise, including my own, the system turned out to be quite successful.”

Chapter 23 provides a detailed account of Donald’s observation system. The records were kept in an “electronic computer”.

The statistics at each basketball game can be taken by two men: a recorder and a spotter. After the game it takes approximately 30 minutes to prepare the necessary totals from the game sheets and about three minutes to punch the IBM cards. One IBM card is made for each Case player who participated in the game, plus a card each for Case and the opponents. Then the machine takes 1.5 minutes to process the game: 30 seconds to take in the “program” of instructions for calculation, 30 seconds to take in the statistical data from all the previous games, and 30 seconds to take in the statistics from this game and to punch the answers. The computer punches four cards for each player, two indicating his performance in this particular game and two containing his cumulative record to date. The cards can easily be printed up for reference and can be filed neatly. Any desired set of statistics can quickly be found from them by passing them through a sorter.
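The card counts and timings in this passage can be added up as a rough workload estimate. This is only a sketch under my own assumptions: the function name and the squad size of ten are invented, and I have assumed the four output cards per player apply to players only, since the quotation does not say whether the two team cards also produce output.

```python
def cards_for_game(case_players):
    """Return (input_cards, output_cards) for one game.

    Input: one card per participating Case player, plus one card each
    for the Case team and for the opponents.
    Output: four cards per player (two for this game, two cumulative).
    """
    input_cards = case_players + 2
    output_cards = 4 * case_players
    return input_cards, output_cards


# Timings quoted in the passage, in minutes.
PREPARE_TOTALS = 30.0           # totals from the game sheets
PUNCH_CARDS = 3.0               # punching the IBM cards
MACHINE_RUN = (30 + 30 + 30) / 60  # program, previous data, this game

total_minutes = PREPARE_TOTALS + PUNCH_CARDS + MACHINE_RUN

# Hypothetical squad of ten players who took part in the game.
print(cards_for_game(10))
print(total_minutes)
```

On these figures the machine run is a small part of the post-game work; almost all of the time goes into preparing totals by hand.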

Donald’s real-time notation recorded:

    • Field goals attempted and made (divided into short, medium, and long range).
    • Total free throws attempted and made.
    • Last free throws of a set, made and missed.
    • Total fouls and offensive fouls.
    • Rebounds, defensive and offensive.
    • Violations of rules causing loss of ball.
    • Assists.
    • Loss of ball by fumble, bad pass, or jump ball.
    • Gain of ball by interception or jump ball.
    • Defensive mistakes — allowing opponent to score a field goal.
    • Minutes played.

Donald recorded these data in real time on pre-printed observation sheets. He collected all eleven elements of these data for each Case player and the first six elements for the opposing team.
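To make the shape of the records concrete, here is a small sketch of how a stream of observed events could be tallied into per-player totals. It is not a reconstruction of Donald's sheets or punched-card layout. The event format and the `tally` function are my own invention, the abbreviations are the ones used in his chapter, and I have left out minutes played and the short, medium and long range split for field goals. Each event increments exactly one counter, so a made shot would be logged as both an attempt and a make.

```python
from collections import Counter, defaultdict

# First six elements of the list: recorded for Case and for the opponents.
OPPONENT_CODES = {"FGA", "FGM", "FTA", "FTM", "LFTI", "LFTO",
                  "TF", "OF", "OR", "DR", "VIOL"}
# Remaining countable elements: recorded for Case players only.
CASE_ONLY_CODES = {"AST", "FUM", "BP", "JL", "JG", "INT", "DM"}
CASE_CODES = OPPONENT_CODES | CASE_ONLY_CODES


def tally(events):
    """Turn (who, code) events into a Counter of totals per participant.

    'who' is a Case player's name, or the string "opponents" for the
    opposing team as a whole (a hypothetical convention).
    """
    totals = defaultdict(Counter)
    for who, code in events:
        allowed = OPPONENT_CODES if who == "opponents" else CASE_CODES
        if code not in allowed:
            raise ValueError(f"{code!r} is not recorded for {who!r}")
        totals[who][code] += 1
    return totals


# Invented sequence of events for illustration only.
events = [
    ("Smith", "INT"), ("Smith", "FGA"), ("Smith", "FGM"),
    ("opponents", "FGA"), ("Jones", "DR"), ("Jones", "AST"),
]
totals = tally(events)
print(dict(totals["Smith"]))
```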

His program gave each Case player a personal score. Donald noted:

The computer calculates the “true point contribution” of a player by using a rather complicated formula. Using the abbreviations FGA (field goals attempted), FGM (field goals made), FTA (free throws attempted), FTM (free throws made), LFTI (last free throws made), LFTO (last free throws missed), TF (total fouls committed), OF (offensive fouls), OR (offensive rebounds), DR (defensive rebounds), VIOL (violations), AST (assists), FUM (fumbles), BP (bad passes), JL (jump balls lost), JG (jump balls gained), INT (interceptions), DM (defensive mistakes), the player’s “point contribution” rating is:

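\[
\begin{aligned}
\mathrm{PC} ={}& 2\,\mathrm{FGM} + \mathrm{FTM} + 2(\mathrm{AST} - \mathrm{DM}) \\
&- \alpha\,(\mathrm{VIOL} + \mathrm{FUM} + \mathrm{JL} + \mathrm{BP} + \mathrm{AST} + \mathrm{FGM} + \mathrm{LFTI}) \\
&+ \beta\,(\mathrm{INT} + \mathrm{JG} + \mathrm{OR} + \mathrm{DR} + \mathrm{DM} - \mathrm{OF}) \\
&- \gamma\,(\mathrm{TF}) - \delta\,(\mathrm{FGA} - \mathrm{FGM} + \mathrm{LFTO}),
\end{aligned}
\]

where $\alpha$, $\beta$, $\gamma$, and $\delta$ are weighting coefficients determined by team totals. In the formulas for these coefficients, small letters indicate opponents’ totals and capital letters denote Case totals:

\[
\begin{aligned}
\alpha &= \frac{2\,\mathrm{fgm}}{\mathrm{fga} + \mathrm{viol} + \mathrm{of} - \mathrm{or} + \mathrm{INT} + \mathrm{JG} + \mathrm{TF} - \mathrm{OF}}; \\
\beta &= \frac{2\,\mathrm{FGM}}{\mathrm{FGA} + \mathrm{VIOL} + \mathrm{OF} - \mathrm{OR} + \mathrm{FUM} + \mathrm{JL} + \mathrm{BP} + \mathrm{tf} - \mathrm{of}}; \\
\gamma &= \frac{\mathrm{ftm} - \beta\left(\mathrm{lfti} + \mathrm{lfto} \times \dfrac{\mathrm{DR}}{\mathrm{or} + \mathrm{DR}}\right)}{\mathrm{TF}}; \\
\delta &= \alpha \times \frac{\mathrm{dr}}{\mathrm{OR} + \mathrm{dr}}.
\end{aligned}
\]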
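Since the original program is lost, the following is only my own transcription of the published formula into Python, not Donald's code. The four weighting coefficients are named alpha, beta, gamma and delta here. The dictionary layout, the function names and every number in the example are invented, and the sketch assumes none of the denominators is zero.

```python
def coefficients(case, opp):
    """Weighting coefficients from team totals.

    'case' and 'opp' are dicts of totals keyed by the abbreviations in
    the chapter (capital letters = Case, small letters = opponents).
    """
    alpha = 2 * opp["FGM"] / (
        opp["FGA"] + opp["VIOL"] + opp["OF"] - opp["OR"]
        + case["INT"] + case["JG"] + case["TF"] - case["OF"])
    beta = 2 * case["FGM"] / (
        case["FGA"] + case["VIOL"] + case["OF"] - case["OR"]
        + case["FUM"] + case["JL"] + case["BP"] + opp["TF"] - opp["OF"])
    gamma = (opp["FTM"] - beta * (
        opp["LFTI"] + opp["LFTO"] * case["DR"] / (opp["OR"] + case["DR"])
    )) / case["TF"]
    delta = alpha * opp["DR"] / (case["OR"] + opp["DR"])
    return alpha, beta, gamma, delta


def point_contribution(p, alpha, beta, gamma, delta):
    """'True point contribution' (PC) for one player's totals 'p'."""
    return (2 * p["FGM"] + p["FTM"] + 2 * (p["AST"] - p["DM"])
            - alpha * (p["VIOL"] + p["FUM"] + p["JL"] + p["BP"]
                       + p["AST"] + p["FGM"] + p["LFTI"])
            + beta * (p["INT"] + p["JG"] + p["OR"] + p["DR"]
                      + p["DM"] - p["OF"])
            - gamma * p["TF"]
            - delta * (p["FGA"] - p["FGM"] + p["LFTO"]))


# Invented totals, chosen only so the arithmetic is easy to follow.
case_totals = {"FGM": 25, "FGA": 60, "VIOL": 5, "OF": 4, "OR": 10, "DR": 21,
               "FUM": 6, "JL": 3, "BP": 7, "INT": 10, "JG": 4, "TF": 20}
opp_totals = {"FGM": 20, "FGA": 50, "VIOL": 5, "OF": 2, "OR": 7, "DR": 30,
              "FTM": 15, "LFTI": 7, "LFTO": 4, "TF": 27}
player = {"FGM": 5, "FGA": 12, "FTM": 3, "AST": 4, "DM": 2, "VIOL": 1,
          "FUM": 1, "JL": 0, "BP": 2, "LFTI": 2, "LFTO": 1, "INT": 2,
          "JG": 1, "OR": 3, "DR": 5, "OF": 1, "TF": 4}

alpha, beta, gamma, delta = coefficients(case_totals, opp_totals)
pc = point_contribution(player, alpha, beta, gamma, delta)
print(alpha, beta, gamma, delta, pc)
```

With these made-up totals the coefficients come out at 0.5, 0.5, 0.5 and 0.375, so this player's 13 points scored (five field goals and three free throws) become a rating of 10.5 once turnovers, rebounds, fouls and missed shots are weighed in.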

The Archive record includes a copy of Donald’s mimeographed form.

In an afterthought, Donald said of his program:

Alas, I have been unable to find any copies of the original program, nor do I recall who carried on with it after I left for graduate school.

My formula for PC should not be taken too seriously. I kept fiddling with it, and never really believed that it was rigorously correct. This work was done long before I had ever heard of Markov processes.

I communicated details of this work to some people at Marquette University in the early 1960s. But it has almost surely had no influence on subsequent applications of computers to sports, except perhaps to stimulate others to do better.

By 1995, professional basketball teams were using computers routinely … (My emphasis)

Analysing Performance

Donald started his basketball analysis at Case fourteen years after Lloyd Messersmith’s dissertation was submitted for examination. In the intervening years there had been significant breakthroughs in computational methods.

A decade after Donald’s experiences at Case, another computer scientist, Anatolij Zelentsov, was exploring the use of computers in the analysis of association football. This was another iteration in the computerised notation and analysis of performance, and it provided a ‘functional readiness’ measure for each player coached by Valerij Lobanovs’kyj at Dynamo Kiev.

We are fortunate that we have Donald’s account of his work. You might find Chapter 11 of Donald’s interview on the Web of Stories of particular interest. In it he describes in detail his work at Case.

I found it fascinating to listen to his reflections on his early days in computing, after a lifetime of engagement in computer science and as the author of The Art of Computer Programming.

I am profoundly sorry it has taken me so long to write about his work. I hope it adds to our knowledge of our origins.