Probability and Sensemaking

12/31/2011

This post evolved out of two links I received yesterday.

The first came from Stephen Downes’ discussion of evidence-based decision-making. The second came from an IBM post, The Next 5 in 5.

Both links extended my interest in a probabilistic approach to decision-making.

Stephen linked to S L Zabell’s discussion of Rudolf Carnap’s Logic of Inductive Inference. I noted Zabell’s observation that:

Carnap’s interest in logical probability was primarily as a tool, a tool to be used in understanding the quantitative confirmation of an hypothesis based on evidence and, more generally, in rational decision making.

The paper is a detailed mathematical account of Carnap’s work and is way beyond my allocation of logical-mathematical intelligence. However, I did note:

Carnap recognized the limited utility of the inductive inferences that the continuum of inductive methods provided, and sought to extend his analysis to the case of analogical inductive inference: an observation of a given type makes more probable not merely observations of the exact same type but also observations of a “similar” type. The challenge lies both in making precise the meaning of “similar”, and in being able to then derive the corresponding continua (p.294).

in the application of inductive logic to a given knowledge situation, the total evidence available must be taken as basis for determining the degree of confirmation (p.298).

Carnap advanced his view of “the problem of probability”. Noting a “bewildering multiplicity” of theories that had been advanced over the course of more than two and a half centuries, Carnap suggested one had to carefully steer between the Scylla and Charybdis of assuming either too few or too many underlying explicanda, and settled on just two. These two underlying concepts Carnap called probability₁ and probability₂: degree of confirmation versus relative frequency in the long run (p.300).

Zabell concludes that “Carnap’s most lasting influence was more subtle but also more important: he largely shaped the way current philosophy views the nature and role of probability, in particular its widespread acceptance of the Bayesian paradigm” (p.305).
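
The contrast between degree of confirmation and relative frequency can be made concrete with a small sketch. It assumes the form in which Carnap’s continuum of inductive methods is usually presented, (n_i + λ/k) / (n + λ), where n_i is the number of observations of a given type, n the total number of observations, k the number of possible types and λ a weighting parameter. The win/loss record and the function names are hypothetical and are not taken from Zabell’s paper.

```python
from fractions import Fraction


def relative_frequency(count, total):
    """Observed share of outcomes of one type (undefined when total is 0)."""
    return Fraction(count, total)


def degree_of_confirmation(count, total, num_types, lam):
    """Lambda-continuum estimate: (n_i + lam/k) / (n + lam).

    lam = 0 reproduces the observed relative frequency; larger values of lam
    pull the estimate towards the uniform prior 1/k.
    """
    return (Fraction(count) + Fraction(lam, num_types)) / (total + lam)


# Hypothetical record: 7 wins in 10 matches, two outcome types (win / not win).
wins, matches, k = 7, 10, 2
freq = relative_frequency(wins, matches)
conf = degree_of_confirmation(wins, matches, k, lam=2)
print(freq, conf)
```

With λ = 2 and k = 2 the estimate is (7 + 1) / (10 + 2) = 2/3, slightly below the observed 7/10. With no evidence at all it falls back to 1/2, whereas the relative frequency is undefined. How much evidence is needed before the two agree closely depends on the choice of λ.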

Stephen Downes cited Zabell in a discussion of an evidence-based approach to demonstrating value in the future of libraries. In this discussion Brian Kelly argues that:

We do need to continue to gather evidence of the value of services, and not just library services.  But we need to understand that the evidence will not necessarily justify a continuation of established approaches to providing services.  And if evidence is found which supports the view that libraries will be extinct by 2020 then the implications need to be openly and widely discussed.

I picked up on Brian’s assertion that “the evidence will not necessarily justify a continuation of established approaches to providing services” and pondered the implications of this for my use of probabilistic approaches to sport performance. Rudolf Carnap’s work, via S L Zabell’s synthesis, has pushed me further to consider how much evidence I need to have to support an approach to performance in sport that can describe, predict, model and transform. I am going to follow up on Brian Skyrms and Jack Good too, particularly in relation to ‘total evidence’!

By coincidence, just as I was coming up for air after an early morning visit to induction and evidence, I received an Economist alert to IBM’s 5 in 5 predictions. (At the end of each year, IBM examines market and societal trends expected to transform our lives, as well as emerging technologies from IBM’s global labs, to develop a multi-year forecast.)

The fifth item in the list is “Big Data & sensemaking engines start feeling like best friends”. Jeff Jonas writes “Here at IBM we are working on sensemaking technologies where the data finds the data, and relevance finds you. Drawing on data points you approve (your calendar, routes you drive, etc.), predictions from such technology will seem ingenious.”

He adds that:

This new era of Big Data is going to materially change what is possible in terms of prediction. Much like the difference between a big pile of puzzle pieces versus, the pieces already in some state of assembly – the latter required to reveal pictures. This is information in context, and while some pieces may be missing; some may be duplicates; others have errors; and a few are professionally fabricated lies – nonetheless, what can be known only emerges as the pieces come together (data finding data).

Earlier in the year Jeff wrote about massively scalable sensemaking analytics. His post has links to his other writing in this area:

Sensemaking Systems Must be Expert Counting Systems, Data Finds Data, Context Accumulation, Sequence Neutrality and Information Colocation to new techniques to harness the Big Data/New Physics phenomenon.

Jeff’s G2 system has seven Privacy by Design (PbD) features: Full Attribution; Data Tethering; Analytics in the Anonymized Data Space; Tamper-Resistant Audit Logs; False Negative Favoring Methods; Self-Correcting False Positives; Information Transfer Accounting.

In 2010 Jeff discussed sensemaking and observed that:

Man continues to chase the notion that systems should be capable of digesting daunting volumes of data and making sufficient sense of this data such that novel, specific, and accurate insight can be derived without direct human involvement.

I found his explanation of the role of expert counting in sensemaking very helpful. I am disappointed that I have not accessed Jeff’s work until now.
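
The counting idea can be illustrated with a toy example: before records can be counted, something has to decide which of them refer to the same entity. The sketch below is an assumption-laden illustration, not Jeff’s G2 system or any IBM method. It treats two records as the same entity whenever they share an identifier, and merges them with a simple union-find structure. The records, identifiers and the function name `count_entities` are all hypothetical.

```python
def count_entities(records):
    """Count distinct entities, treating records that share any identifier as one."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the group representative, shortening the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    seen = {}  # identifier -> index of the first record that carried it
    for idx, identifiers in enumerate(records):
        for ident in identifiers:
            if ident in seen:
                # Shared identifier: merge this record's group with the earlier one.
                parent[find(idx)] = find(seen[ident])
            else:
                seen[ident] = idx
    return len({find(i) for i in range(len(records))})


# Hypothetical records: the first three are linked through shared identifiers.
records = [
    {"email:pat@example.org", "phone:555-0100"},
    {"phone:555-0100", "name:P. Smith"},
    {"name:P. Smith", "email:psmith@example.org"},
    {"email:lee@example.org"},
]
print(count_entities(records))
```

Four records arrive, but only two entities are counted, because the second record links the first and the third. In this sketch the count is the same whatever order the records arrive in. Matching on shared identifiers alone is far cruder than a real sensemaking system would need to be.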

It is fascinating that two early morning links can open up such a rich vein of discovery. At the moment I am particularly interested in how records can be used to inform decision making and what constitutes necessary and sufficient evidence to transform performance.

I have a lot of New Year reading to do!

Photo Credits

What’s That? (97)

Sensemaking

Computer Wire Art