## What’s Wrong with Probability Notation?

October 22nd, 2009 by jose

Sometimes I wonder why many humans (me included) have trouble understanding probability. In cognitive science, probabilistic models are taking over most areas. Still, most people struggle with them. Could it be that the notation is just hard to swallow? “What’s Wrong with Probability Notation?” is a magnificent post that gives some basic reasons:

The first two issues arise in the usual expression of the first step of Bayes’s rule,

p(x|y) = p(y|x) p(x) / p(y),

where each of the four uses of p corresponds to a different probability function! In computer science, we’re used to using names to distinguish functions. So f(x) and f(y) are the same function applied to different arguments. In probability notation, p(x) and p(y) are different probability functions, picked out by their arguments.

This is one clear communication problem. Ideally, we want more people to follow probabilistic reasoning. Doctors, judges, and others all struggle significantly when given probabilities (see, e.g., Helping Doctors and Patients Make Sense of Health Statistics).

But how do we tackle this problem? Changing notation is easier said than done. In fact, anyone departing from traditional notation will have to convince reviewers that the new notation is better… and will run the added risk of making a less-than-ideal impression.

Any ideas?


October 22nd, 2009 at 5:01 pm

Fortunately, there is an easy answer to your question. In a phrase: “natural frequencies”. Gerd Gigerenzer wrote a book about them and about how to work through Bayesian problems very easily. I teach a large-section critical thinking course (based in philosophy) at the University of Missouri, and this is one of the things I teach students because, after all, every one of them is going to need Bayes to interpret some sort of medical emergency.

Here’s a link to the book: https://academicproductivity.com/2009/whats-wrong-with-probability-notation/
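For readers who have not met natural frequencies, here is a minimal Python sketch of the idea: restate the probabilities as head counts in an imagined population and read the answer off the counts. The function name `natural_frequencies` and all the numbers (1,000 people, 1% prevalence, 90% sensitivity, 10% false-positive rate) are hypothetical, picked only so the counts come out whole; they are not taken from the book.

```python
from fractions import Fraction


def natural_frequencies(population, prevalence, sensitivity, false_positive_rate):
    """Turn probabilities into head counts, then read off the posterior."""
    sick = population * prevalence
    healthy = population - sick
    sick_and_positive = sick * sensitivity
    healthy_and_positive = healthy * false_positive_rate
    # Of everyone who tests positive, what share is actually sick?
    posterior = sick_and_positive / (sick_and_positive + healthy_and_positive)
    return sick_and_positive, healthy_and_positive, posterior


# Hypothetical numbers, chosen only so the counts come out whole.
true_pos, false_pos, posterior = natural_frequencies(
    population=1000,
    prevalence=Fraction(1, 100),
    sensitivity=Fraction(9, 10),
    false_positive_rate=Fraction(1, 10),
)

print(f"{true_pos} of the {true_pos + false_pos} people who test positive are sick")
print(f"P(sick | positive) = {posterior} (about {float(posterior):.1%})")
```

With these made-up inputs, 9 sick people and 99 healthy people test positive, so only 9 of 108 positives (1 in 12) are actually sick. The count-based phrasing gets to the same number as Bayes’ rule without ever writing p(x|y).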

October 30th, 2009 at 2:24 am

Actually, it’s more complicated than the quote might lead you to believe, because x and y are random variables, each of which comes with its own probability distribution (a function). The p is more like an operator. For example, p(x+y) means the distribution of the sum of x and y, which isn’t trivial to compute (for independent x and y, it’s a convolution product).
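To make the convolution remark concrete, here is a small Python sketch for two discrete random variables. It assumes x and y are independent (an assumption the comment leaves implicit), and the helper name `pmf_of_sum` and the two-dice example are illustrative choices, not part of the original discussion.

```python
from collections import defaultdict
from fractions import Fraction


def pmf_of_sum(pmf_x, pmf_y):
    """PMF of x + y for independent x and y: a discrete convolution."""
    result = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for x, px in pmf_x.items():
        for y, py in pmf_y.items():
            # Every pair (x, y) contributes its joint probability to the sum x + y.
            result[x + y] += px * py
    return dict(result)


# A fair six-sided die, as an illustrative distribution.
die = {face: Fraction(1, 6) for face in range(1, 7)}
two_dice = pmf_of_sum(die, die)

print(two_dice[7])   # probability that two dice sum to 7
print(two_dice[2])   # probability that two dice sum to 2
```

Note that `p(x+y)` here is a genuinely new function built from two others, which is the sense in which p behaves like an operator rather than a single named function.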

There are easier ways to visualize Bayes’ rule, though. I use tree diagrams when I teach, and students find them intuitive.

February 14th, 2010 at 12:01 pm

This is just standard shorthand probability notation. I.e.,

p(x) is really p(X=x), which is really p_X(X=x), so writing Bayes’ rule in this complete form we’d get

p_{X|Y}(X=x | Y=y) = p_{Y|X}(Y=y|X=x)p_X(X=x)/p_Y(Y=y)

As in all mathematics, when you use something a lot, you make a shorthand.
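A programmer’s version of the fully subscripted form might look like the Python sketch below, where each of the four functions gets its own name instead of sharing the letter p. The joint table and its values are hypothetical, and the function names simply mirror the subscripts p_X, p_Y, p_{Y|X} and p_{X|Y}.

```python
from fractions import Fraction

# Hypothetical joint distribution over (x, y); values are illustrative only.
joint = {
    ("rain", "wet"): Fraction(3, 10),
    ("rain", "dry"): Fraction(1, 10),
    ("sun", "wet"): Fraction(1, 10),
    ("sun", "dry"): Fraction(5, 10),
}


def p_X(x):
    """Marginal of X: sum the joint over all values of y."""
    return sum(p for (xi, _), p in joint.items() if xi == x)


def p_Y(y):
    """Marginal of Y: sum the joint over all values of x."""
    return sum(p for (_, yi), p in joint.items() if yi == y)


def p_Y_given_X(y, x):
    """Conditional of Y given X, straight from the joint table."""
    return joint[(x, y)] / p_X(x)


def p_X_given_Y(x, y):
    """Bayes' rule, with each of the four functions named separately."""
    return p_Y_given_X(y, x) * p_X(x) / p_Y(y)


print(p_X_given_Y("rain", "wet"))
```

Written this way, it is obvious that four different functions are involved, at the cost of exactly the verbosity the shorthand is meant to avoid.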