In my previous post, I briefly talked about probability density functions, and I’d like to discuss them in more detail today. Probability distribution functions appear often in the robotics literature because our measurements and knowledge are never perfect. You’ll come across many different probability distributions. I can’t possibly cover every distribution you’ll see, but I’d like to give a high-level understanding of what they are.

Probability distribution functions describe how probabilities are distributed over the values of a random variable, say x. Depending on whether this random variable is continuous or discrete, we use different functions: *probability density functions* and *probability mass functions*.

**Probability Density Function**

If the random variable x is continuous, one can use a *probability density function (pdf)*. A pdf p(x) must be non-negative everywhere and integrate to one:

\begin{aligned}p(x) &\ge 0 \quad \text{and} \quad \int_{-\infty}^{\infty} p(x) \, dx = 1 \end{aligned}

A pdf can take any shape as long as the above properties are met. Let’s look at a few examples.

A probability density function that looks like a step function: it is non-zero on [-0.5, 0.5] and zero everywhere else, that is, on (-∞, -0.5) and (0.5, ∞).

The area under the curve, shaded in red, integrates to 1.0, which satisfies the properties of a probability density function.
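To make the two pdf properties concrete, here is a minimal sketch that checks them numerically for the step-shaped pdf. It assumes the pdf is constant on [-0.5, 0.5], which forces its height to be 1.0; the names `uniform_pdf` and `integrate` are my own, chosen for illustration.

```python
def uniform_pdf(x, low=-0.5, high=0.5):
    """Step-shaped pdf: constant on [low, high], zero everywhere else."""
    if low <= x <= high:
        return 1.0 / (high - low)
    return 0.0


def integrate(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


# Integrating over [-2, 2] is enough because the pdf is zero outside [-0.5, 0.5].
area = integrate(uniform_pdf, -2.0, 2.0)
print(f"area under the pdf: {area:.4f}")
```

The midpoint rule is only one simple choice of numerical integration; any other quadrature scheme would serve the same purpose here.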

The Gaussian, or normal, distribution is another example. This graph shows the pdf of a normal distribution with a mean of zero and a standard deviation of 0.2: \mathcal{N}(x; 0, 0.2^2). Note that the regions (-\infty, -1.5] and [1.5, \infty) have non-zero but negligible values. A normal distribution places about 0.997 of its probability within three standard deviations of the mean; here, \mathrm{Pr}(-0.6 \le x \le 0.6) \approx 0.997.
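The three-standard-deviation figure can be checked with the error function from Python's standard library. This is a small sketch for the \mathcal{N}(x; 0, 0.2^2) example; the helper names `normal_pdf` and `normal_cdf` are my own.

```python
import math


def normal_pdf(x, mean=0.0, std=0.2):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mean=0.0, std=0.2):
    """Probability that the random variable is less than or equal to x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


# Probability mass within three standard deviations (3 * 0.2 = 0.6) of the mean.
prob_3sigma = normal_cdf(0.6) - normal_cdf(-0.6)

# Density far out in the tail: non-zero, but negligible.
tail_density = normal_pdf(1.5)

print(f"Pr(-0.6 <= x <= 0.6) = {prob_3sigma:.4f}")
print(f"p(1.5) = {tail_density:.3e}")
```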

What about this pdf on the left? Again, assume that the regions not shown, (-\infty, -2.0] and [2.0, \infty), have pdf values of zero. It has many peaks here and there. I’m not sure how to represent this density function in closed form, or whether that is even possible.

This pdf does satisfy the conditions mentioned above: all values are non-negative, and they integrate to 1.0. It is important to remember that not every pdf can be expressed easily in mathematical form. Some can (hopefully) be expressed in closed form, such as the marginal distribution of a product of different pdfs, although the resulting equation can be very lengthy. But sometimes you just have to work with an arbitrary pdf, in which case you’ll have to resort to numerical computation.
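As a rough sketch of what working numerically with an arbitrary pdf can look like, the snippet below tabulates a multi-peaked shape on a grid over [-2, 2], normalizes it so that it integrates to one, and then computes an interval probability. The function `bumpy` is a made-up shape for illustration only, not the curve from the figure, and the grid size is an arbitrary choice.

```python
import math


def bumpy(x):
    """Hypothetical non-negative, multi-peaked shape; not yet normalized."""
    return (1.0 + math.sin(5.0 * x)) * math.exp(-x * x)


def trapezoid(ys, dx):
    """Trapezoidal rule for equally spaced samples."""
    return dx * (sum(ys) - 0.5 * (ys[0] + ys[-1]))


# Tabulate the shape on [-2, 2]; the pdf is taken to be zero outside this range.
n = 4001
dx = 4.0 / (n - 1)
xs = [-2.0 + i * dx for i in range(n)]
raw = [bumpy(x) for x in xs]

# Divide by the total area so that the tabulated values integrate to one.
total = trapezoid(raw, dx)
pdf = [y / total for y in raw]
area = trapezoid(pdf, dx)


def index_of(x):
    """Grid index of the sample closest to x."""
    return round((x + 2.0) / dx)


# Pr(0 <= x <= 1): integrate the tabulated pdf over that slice of the grid.
prob = trapezoid(pdf[index_of(0.0):index_of(1.0) + 1], dx)

print(f"area = {area:.6f}, Pr(0 <= x <= 1) = {prob:.4f}")
```

A finer grid gives a better approximation at the cost of more computation.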

**Probability Mass Function**

If the random variable is discrete, one can use a *probability mass function (pmf)*.

A discrete random variable has non-zero probability only at certain exact values. Everywhere else, the probability is zero.

An example is a (fair) coin toss: there are only two discrete outcomes, {head, tail}. If we set a random variable to be 1 for head and 0 for tail, we get a pmf where p(0) = 0.5, p(1) = 0.5, and the probability of every other value is zero.

Here is another example of a pmf. The following probabilities are assigned:

\begin{aligned}p(-1.46) &= 0.25 \\p(-1.10) &= 0.35 \\p(-0.64) &= 0.05 \\p(0.02) &= 0.15 \\p(0.90) &= 0.10 \\p(1.60) &= 0.10\end{aligned}

These probabilities add up to 1.0. The probability of any value other than \{-1.46, -1.10, -0.64, 0.02, 0.90, 1.60\} is zero.
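A pmf like this one maps naturally onto a lookup table. The sketch below stores the values listed above in a dictionary; the function name `p` simply mirrors the notation, and the "probability of a negative value" query is an extra illustration of my own.

```python
import math

# The pmf from the example: value -> probability.
pmf = {
    -1.46: 0.25,
    -1.10: 0.35,
    -0.64: 0.05,
    0.02: 0.15,
    0.90: 0.10,
    1.60: 0.10,
}


def p(x):
    """Probability that the random variable equals x; zero for unlisted values."""
    return pmf.get(x, 0.0)


# For a pmf, the normalization condition is a sum rather than an integral.
total = sum(pmf.values())

# Probability of an event is the sum over the values that belong to it.
prob_negative = sum(prob for value, prob in pmf.items() if value < 0)

print(math.isclose(total, 1.0), p(0.5), round(prob_negative, 2))
```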

Another popular example of a pmf is the Bernoulli distribution.

\mathrm{Ber}(x; p) = \begin{cases}p, & \text{if } x = 1 \\ 1-p, & \text{if } x = 0 \end{cases}

You can think of the Bernoulli distribution as a weighted coin. Unlike the fair coin example above, one side has a higher probability of showing up (for example, a 0.7 probability of showing head). A fair coin is the special case with p = 0.5, that is, \mathrm{Ber}(x; 0.5).
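Here is a small sketch of the weighted coin in code: one function evaluates the Bernoulli pmf and another draws samples from it. The function names, the fixed seed, and the number of draws are arbitrary choices for illustration.

```python
import random


def bernoulli_pmf(x, p):
    """Probability of outcome x (1 = head, 0 = tail) for a coin with head probability p."""
    if x == 1:
        return p
    if x == 0:
        return 1.0 - p
    return 0.0  # any other value has zero probability


def sample_bernoulli(p, rng):
    """Draw one outcome: 1 with probability p, otherwise 0."""
    return 1 if rng.random() < p else 0


rng = random.Random(0)  # fixed seed so that repeated runs give the same draws
draws = [sample_bernoulli(0.7, rng) for _ in range(10_000)]
head_fraction = sum(draws) / len(draws)

print(f"fraction of heads: {head_fraction:.3f}")
```

With many draws, the fraction of heads should settle close to 0.7; setting p to 0.5 recovers the fair coin.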

I hope this information is helpful and helps you understand those cutting-edge algorithms written in terms of probabilities 🙂