Okay. For now, this is how this blog will work. With my imperfect knowledge of the material, writing organized teaching material will honestly be difficult; otherwise I’d be writing a book 😛 Instead, I’ll post small bits of whatever comes to mind, and when these bits become dense enough to cover a solid topic, I will probably turn them into a proper page in an organized format. Until then, here it goes.

**Bayes rule**. In a mathematical expression, or formula, it simply states:

$$p(A \mid B) = \frac{p(B \mid A)\, p(A)}{p(B)}$$

Here, $p(A \mid B)$ represents a probability density function of an event $A$ happening given an event $B$; we’ll talk about probability distributions in future posts. Or if it helps, you can take it in terms of probabilities as well:

$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$$

where $P(A \mid B)$ represents the *probability* of event $A$ happening given an event $B$.

Bayes rule (or Bayes theorem) is widely used in probabilistic robotics literature and algorithms. Why is that? Because it lets you learn about $p(A \mid B)$ without measuring it directly, and more often than not you have no way to measure it anyway. Bayes rule lets you compute it as long as you know $p(B \mid A)$ and $p(A)$. Imagine you have a variable $A$ whose state you cannot measure directly, but you know how $B$ behaves when event $A$ has occurred. In that case, Bayes rule states that with only *a priori* knowledge of $A$ and the conditional probability of $B$ given $A$, you can gain knowledge about the posterior information of $A$.
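If it helps to see the formula as code, here is a minimal sketch in Python. The function name `bayes_posterior` and the numbers are made up purely for illustration; nothing here comes from a library.

```python
def bayes_posterior(likelihood, prior, evidence):
    """Return P(A | B) = P(B | A) * P(A) / P(B) for single probabilities."""
    if evidence <= 0.0:
        raise ValueError("P(B) must be positive")
    return likelihood * prior / evidence


# Hypothetical values, chosen only to exercise the formula.
p_b_given_a = 0.8  # P(B | A): how likely B is when A has occurred
p_a = 0.3          # P(A): a priori knowledge of A
p_b = 0.5          # P(B): how likely B is overall

print(bayes_posterior(p_b_given_a, p_a, p_b))
```

Notice that the function never needs a direct measurement of $A$; it only needs the three quantities on the right-hand side.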

I know this is a lot to digest, and it’s not easy to get a feel for how this helps. In my posts, I’ll try to use as many examples as I can so that it’s easier to see its applications.

#### example) An apple in a paper bag

Imagine a paper bag on a table with an apple in it. The apple can be a red Gala (R) or a green Fuji (G); yes, you only have two choices here. You want to know which kind of the two apples is in the bag.

Because the paper bag is not transparent, you cannot see the type of the apple. Then, what is the chance of that apple being a red Gala? Since you haven’t touched it and can’t see anything, let’s say it’s 50/50.

You start thinking of how to tell which apple it is. You cannot open the bag, but you’re free to lift the bag. And, you happen to know that Fuji apples are generally heavier than Gala apples! And once you lift the bag, you will feel its weight! So can we use that information to know which apple is more likely in the bag?

Yes, you can. You probably won’t know it for sure, but you can be somewhat confident in your answer.

So you go to the table and lift the bag up. You then realize that the paper bag does feel a bit heavier than you would expect for a bag with a Gala apple. Hmm. Okay, it feels like a Fuji apple, but you’re not too sure how precise your feeling is; maybe you were holding it for too long and your arm got tired. Let’s say you’re 60% sure that it’s heavier than a Gala.

Good. We have all the information to use Bayes rule now. Let’s start with setting up some variables. What we want to know is the probability of the type of apple ($x$) being G given the heaviness you feel. Let’s denote your measurement (your feeling) of heaviness as $z$.

Again, there are two types of apples: $x \in \{R, G\}$, and your measurements are $z \in \{\text{heavy}, \text{light}\}$. And we want to know $P(x = G \mid z = \text{heavy})$. You could compute $P(x = R \mid z = \text{heavy})$ instead, and it will be equal to $1 - P(x = G \mid z = \text{heavy})$.

Applying Bayes rule, the above equation becomes:

$$P(x = G \mid z = \text{heavy}) = \frac{P(z = \text{heavy} \mid x = G)\, P(x = G)}{P(z = \text{heavy})}$$

Let’s see what these terms mean: $P(z = \text{heavy} \mid x = G)$ is the probability of you feeling *heavy* when the apple is G; this is something you do have an idea about. Previously we mentioned that you’re about 60% sure, so the value is 0.6. The other term, $P(x = G)$, is your *a priori* information about the apple being G. Since we’re 50/50, the value becomes 0.5.

What about $P(z = \text{heavy})$? This term asks for the probability of you feeling *heavy* in general. This is an odd thing. Without knowing (or being given) the type of apple in the bag, how would you know how likely you are to feel *heavy*? Given our limited choices, however, we can actually compute this. We’ll use the *law of total probability*, which states:

$$P(A) = \sum_{n} P(A \mid B_n)\, P(B_n)$$

Note that this is for computing a *marginal distribution*. If you have two (random) variables A and B, you can compute a marginal distribution of one by integrating out the other variable, which is what the summation (or integration in continuous space) in the above equation does.
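To make the summation concrete, here is a small Python sketch of the discrete case. The helper name `total_probability`, the labels `b1` to `b3`, and all the numbers are hypothetical; the only assumption is that the cases $B_n$ are mutually exclusive and cover everything, so their probabilities sum to 1.

```python
def total_probability(p_a_given_b, p_b):
    """Return P(A) = sum over n of P(A | B_n) * P(B_n).

    p_a_given_b maps each case B_n to P(A | B_n);
    p_b maps the same cases to P(B_n).
    """
    if abs(sum(p_b.values()) - 1.0) > 1e-9:
        raise ValueError("the P(B_n) values must sum to 1")
    return sum(p_a_given_b[n] * p_b[n] for n in p_b)


# Hypothetical three-case partition of B.
p_b = {"b1": 0.2, "b2": 0.5, "b3": 0.3}
p_a_given_b = {"b1": 0.9, "b2": 0.4, "b3": 0.1}

print(total_probability(p_a_given_b, p_b))
```

Each term in the sum is the chance of seeing A together with one particular case of B; adding them up removes B from the picture, which is exactly what "integrating out" means.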

Applying this to our apple problem,

$$P(z = \text{heavy}) = P(z = \text{heavy} \mid x = G)\, P(x = G) + P(z = \text{heavy} \mid x = R)\, P(x = R)$$

The term $P(x = R) = 0.5$ because $P(x = R) = 1 - P(x = G)$. For the other term, $P(z = \text{heavy} \mid x = R)$, we’ll need to set a value. You can assign the same value as $P(z = \text{heavy} \mid x = G)$, but they fundamentally have different meanings. This term asks how likely you are to feel *heavy* when the apple is R. Maybe your sense of heaviness completely changes when lifting a Gala (R) apple, or you’ve never seen one and have no clue about its weight. Let’s say you’ve never seen a Gala apple and thus have no clue. Then, given a Gala apple, you’re equally unsure whether you would feel *heavy* or *light*, thus $P(z = \text{heavy} \mid x = R) = 0.5$.

Now we have all the pieces and can carry out the calculation.

$$P(x = G \mid z = \text{heavy}) = \frac{0.6 \times 0.5}{0.6 \times 0.5 + 0.5 \times 0.5} = \frac{0.30}{0.55} \approx 0.5455$$

So you’re about 54.55% sure that the apple is a green Fuji (G) given your feeling of heaviness!
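The whole apple calculation fits in a few lines of Python, which is handy for checking the arithmetic. This is only a sketch: the function name `posterior_fuji` is made up, the defaults are the values from the story (0.6 for feeling *heavy* given a Fuji, a 50/50 prior), and 0.5 for feeling *heavy* given a Gala is the "no clue" assumption. Since that last value is a choice, the loop tries a few alternatives to show how much the answer depends on it.

```python
def posterior_fuji(p_heavy_given_g=0.6, p_heavy_given_r=0.5, prior_g=0.5):
    """Return P(x = G | z = heavy) for the two-apple bag (hypothetical helper)."""
    prior_r = 1.0 - prior_g  # only two kinds of apple, so the priors sum to 1
    # Law of total probability: P(z = heavy) over both apple types.
    p_heavy = p_heavy_given_g * prior_g + p_heavy_given_r * prior_r
    # Bayes rule.
    return p_heavy_given_g * prior_g / p_heavy


# Try a few assumed values for P(z = heavy | x = R).
for p_heavy_given_r in (0.3, 0.5, 0.6):
    post = posterior_fuji(p_heavy_given_r=p_heavy_given_r)
    print(f"P(heavy | R) = {p_heavy_given_r:.1f} -> P(G | heavy) = {post:.4f}")
```

With the "no clue" value of 0.5 the posterior is 0.30 / 0.55, roughly 0.545. If you set it equal to the Fuji value of 0.6, feeling *heavy* tells you nothing and the posterior stays at the 50/50 prior. The smaller the value, the more a *heavy* bag points to a Fuji.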