6.10. Summary#
Key Take-Aways
Conditional Probabilities
The mathematical notation \(P(B|A)\) denotes the conditional probability of the event \(B\) given that the event \(A\) occurred and (provided \(P(A)>0\)) is defined as \(P(B|A) = \frac{P(A \cap B)}{P(A)}\).
The conditioning bar defines a new conditional probability measure on the original probability space.
There can be only one conditioning bar in a conditional probability. To express the probability of \(A\) given that \(B\) occurred and \(C\) occurred, the correct notation is \(P(A | B \cap C)\).
Compared to the unconditioned probability of an event, conditioning on some other event can result in a lower, higher, or the same probability.
Conditional probabilities that have different conditioning events are generally never added together. In particular, \(P(A \vert B) + P\left(A \vert \overline{B}\right) \ne 1\) in general.
Given any event \(B\) with nonzero probability, \(P(A \vert B) + P\left(\overline{A} \vert B\right) = 1\) because we are asking about the sum of the probabilities of a partition using a conditional probability measure.
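As a quick numerical illustration of these last two points, here is a minimal sketch that computes conditional probabilities by enumerating equally likely outcomes. The two-dice events are illustrative assumptions of this sketch, not examples from the chapter:

```python
from itertools import product

# All 36 equally likely outcomes of rolling two fair six-sided dice
outcomes = set(product(range(1, 7), repeat=2))

A = {(d1, d2) for (d1, d2) in outcomes if d1 == 6}       # A: first die shows 6
B = {(d1, d2) for (d1, d2) in outcomes if d1 + d2 >= 9}  # B: sum is at least 9

def P(event, given):
    """Probability of `event` under the conditional measure defined by `given`."""
    return len(event & given) / len(given)

not_A = outcomes - A
not_B = outcomes - B

print(P(A, B) + P(A, not_B))   # different conditioning events: about 0.477, not 1
print(P(A, B) + P(not_A, B))   # same conditioning event: exactly 1.0
```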
Simulating Conditional Probabilities
Consider designing a simulation to estimate some conditional probability \(P(A|B)\). The usual approach is to use two counters. One counts the number of times the conditioning event occurs, so \(N_B\) counts the number of occurrences of event \(B\). A second counter, \(N_{AB}\), counts the number of times \(A\) occurs when \(B\) has occurred. The first counter is updated within an `if` statement that checks whether \(B\) has occurred. Within that `if` statement, a nested `if` statement checks whether \(A\) has occurred and increments the counter \(N_{AB}\) if it has.
When simulating conditional probabilities, the number of simulation iterations cannot be determined from \(P(A|B)\) but instead must be determined from \(P(A \cap B)\). An alternative is to simulate until each counter detects a sufficient number of events (typically \(>100\)).
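The sketch below shows this two-counter pattern. It is a minimal example; the specific dice events are assumptions chosen for illustration, not taken from the chapter:

```python
import random

random.seed(2024)   # fix the seed so the run is reproducible
num_sims = 1_000_000

N_B = 0    # counts occurrences of the conditioning event B
N_AB = 0   # counts occurrences of A among the trials where B occurred

for _ in range(num_sims):
    die1 = random.randint(1, 6)
    die2 = random.randint(1, 6)
    if die1 + die2 >= 9:      # event B: the sum is at least 9
        N_B += 1
        if die1 == 6:         # nested check for event A: the first die shows 6
            N_AB += 1

print("Estimated P(A|B) =", N_AB / N_B)   # exact answer is 4/10 = 0.4
```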
Statistical Independence
Events \(A\) and \(B\) are statistically independent (s.i.) if and only if \(P(A \cap B) = P(A)P(B)\).
Events can be assumed to be s.i. when they arise from completely separate random phenomena.
If \(A\) and \(B\) are statistically independent and have nonzero probabilities, then \(P(A|B) = P(A)\) and \(P(B|A) = P(B)\): because they are independent, the probability of one event does not change given the information that the other event occurred.
For multiple events to be s.i., the probability of the intersection of any \(K\) of the events must factor as the product of the individual probabilities (the conditions for three events are written out after this list).
For multiple events \(A_0, A_1, \ldots, A_{N-1}\), a weaker notion of independence is pairwise statistical independence, in which \(P\left(A_i \cap A_j \right) = P \left(A_i \right) P \left( A_j \right)\) for every \(i \ne j\).
If \(A\) and \(B\) are mutually exclusive and have nonzero probabilities, then \(P(A|B) = P(B|A) =0\).
Events cannot be both s.i. and m.e. unless one or both of the events have probability zero.
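For concreteness, here is the factoring requirement written out for three events (a standard expansion, included here for reference): \(A\), \(B\), and \(C\) are mutually s.i. if and only if all four of the following hold:
\begin{align*}
P(A \cap B) &= P(A)P(B) \\
P(A \cap C) &= P(A)P(C) \\
P(B \cap C) &= P(B)P(C) \\
P(A \cap B \cap C) &= P(A)P(B)P(C)
\end{align*}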
Conditional Independence
Events \(A\) and \(B\) are conditionally independent given an event \(C\) if and only if \(P\left( A \cap B \vert C \right) = P\left( A \vert C \right) P\left( B \vert C \right) \).
Events that are dependent may be conditionally independent given an appropriate conditioning event.
Events that are independent may be conditionally dependent given an appropriate conditioning event.
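To see the first of these two points concretely, here is a minimal sketch based on the Magician's Coin scenario (one fair coin and one two-headed coin; that the coins are chosen with equal probability is an assumption of this sketch):

```python
# Magician's Coin: a fair coin and a two-headed coin, one chosen at random.
# H1 = heads on flip 1, H2 = heads on flip 2.
p_coin = 0.5                           # assumed probability of choosing each coin
h = {"fair": 0.5, "two-headed": 1.0}   # P(heads on any one flip | coin)

# Given the coin, the flips are conditionally independent; for example,
#   P(H1 and H2 | fair) = 0.5 * 0.5 = P(H1 | fair) * P(H2 | fair)

# Unconditionally (averaging over the coin choice), they are dependent:
P_H1 = p_coin * h["fair"] + p_coin * h["two-headed"]               # 0.75
P_H1H2 = p_coin * h["fair"] ** 2 + p_coin * h["two-headed"] ** 2   # 0.625

print(P_H1H2, P_H1 * P_H1)   # 0.625 != 0.5625, so H1 and H2 are dependent
```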
Chain Rules
Chain rules are used to factor the probability of an intersection of events in terms of a product of conditional probabilities.
Chain rules are usually used to rewrite complicated probabilities in terms of probabilities that are known or easier to determine.
The simplest chain rules are \(P(A \cap B) = P(A \vert B) P(B)\) and \(P(A \cap B) = P(B \vert A) P(A)\).
Chain rules can be found for any number of events.
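For instance, applying the two-event chain rule twice gives a chain rule for three events:
\begin{align*}
P(A \cap B \cap C) = P\left(A \vert B \cap C\right) P\left(B \vert C\right) P\left(C\right).
\end{align*}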
Total Probability
The Law of Total Probability says that if a set of events \(A_0, A_1, \ldots\) is a partition of the sample space \(S\), then the probability of any event \(B\) can be written as
\begin{align*}
P\left( B \right) &= \sum_i P\left(B \cap A_i \right) \\
&= \sum_i P\left(B \vert A_i\right) P\left( A_i \right).
\end{align*}
Total probability is a powerful tool for calculating the probability of an event when that event depends on some other events but we do not know whether those other events occurred.
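As a small worked sketch (continuing the illustrative two-dice events assumed above, which are not from the chapter), partition the sample space by the value of the first die and compute \(P(B)\) for \(B = \{\text{sum} \ge 9\}\):

```python
from fractions import Fraction

# Partition by the value of the first die: A_i = {die1 == i}, for i = 1..6
P_Ai = Fraction(1, 6)   # each value of die1 is equally likely

def P_B_given_Ai(i):
    """P(sum >= 9 | die1 == i) = P(die2 >= 9 - i) for a fair six-sided die."""
    favorable = max(0, 6 - (9 - i) + 1)   # number of die2 values with die2 >= 9 - i
    return Fraction(favorable, 6)

# Law of Total Probability: P(B) = sum_i P(B | A_i) P(A_i)
P_B = sum(P_B_given_Ai(i) * P_Ai for i in range(1, 7))
print(P_B)   # 5/18, i.e., 10/36
```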
Mental Models
Our intuition may fool us when considering whether events are independent. For instance, in the Magician’s Coin problem, many people initially think that getting heads on the second flip of the coin is independent of the event that the magician got heads on the first flip of the coin. One way to avoid this intuition failure is to imagine a much more extreme scenario. For this example, instead of thinking about getting heads on just one flip of the coin, imagine that the first one million flips were heads. Then the chosen coin must almost surely be the two-headed coin, and the probability of getting heads on the next flip should be very close to 1.
Self Assessment Exercises
Answer the questions below to assess your understanding of the material in this chapter. Because there are many questions for this chapter, a random sample of 10 questions is shown. Reload this page to get a new random sample of 10 questions.
Terminology Review
Use the flashcards below to help you review the terminology introduced in this chapter.
Spaced Repetition
Use these questions to review material from previous chapters: