Joint and conditional probability

Suppose that, whenever event X happens, the outcome is either event A or event B (but never both), with probabilities 0.4 and 0.6 respectively. If instead the event Y, mutually exclusive with X, occurs, the probabilities of A and B split evenly, .5 and .5. These data can be summarized in a Markov (stochastic) matrix:

$$\begin{matrix} & X & Y \\ A & P(A|X) & P(A|Y) \\ B & P(B|X) & P(B|Y)\end{matrix}\quad = \quad\begin{matrix} & X & Y \\ A & .4 & .5 \\ B & .6 & .5\end{matrix}$$

Here, P(A|X) stands for the probability of event A given that X has occurred; in general, P(A|X) denotes the conditional probability of event A under condition X.

Note that each column adds up to 1, since its entries represent mutually exclusive, exhaustive events.
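As a quick sanity check, the conditional table can be held in a NumPy array and its column sums verified (a minimal sketch; the variable name `cond` is ours, not from the text):

```python
import numpy as np

# Conditional probability table: columns are the conditions X, Y;
# rows are the outcomes A, B.
cond = np.array([[0.4, 0.5],   # P(A|X), P(A|Y)
                 [0.6, 0.5]])  # P(B|X), P(B|Y)

# Each column covers mutually exclusive, exhaustive outcomes, so it sums to 1.
print(cond.sum(axis=0))  # → [1. 1.]
```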

Now, suppose that X occurs with probability .8 and Y with probability .2. We multiply the first column by .8 and the second by .2, so that the total probability breaks down into


 * $$1 = 1 \cdot .8 + 1 \cdot .2 = (.4 + .6) \cdot .8 + (.5 + .5) \cdot .2 = (.32 + .48) + (.1 + .1)$$

where (.4 + .6) collects the conditional probabilities under X, and .1 + .1 are the resulting probabilities under event Y.

This can again be represented by a matrix:

$$\begin{matrix} & X|_{P(X)} & Y|_{P(Y)} \\ A & P(A\land X) & P(A\land Y) \\ B & P(B\land X) & P(B\land Y)\end{matrix} \quad = \quad \begin{matrix} & X|_{P(X)} & Y|_{P(Y)} \\ A & P(X\land A) & P(Y\land A) \\ B & P(X\land B) & P(Y\land B)\end{matrix} \quad  = \quad \begin{matrix} & X|_{.8} & Y|_{.2} \\ A & .32 & .1 \\ B & .48 & .1\end{matrix}$$

Note that the columns now add up to .8 and .2 respectively, while the whole table adds up to .8 + .2 = 1. We have obtained a two-dimensional (joint) probability distribution. Every cell holds the joint probability of a pair of events occurring, e.g. P(A∩X) = .32. The probability of the conjunction A∩X is no larger than the probability of either component, P(A|X) or P(X) alone, because each column added up to 1 in the conditional probability table but adds up to P(X) ≤ 1 in the joint distribution table.
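The construction of the joint table can be sketched in NumPy: broadcasting scales each column of the conditional table by the corresponding marginal (variable names are ours):

```python
import numpy as np

cond = np.array([[0.4, 0.5],   # P(A|X), P(A|Y)
                 [0.6, 0.5]])  # P(B|X), P(B|Y)
p_cols = np.array([0.8, 0.2])  # marginals P(X), P(Y)

# Scaling each column by its marginal yields the joint distribution table.
joint = cond * p_cols
print(joint)              # ≈ [[0.32 0.1 ]
                          #    [0.48 0.1 ]]
print(joint.sum(axis=0))  # column sums ≈ [0.8 0.2]
print(joint.sum())        # whole table ≈ 1.0
```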

The fact that each column $$P(A_1 \land X_i) + P(A_2 \land X_i) + \ldots = P(X_i)$$ adds up to the marginal probability of the column Xi, that is, the probability that a randomly drawn outcome lands in column i, enables us to recover the conditional probabilities. We just need to divide every $$P(A_j \land X_i)$$ in column i by P(Xi):


 * $$\begin{bmatrix}P(A|X) \\ P(B|X) \end{bmatrix} = \begin{bmatrix}P(A \land X) \\ P(B \land X) \end{bmatrix} {1 \over P(X)} = \begin{bmatrix}.32 \\ .48 \end{bmatrix} {1 \over .8} = \begin{bmatrix}.4 \\ .6 \end{bmatrix}$$
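This normalization is a one-liner in NumPy: dividing each column of the joint table by its column sum reproduces the conditional table (a sketch; array names are ours):

```python
import numpy as np

joint = np.array([[0.32, 0.1],   # P(A∧X), P(A∧Y)
                  [0.48, 0.1]])  # P(B∧X), P(B∧Y)

# Column sums are the marginals P(X) = .8 and P(Y) = .2.
p_cols = joint.sum(axis=0)

# Dividing each column by its marginal recovers the conditional table.
recovered = joint / p_cols
print(recovered)  # ≈ [[0.4 0.5]
                  #    [0.6 0.5]]
```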

The relationship
 * $$ P(A\land X) = P(X) \cdot P(A|X)$$

is the basis for the famous Bayes' theorem $$ P(X) \cdot P(A|X) = P(A) \cdot P(X|A)$$ because we can symmetrically condition the probabilities within the rows by the probabilities of observing the rows:


 * $$\begin{bmatrix}(.32 + .1)/p_a \\(.48 + .1)/p_b\end{bmatrix} = \begin{bmatrix}1\\1\end{bmatrix} = \begin{bmatrix}(.32 + .1)/.42 \\(.48 + .1)/.58\end{bmatrix} = \begin{bmatrix}.76 + .24 \\ .83 + .17\end{bmatrix} = \begin{bmatrix}P(X|A) + P(Y|A)  \\ P(X|B) + P(Y|B)\end{bmatrix}$$

That is, the conditional probability P(X|A) ≈ .76.
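The row-wise normalization, and the Bayes identity it rests on, can be checked numerically with the same arrays (names are ours): divide each row of the joint table by its row sum to get the reversed conditionals.

```python
import numpy as np

joint = np.array([[0.32, 0.1],   # P(A∧X), P(A∧Y)
                  [0.48, 0.1]])  # P(B∧X), P(B∧Y)

# Row sums are the marginals P(A) = .42 and P(B) = .58.
p_rows = joint.sum(axis=1)

# Dividing each row by its marginal gives P(X|A), P(Y|A), P(X|B), P(Y|B).
row_cond = joint / p_rows[:, None]
print(row_cond[0, 0])  # P(X|A) ≈ 0.762

# Bayes' theorem: P(X)·P(A|X) = P(A)·P(X|A), since both equal P(A∧X).
p_x = 0.8
print(np.isclose(p_x * 0.4, p_rows[0] * row_cond[0, 0]))  # → True
```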