User talk:Egm6936.f10/Probability concepts

Put your questions here.

= discussion 2011.09.22 =

Question 5
Xiu 2010, p.16: the most straightforward approach to generating non-uniform random variables is to invert the distribution function; for example, we can obtain an exponential random variable directly by inverting the exponential distribution function. The exponential density function is

$$\displaystyle f_{X}(x)=ae^{-ax}, a> 0$$

It's easy to get the distribution function as

$$\displaystyle F_{X}(x)=\int_{0}^{x }f_{X}(y)dy=\int_{0}^{x }ae^{-ay}dy=1-e^{-ax}$$.

Hence we obtain an exponential random variable by inverting the distribution function:

$$\displaystyle F_{X}^{-1}(u)=-\frac{\ln(1-u)}{a}$$
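A minimal sketch of this inversion in Python, assuming NumPy is available (the rate $$\displaystyle a = 2$$ is an arbitrary illustration):

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
a = 2.0                                 # rate parameter, arbitrary choice

u = rng.uniform(size=100_000)           # U ~ Uniform(0, 1)
x = -np.log(1.0 - u) / a                # X = F^{-1}(U) ~ Exponential(a)

print(x.mean(), 1.0 / a)                # sanity check: sample mean ~ 1/a
</syntaxhighlight>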

But in most cases $$\displaystyle F_{X}^{-1} $$ is not available in explicit form, and we must use an approximation instead, as with the normal distribution.

My question is: can you explain in detail how we can use approximation methods to generate those random variables?
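Partly addressing this: since the standard normal inverse cdf $$\displaystyle \Phi^{-1}$$ has no closed form, libraries approximate it numerically, and inverse transform sampling then proceeds exactly as above. A minimal sketch, assuming SciPy is available (scipy.stats.norm.ppf is such a numerical approximation of $$\displaystyle \Phi^{-1}$$); other standard approaches, such as Box-Muller or acceptance-rejection, are not shown here:

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
u = rng.uniform(size=100_000)          # U ~ Uniform(0, 1)

# norm.ppf numerically approximates the inverse standard normal cdf
z = norm.ppf(u)                        # Z = Phi^{-1}(U) ~ N(0, 1)

print(z.mean(), z.std())               # sanity check: ~0 and ~1
</syntaxhighlight>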

= discussion on 2011.09.15 =

Question 2
By your definition of the $$\sigma\,\;$$-field $$\mathcal F $$, it is the collection of all subsets of $$\Omega$$. But in Xiu 2010, p.10, the $$\sigma\,\;$$-field $$\mathcal F $$ is defined as a collection of subsets of $$\Omega$$. So, which one is better?

Answer 2
Clearly, the definition of a $$\sigma\,\;$$-field $$\mathcal F $$ that includes all subsets of $$\Omega$$ is a stronger statement than the definition that includes only a collection of subsets of $$\Omega$$. The weaker definition, i.e., the one with only a collection of subsets of $$\Omega$$, is the more general one. We need to look up some references to verify, e.g., some mathematical statistics / probability books.

Consider a counterexample to the statement that any collection of subsets of $$\Omega$$ forms a sigma-field: take $$\Omega = \{1, 2, 3\}$$ and $$\mathcal F :=\{\emptyset, \{1\}, \{2\}, \Omega\}$$. Then $$\{1\}\cup \{2\}=\{1, 2\} \notin \mathcal F$$, so $$\mathcal F$$ cannot be a sigma-field. The point here is that you cannot take an arbitrary collection of subsets of $$\Omega$$ to form a sigma-field; you need a collection of subsets of $$\Omega$$ that satisfies three conditions for the set $$\mathcal F$$ to be a sigma-field (see the sketch after the conditions below).

Three conditions that the sigma-field must satisfy:

$$\bullet$$ Not empty: $$\Omega \in \mathcal F$$ and $$\emptyset \in \mathcal F$$;

$$\bullet$$ Given $$A \in \mathcal F$$, then $$A^c \in \mathcal F$$;

$$\bullet$$ Given $$A_1$$, $$A_2$$,...$$\in \mathcal F$$, then

$$\bigcup_{i=1}^{\infty} A_{i} \in \mathcal F$$ and $$\bigcap_{i=1}^{\infty} A_{i} \in \mathcal F$$.
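A small sketch (my own helper, not from the notes) that checks these three conditions for a finite $$\Omega$$ and confirms the counterexample above:

<syntaxhighlight lang="python">
from itertools import combinations

def is_sigma_field(omega, F):
    """Check the three sigma-field conditions for a finite sample space."""
    F = {frozenset(A) for A in F}
    omega = frozenset(omega)
    # 1. Not empty: Omega and the empty set belong to F
    if omega not in F or frozenset() not in F:
        return False
    # 2. Closed under complement
    if any(omega - A not in F for A in F):
        return False
    # 3. Closed under unions and intersections (pairwise suffices for finite F)
    return all(A | B in F and A & B in F for A, B in combinations(F, 2))

omega = {1, 2, 3}
F_bad  = [set(), {1}, {2}, omega]       # the counterexample above
F_good = [set(), {1}, {2, 3}, omega]    # a valid sigma-field on omega
print(is_sigma_field(omega, F_bad))     # False: {1} u {2} = {1, 2} is missing
print(is_sigma_field(omega, F_good))    # True
</syntaxhighlight>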

Question 3
For a continuous distribution, the probability of any particular value is zero: $$\displaystyle P(X=x)=0, \forall x \in \mathbb R$$. Mathematically, I can understand this concept: since the distribution is continuous, we have

$$\displaystyle \lim_{\epsilon \rightarrow 0}F_{X}(x+\epsilon )=F_{X}{(x)}, \forall x$$.

But intuitively, I can't fully accept it. Why is the probability of any particular value zero?

Answer 3
For discrete events, like throwing a die with 6 faces, you can talk about the probability of getting, say, the number 3, and that's 1/6.

But for continuous events, you need to think in terms of the area under the curve, similar to using the trapezoidal rule to integrate.

A better explanation would be that there are infinitely many real numbers; the set of real numbers, unlike the set of integers, is very dense. For example, given any two numbers that are very close to each other by $$\displaystyle \epsilon$$, with $$\displaystyle \epsilon$$ being very small (as small as you can imagine, e.g., $$\displaystyle 10^{-36}$$), you can always find infinitely many real numbers in between these two numbers.

So for continuous events described by real numbers, the probability of throwing a "generalized die" to get a particular real number is zero!
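A small numerical illustration of this point, assuming SciPy is available and using the standard normal as the "generalized die": the probability of an interval of half-width $$\displaystyle \epsilon$$ around any fixed value shrinks to zero as $$\displaystyle \epsilon \rightarrow 0$$.

<syntaxhighlight lang="python">
from scipy.stats import norm

x = 0.3                                          # any particular value
for eps in (1e-1, 1e-3, 1e-6, 1e-9):
    p = norm.cdf(x + eps) - norm.cdf(x - eps)    # P(x - eps <= X <= x + eps)
    print(eps, p)
# As eps -> 0 the probability tends to 0, so P(X = x) = 0.
</syntaxhighlight>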

Question 4
Xiu 2010, p.15: In order to generate nonuniform random variables, the inversion of a distribution function, i.e., Inverse transform sampling, is the most straightforward method. Assume the cdf of $$\displaystyle X$$ is $$\displaystyle F_{X}(x) = P(X < x)$$ (Note: in Xiu 2010, p.15, $$\displaystyle F_{X}(x) = P(X \leqslant x)$$ is used instead). For a strictly increasing and continuous $$\displaystyle F_{X}(x)$$, we can solve $$\displaystyle F_{X}(x) = u$$ to get the unique root $$\displaystyle x = F_{X}^{-1}(u)$$, $$\displaystyle 0 < u < 1$$. For distributions with discontinuities or flat regions, $$\displaystyle F_{X}$$ is not both continuous and strictly increasing, and we need to be more careful in defining its inverse. For the left-continuous version,
$$\displaystyle F_{X}^{-1}(u) := \text{inf} \{x\mid F_{X}(x) > u, 0 < u < 1 \}. $$ (2.7)

Note:

In Xiu 2010, p.15, Xiu uses $$\displaystyle F_{X}^{-1}(u) := \text{inf} \{x\mid F_{X}(x) \geqslant u, 0 < u < 1 \}$$. Actually, this is the definition of the inverse of a right-continuous cdf.

The following proposition helps justify the inversion method for generating nonuniform random numbers.

Proposition 2.11:

Let $$\displaystyle F_{X}(x)=P(X < x)$$ be the cdf of $$X$$. Then we have the following results.

$$\displaystyle \bullet$$ $$\displaystyle u\leqslant F_{X}(x)\Leftrightarrow F_{X}^{-1}(u)\leqslant x$$, i.e., $$\displaystyle F_{X}^{-1}$$ is non-decreasing.

$$\displaystyle \bullet$$ If $$\displaystyle U$$ is uniform in $$\displaystyle \left ( 0,1 \right )$$, then $$\displaystyle F_{X}^{-1}(U)$$ has distribution function $$\displaystyle F_{X}$$.

$$\displaystyle \bullet$$ If $$\displaystyle F_{X}$$ is continuous, then $$\displaystyle F_{X}(X)$$ is uniform in $$\displaystyle \left ( 0,1 \right )$$.

Can you explain the above proposition?

Answer 4
Fig 4.1: Continuous uniform distribution (pdf), U(0,1)

Fig 4.2: Continuous uniform distribution (cdf), U(0,1)

Fig 4.3: Distribution Function and Its Inverse

Fig 4.4: Left Continuous Distribution Function and Its Inverse

Install Inkscape to plot the figures that I drew in the above [[media:Hailong.2011_09_15_10_31_23.pdf |Explanation]]. See Writing tools.

Discussion:

Function (mathematics)
A function is a mapping from a set $$\mathbf A $$ to a set $$\mathbf B $$ that satisfies:

1. For every element a in $$\mathbf A $$, there is an element b in $$\mathbf B $$ such that $$\langle a, b\rangle$$ is in the mapping;

2. If $$\langle a, b\rangle$$ and $$\langle a, c\rangle$$ are in the mapping, then b = c.

One-to-One: A function $$\displaystyle f$$ is said to be one-to-one (injective) iff whenever $$\displaystyle f(x) = f(y),\,x = y$$.

Onto: A function $$\displaystyle f$$ from a set $$\mathbf A $$ to a set $$\mathbf B $$ is said to be onto (surjective) iff for every element $$\displaystyle y$$ of $$\mathbf B $$ there is an element $$\displaystyle x$$ in $$\mathbf A $$ such that $$\displaystyle f(x) = y $$; that is, $$\displaystyle f $$ is onto iff $$\displaystyle f( A ) = B $$. See Wikipedia on Function, Other resources on Function.

Fig 4.5: Function

Continuity
Definition:


$$\displaystyle \forall \epsilon > 0 $$, $$\displaystyle \exists \delta > 0 $$, such that $$\displaystyle \left | x - c \right | \leqslant \delta \Rightarrow \left | F(x) - F(c) \right | \leqslant \epsilon $$ (eq1)

Where $$\displaystyle c $$ is within the domain of $$\displaystyle F$$.

One-sided continuity
Left Continuity at a Point

We say that $$\displaystyle F$$ is continuous from the left at $$\displaystyle \hat{x}$$ when the limit from the left of $$\displaystyle F \left({x}\right)$$ as $$\displaystyle x \to \hat{x}$$ exists and:


$$\displaystyle \lim_{\underset{x \in A}{x \to \hat{x}^-}} F \left({x}\right) = F \left({\hat{x}}\right) $$ (eq2)

Fig 4.6: Left continuity

Right Continuity at a Point

We say that $$\displaystyle F$$ is continuous from the right at $$\displaystyle \hat{x}$$ when the limit from the right of $$\displaystyle F \left({x}\right)$$ as $$\displaystyle x \to \hat{x}$$ exists and:


$$\displaystyle \lim_{\underset{x \in A}{x \to \hat{x}^+}} F \left({x}\right) = F \left({\hat{x}}\right) $$ (eq3)

Fig 4.7: Right continuity

Where $$\displaystyle A $$ is the domain of function $$\displaystyle F(x)$$.

Minimum Vs. Infimum, Maximum Vs. Supremum
Let's look at some examples first.

Examples:

$$\displaystyle \text{Inf}\, \{ x: x \in (-1, 1)\} = -1$$

$$\displaystyle \text{Sup}\, \{ x: x \in (-1, 1)\} = 1$$

$$\displaystyle \text{Min}\, \{ x: x \in (-1, 1)\}$$ does not exist

$$\displaystyle \text{Max}\, \{ x: x \in (-1, 1)\}$$ does not exist

$$\displaystyle \text{Inf}\, \{ x: x \in [-1, 1]\} = -1$$

$$\displaystyle \text{Sup}\, \{ x: x \in [-1, 1]\} = 1$$

$$\displaystyle \text{Min}\, \{ x: x \in [-1, 1]\} = -1$$

$$\displaystyle \text{Max}\, \{ x: x \in [-1, 1]\} = 1$$

Definition:

The infimum of $$\mathbf A $$ is the greatest real number that is less than or equal to every number in $$\mathbf A$$. If no such number exists (because $$\mathbf A$$ is not bounded below), we define $$\text{inf}( \mathbf A ) = -\infty$$. If $$\mathbf A$$ is the empty set, we define $$\text{inf}( \mathbf A ) = +\infty$$.

The supremum of $$\mathbf A $$ is the smallest real number that is greater than or equal to every number in $$\mathbf A$$. If no such number exists (because $$\mathbf A$$ is not bounded above), we define $$\text{sup}( \mathbf A ) = +\infty$$. If $$\mathbf A$$ is the empty set, we define $$\text{sup}( \mathbf A ) = -\infty$$.

Also see infimum and supremum for real numbers, Wikipedia on Infimum, Wikipedia on Supremum.

Case A:



Fig 4.8: Strictly increasing, continuous (Case A)

$$\displaystyle \text{min}\left \{ x\mid f(x) > \hat u \right \} := \underset{ x > \hat x}{\text{argmin}} (f(x) > \hat u)$$ doesn't exist. $$\displaystyle \hat x = \text{inf}\left \{ x\mid f(x) > \hat u \right \} := \underset{ x > \hat x}{\text{arginf}}( f(x) > \hat u)$$.

Case B:



Fig 4.9: Monotonically increasing, left continuous (Case B)

$$\displaystyle \text{min}\left \{ x\mid f(x) > \hat u \right \} := \underset{ x > \hat x}{\text{argmin}} (f(x) > \hat u)$$ doesn't exist. $$\displaystyle \hat x = \text{inf}\left \{ x\mid f(x) > \hat u \right \} := \underset{ x > \hat x}{\text{arginf}}( f(x) > \hat u)$$.

Case C:



Fig 4.10: Left continuous (Case C)

$$\displaystyle \text{min}\left \{ x\mid f(x) > \hat u \right \} := \underset{ x > \hat x}{\text{argmin}} (f(x) > \hat u)$$ doesn't exist, $$\displaystyle \hat x = \text{inf}\left \{ x\mid f(x) > \hat u \right \} := \underset{ x > \hat x}{\text{arginf}}( f(x) > \hat u)$$.

Properties
(increasing case)

Statement I:
$$\displaystyle \forall a,\, b \in \Omega$$, if $$\displaystyle a \leqslant b $$, then $$\displaystyle F_{X}(a) \leqslant F_{X}(b) $$ (eq4)

where $$\displaystyle \Omega$$ is the increasing domain of function $$\displaystyle F_{X}(x)$$.

Proof

Fig 4.11: Increasing monotonicity

Strictly monotonic:
$$\displaystyle a < b \Rightarrow F_{X}(a) < F_{X}(b) $$ (eq5)

Relaxing strictness

Fig 4.12: Horizontal

This function is also monotonic, but not strictly monotonic. Relax the strict inequality (5) to the following:
$$\displaystyle a < b \Rightarrow F_{X}(a) \leqslant F_{X}(b) $$ (eq6)

Fig 4.13: Vertical

Relax the inequality further to recover statement (4):


$$\displaystyle a \leqslant b \Rightarrow F_{X}(a) \leqslant F_{X}(b) $$ (eq7)

End proof

Inverse proposition:
$$\displaystyle \forall a,\, b \in \Omega$$, if $$\displaystyle F_{X}(a) \leqslant F_{X}(b) $$, then $$\displaystyle a \leqslant b $$. (eq8)

Proof

Fig 4.14: Increasing monotonicity

Strictly monotonic:

$$\displaystyle F_{X}(a) < F_{X}(b) \Rightarrow a < b $$ (eq9)

Relaxing strictness

Fig 4.15: Horizontal

This function is also monotonic, but not strictly monotonic. Relax the strict inequality (9) to the following:


$$\displaystyle F_{X}(a) \leqslant F_{X}(b) \Rightarrow a < b $$ (eq10)

Fig 4.16: Vertical inverse

Relax the inequality further to recover statement (8):


$$\displaystyle F_{X}(a) \leqslant F_{X}(b) \Rightarrow a \leqslant b $$ (eq11)

End proof

Monotonicity of Inverse
Statement II:


If $$\displaystyle F_{X}(x)$$ is monotonic, then $$\displaystyle F_{X}^{-1}(x)$$ has the same monotonicity.

Proof

Suppose $$\displaystyle F_{X}(x)$$ is monotonically increasing; then we have

$$\displaystyle a \leqslant b \Leftrightarrow F_{X}(a) \leqslant F_{X}(b) $$ (eq12)

For a monotonically increasing function $$\displaystyle F_{X}(x)$$,


$$\displaystyle a = F_{X}^{-1}[F_{X}(a)] $$ (eq13)

and

$$\displaystyle b = F_{X}^{-1}[F_{X}(b)] $$ (eq14)

Substituting equations (13) and (14) into (12), we have

$$\displaystyle F_{X}^{-1}[F_{X}(a)] \leqslant F_{X}^{-1}[F_{X}(b)] \Leftrightarrow F_{X}(a) \leqslant F_{X}(b) $$ (eq15)

This argument is true iff $$F_{X}^{-1}(x)$$ is a monotonically increasing function.

Hence, $$\displaystyle F_{X}^{-1}(x)$$ has the same monotonicity as the original function $$\displaystyle F_{X}(x)$$. End proof

For a monotonically decreasing function, these properties are also valid.

See also [[media: My_notes_xiu_p15.pdf | my notes on Xiu 2010 p.15]] and [[media: Hailong.2011_10_20_09_15_25.djvu | my notes 2011.10.20]].

Here are [[media: Hailong.2011 10 27 08 21 02.pdf | my notes for 2011.10.27 and 2011.11.03]].

Conclusion 1
Due to the monotonicity of the distribution function $$\displaystyle F_X(x)$$, we have

$$\displaystyle u \leqslant F_{X}(x)\Leftrightarrow F_{X}^{-1}(u)\leqslant x$$ (eq16)

Proof of Point 2
$$\displaystyle \color{red}{\spadesuit}$$ <b> NEW: </b>

[[media: Hailong.2011 11 10 08 35 29.pdf | My notes for 2011.11.10]], [[media: Hailong.2011 11 17 08 53 10.pdf | My notes for 2011.11.17]], [[media: Hailong.2011 11 24 08 13 27.pdf | My notes for 2011.11.24]], [[media: Hailong.2011 12 01 08 16 35.pdf | My notes for 2011.12.01]],

Special case: Cauchy distribution

Fig 4.17: Cauchy distribution mechanism

The above figure gives the probabilistic interpretation in terms of light: $$\displaystyle \theta \in [ -\frac{\pi}{2}, +\frac{\pi}{2} ]$$ is a random variable with a uniform probability density function $$\displaystyle f_{\theta}(\theta)$$, and $$\displaystyle x $$ is a random variable with probability density function $$\displaystyle f_x(x)$$ to be determined from $$\displaystyle f_{\theta}(\theta)$$.


$$\displaystyle x = d\,\tan(\theta) $$ (eq17)


$$\displaystyle X = d \, \tan(\Theta) $$ (eq18)

Sometimes, $$\displaystyle d $$ in eq (17)&(18) can be omitted since $$\displaystyle \,\left | \tan(\pm \frac{\pi}{2}) \right | = \infty $$.

Fig 4.18: tangent function

We define $$\displaystyle \theta (u)$$ to be the mapping from $$\displaystyle [0,\,1] $$ to $$\displaystyle [ -\frac{\pi}{2}, +\frac{\pi}{2} ]$$, i.e.,
$$\displaystyle \theta (u): [0,\,1] \rightarrow [ -\frac{\pi}{2}, +\frac{\pi}{2} ]. $$ (eq19)

Then we have

$$\displaystyle \theta (u) = \pi ( u - \frac{1}{2} ). $$ (eq20)

Hence, according to eq (17),

$$\displaystyle x = \tan(\pi ( u - \frac{1}{2} )) = F_X^{-1}( u ). $$ (eq21)

We can solve for $$\displaystyle u $$ from eq (21) as

$$\displaystyle u = \frac{1}{\pi}\tan^{-1}( x )+ \frac{1}{2} = F_X(x) \in [0,\,1]. $$ (eq22)

Fig 4.19: Cauchy cumulative distribution function

$$\displaystyle \frac{dF_X(x)}{dx} = \frac{du}{dx} = f_X(x) = \frac{1}{\pi}\frac{1}{1+x^2}. $$ (eq23)

Fig 4.20: Cauchy probability density function
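A minimal sketch of eqs (21)-(22) in Python, assuming NumPy is available; the check of $$\displaystyle P(X \leqslant 1)$$ against eq (22) is just a sanity test:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
u = rng.uniform(size=200_000)

# eq (21): x = tan(pi * (u - 1/2)) = F_X^{-1}(u), standard Cauchy (d = 1)
x = np.tan(np.pi * (u - 0.5))

# Compare an empirical probability with the cdf of eq (22):
# P(X <= 1) should be close to (1/pi) * arctan(1) + 1/2 = 0.75
print((x <= 1.0).mean())                 # ~ 0.75
print(np.arctan(1.0) / np.pi + 0.5)      # 0.75
</syntaxhighlight>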

General case: The statement of point 2 is equivalent to: if $$\displaystyle U$$ is uniform in $$\displaystyle (0,1)$$, then $$\displaystyle P_{X}(F_{X}^{-1}(U) < x) = F_{X}(x)$$. From Xiu 2010, p.15, equation (2.7), we already have
$$\displaystyle F_{X}^{-1}(u) := \text{inf} \{x\mid F_{X}(x) \geqslant u, 0 < u < 1 \}. $$

Fig 4.21: Left-continuous distribution function

By definition of $$\displaystyle F_{X}^{-1}$$, the probability
$$\displaystyle P_{X}(F_{X}^{-1}(U) < x) = P_{X}(\text{inf} \{y \mid F_{X}(y) \geqslant U \} < x )$$ (eq23)

For a left-continuous cdf and any given $$\displaystyle u$$, we have

$$\displaystyle F_{X}( \text{inf} \{y\mid F_{X}(y) \geqslant U\}) \leqslant U < F_{X}(x) $$

In Fig 4.21, that is
$$\displaystyle \hat u := F_{X}(\hat x)=F_{X}( \text{inf} \{x\mid F_{X}(x) \geqslant \tilde u\}) \leqslant \tilde u < F_{X}(\tilde x) $$

Hence, applying $$\displaystyle F_{X}$$ to both sides of the inequality in equation (23), we have $$\displaystyle P_{X}(F_{X}(\text{inf} \{y \mid F_{X}(y) \geqslant U \}) < F_{X}(x))$$. Since $$\displaystyle U $$ is uniformly distributed in $$\displaystyle (0,1)$$, $$\displaystyle P_{X}(U < F_{X}(x)) = F_{X}(x) $$ (see Fig 4.2).

For a right-continuous cdf, refer to the proof of correctness of the Inverse transform sampling method on Wikipedia.

Conclusion 2
If $$\displaystyle U$$ is uniform in $$\displaystyle \left ( 0,1 \right )$$, then $$\displaystyle F_{X}^{-1}(U)$$ has distribution function $$\displaystyle F_{X}$$.
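A sketch illustrating Conclusion 2 for a cdf with jumps, using the generalized inverse of equation (2.7); the discrete distribution (values 0, 1, 2 with probabilities 0.2, 0.5, 0.3) is an arbitrary example, not taken from Xiu 2010:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

values = np.array([0.0, 1.0, 2.0])     # support of an example discrete X
probs  = np.array([0.2, 0.5, 0.3])     # its probabilities
cdf    = np.cumsum(probs)              # F_X at the jump points: 0.2, 0.7, 1.0

# Generalized inverse F^{-1}(u) = inf{x : F(x) >= u}:
# the first jump point whose cdf value is >= u
u = rng.uniform(size=100_000)
samples = values[np.searchsorted(cdf, u, side="left")]

# Empirical frequencies should reproduce probs, i.e. F^{-1}(U) ~ F_X
print([round(np.mean(samples == v), 3) for v in values])   # ~ [0.2, 0.5, 0.3]
</syntaxhighlight>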

Proof of Point 3
By the definition of the cdf, we have

$$\displaystyle P_{X}(X < x) = F_{X}(x)$$ (eq24)

Since $$\displaystyle F_{X} $$ is continuous, we can find a unique value $$\displaystyle X$$ such that

$$\displaystyle X = F_{X}^{-1}(U)$$ (eq25)

Substituting equation (25) into equation (24), we obtain $$\displaystyle P_{X}(F_{X}^{-1}(U) < x) = F_{X}(x)$$. Applying $$\displaystyle F_{X}$$ to both sides, since $$\displaystyle F_{X}$$ is non-decreasing, the above equation can be further expressed as $$\displaystyle P_{X}(F_{X}(F_{X}^{-1}(U)) \leqslant F_{X}(x)) = F_{X}(x)$$. That is to say, $$\displaystyle P_{X}(U \leqslant F_{X}(x)) = F_{X}(x)$$. It is clear that $$\displaystyle F_{X}(x) \in \left ( 0,1 \right )$$. Thus, $$\displaystyle F_{X}(X)$$ is uniform in $$\displaystyle \left ( 0,1 \right )$$.

Conclusion 3
If $$\displaystyle F_{X}$$ is continuous, then $$\displaystyle F_{X}(X)$$ is uniform in $$\displaystyle \left ( 0,1 \right )$$.
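A sketch illustrating Conclusion 3, reusing the exponential distribution from Question 5 as the continuous $$\displaystyle F_{X}$$ (an arbitrary choice): the values $$\displaystyle F_{X}(X)$$ should look uniform on $$\displaystyle (0,1)$$.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
a = 2.0                                             # rate of the exponential, arbitrary

x = rng.exponential(scale=1.0 / a, size=100_000)    # X ~ Exponential(a)
v = 1.0 - np.exp(-a * x)                            # V = F_X(X)

# If F_X(X) is Uniform(0,1), each of the 10 equal bins should hold ~10% of samples
hist, _ = np.histogram(v, bins=10, range=(0.0, 1.0))
print(hist / len(v))                                # each entry ~ 0.1
</syntaxhighlight>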

Left continuous OR Right continuous
There are two kinds of definitions for the cumulative distribution function, and left/right continuity depends on how the cdf is defined: according to the first definition the cdf is left continuous, while according to the second it is right continuous.

Definition I:

Pope 2000, p.38 equation (3.7)
$$\displaystyle F(V)\equiv P\{ U < V\}$$ (15)

Kolmogorov 1956, p.23
$$\displaystyle F^{(x)}(a) = P^{(x)}\{ -\infty, a\} = P\{x < a\}$$ (16)

"...$$\displaystyle F^{(x)}(a)$$ is continuous on the left."

Definition II:

Xiu 2010, p.9 equation (2.1)
$$\displaystyle F_{X}(x) = P(X \leqslant x)=P(\{\omega:X(\omega) \leqslant x \}), x\in \mathbb R$$ (17)

Shao 1999, p.4 equation (1.4)
$$\displaystyle F(x) = P((-\infty, x ]), x\in \mathbb R$$ (18)

"$$\displaystyle F$$ is right continuous ..."

Durrett 2010, p.10
$$\displaystyle F(x) = P(X \leqslant x )$$ (19)

"$$\displaystyle F$$ is right continuous ..."

Note:

Most of the properties of these two versions of the cdf are the same; one important difference is that definition I is left continuous while definition II is right continuous. See N. Balakrishnan, Valery B. Nevzorov 2003, p.2 or Google books.

Using the cdf, we can calculate the probability of any particular point in the domain from the jump of the cdf at that point.

For a left-continuous cdf (definition I), $$\displaystyle P(X = x) = F_X(x^+) - F_X(x)$$, which can be nonzero at a jump, while left continuity gives $$\displaystyle F_X(x) - F_X(x^-) = 0$$.

For a right-continuous cdf (definition II), $$\displaystyle P(X = x) = F_X(x) - F_X(x^-)$$, which can be nonzero at a jump, while right continuity gives $$\displaystyle F_X(x^+) - F_X(x) = 0$$.
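A small sketch of the right-continuous case (definition II), using the cdf of a fair six-sided die as an example (my own example, not from the references): the jump of $$\displaystyle F_X$$ at $$\displaystyle x$$ recovers $$\displaystyle P(X = x)$$.

<syntaxhighlight lang="python">
import numpy as np

def die_cdf(x):
    """Right-continuous cdf of a fair six-sided die: F(x) = P(X <= x)."""
    return np.clip(np.floor(x), 0, 6) / 6.0

x, eps = 3, 1e-9
jump  = die_cdf(x) - die_cdf(x - eps)    # F(x) - F(x^-) = P(X = x) = 1/6
right = die_cdf(x + eps) - die_cdf(x)    # F(x^+) - F(x) = 0 by right continuity
print(jump, right)
</syntaxhighlight>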

Verification of left continuity or right continuity
Continuity can be defined in terms of limits of sequences, see Definition in terms of limits of sequences.

Left continuity:

For any sequence $$\displaystyle [x_i]_{i=1,2,\ldots}$$ of points in the domain with $$\displaystyle x_i \leqslant \hat x$$ that tends to $$\displaystyle \hat x$$ from below, the corresponding sequence $$\displaystyle [f(x_i)]$$ tends to $$\displaystyle f(\hat x)$$, i.e., $$\displaystyle f(\hat x^-) = f(\hat x)$$.

Right continuity:

For any sequence $$\displaystyle [x_i]_{i=1,2,\ldots}$$ of points in the domain with $$\displaystyle x_i \geqslant \hat x$$ that tends to $$\displaystyle \hat x$$ from above, the corresponding sequence $$\displaystyle [f(x_i)]$$ tends to $$\displaystyle f(\hat x)$$, i.e., $$\displaystyle f(\hat x^+) = f(\hat x)$$.

For definition I: $$\displaystyle F_X(\hat x) = P_X(X < \hat x)$$

Take any sequence satisfying $$\displaystyle x_1 < x_2 < x_3 < \ldots \leqslant \hat x$$ with $$\displaystyle x_i \rightarrow \hat x$$. The events $$\displaystyle \{X < x_i\}$$ increase to $$\displaystyle \{X < \hat x\}$$, so by continuity of the probability measure from below, $$\displaystyle F_X(x_i) = P_X(X < x_i) \rightarrow P_X(X < \hat x) = F_X(\hat x)$$. That is to say, $$\displaystyle F_X(\hat x^-) = F_X(\hat x)$$. Hence, $$\displaystyle F_X(x)$$ is left continuous.

Actually, Kolmogorov has proved this statement in Kolmogorov 1956, p.23.

For definition II: $$\displaystyle F_X(\hat x) = P_X(X \leqslant \hat x) = 1 - P_X(X > \hat x)$$

Take any sequence satisfying $$\displaystyle x_1 > x_2 > x_3 > \ldots \geqslant \hat x$$ with $$\displaystyle x_i \rightarrow \hat x$$. The events $$\displaystyle \{X \leqslant x_i\}$$ decrease to $$\displaystyle \{X \leqslant \hat x\}$$, so by continuity of the probability measure from above, $$\displaystyle F_X(x_i) = P_X(X \leqslant x_i) \rightarrow P_X(X \leqslant \hat x) = F_X(\hat x)$$, i.e., $$\displaystyle F_X(\hat x^+) = F_X(\hat x)$$. Therefore, $$\displaystyle F_X(x)$$ is right continuous.

Inverse of CDF
For a strictly increasing, continuous cdf, the real number $$\displaystyle F_{X}^{-1}(u), 0<u<1 $$, is uniquely determined. But in general the cdf is not invertible; in that case we can use a generalized inverse cumulative distribution function. Both $$\displaystyle \inf$$ and $$\displaystyle \sup$$ can be used to define the inverse of the cdf.

For a left-continuous cdf:

Fig 5: Left Continuous CDF

Define the inverse of the cdf as follows. When $$\displaystyle u \in [u_1, u_2]$$,
$$\displaystyle F_{X}^{-1}(u):= \inf\{x\mid F_{X}(x) > u\}$$ (20)

or

$$\displaystyle F_{X}^{-1}(u):= \sup\{x\mid F_{X}(x) < u\}$$ (21)

When $$\displaystyle u \in [-\infty, u_1) \cup (u_2, +\infty]$$,

$$\displaystyle F_{X}^{-1}(u):= \{x\mid F_{X}(x) = u\} = \inf\{x\mid F_{X}(x) = u\} = \sup\{x\mid F_{X}(x) = u\}$$ (22)

In conclusion, for this case, we can define the inverse of the cdf as

$$\displaystyle F_{X}^{-1}(u):= \inf\{x\mid F_{X}(x) > u\}$$ (23)

or

$$\displaystyle F_{X}^{-1}(u):= \sup\{x\mid F_{X}(x) < u\}$$ (24)

For a right-continuous cdf:

Fig 6: Right Continuous CDF

Define the inverse of the cdf as follows. When $$\displaystyle u \in [u_1, u_2]$$,
$$\displaystyle F_{X}^{-1}(u):= \inf\{x\mid F_{X}(x) \geqslant u\}$$ (25)

or

$$\displaystyle F_{X}^{-1}(u):= \sup\{x\mid F_{X}(x) \leqslant u\}$$ (26)

When $$\displaystyle u \in [-\infty, u_1) \cup (u_2, +\infty]$$,

$$\displaystyle F_{X}^{-1}(u):= \{x\mid F_{X}(x) = u\} = \inf\{x\mid F_{X}(x) = u\} = \sup\{x\mid F_{X}(x) = u\}$$ (27)

In conclusion, for this case, we can define the inverse of the cdf as

$$\displaystyle F_{X}^{-1}(u):= \inf\{x\mid F_{X}(x) \geqslant u\}$$ (28)

or

$$\displaystyle F_{X}^{-1}(u):= \sup\{x\mid F_{X}(x) \leqslant u\}$$ (29)
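A rough numerical sketch of definitions (28) and (29) for the right-continuous case, evaluated on a grid (again using the fair-die cdf as an arbitrary example): away from a plateau of $$\displaystyle F_X$$ the inf- and sup-based inverses agree up to the grid resolution, while at a plateau value they differ.

<syntaxhighlight lang="python">
import numpy as np

def die_cdf(x):
    """Right-continuous cdf of a fair six-sided die, F(x) = P(X <= x)."""
    return np.clip(np.floor(np.asarray(x, dtype=float)), 0, 6) / 6.0

grid = np.linspace(-1.0, 7.0, 80_001)      # fine grid standing in for the real line
F = die_cdf(grid)

def inv_inf(u):
    """inf{x : F(x) >= u}, cf. eq (28), approximated on the grid."""
    return grid[F >= u].min()

def inv_sup(u):
    """sup{x : F(x) <= u}, cf. eq (29), approximated on the grid."""
    return grid[F <= u].max()

# u = 0.4 lies strictly between two plateau values: both definitions give ~3.
# u = 0.5 equals F(3), a plateau value: the definitions give ~3 and ~4.
for u in (0.4, 0.5):
    print(u, inv_inf(u), inv_sup(u))
</syntaxhighlight>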

Useful links
Useful properties of the inverse cdf.

= discussion on 2011.09.08 =

Question 1
$$\displaystyle \mathcal F = \{ \emptyset, { \rm heads}, { \rm tails}, \mathbf\Omega\}$$.

I think heads and tails should be subsets; they should be designated as $$\displaystyle \{ \rm heads \}$$ and $$\displaystyle \{ \rm tails\}$$. Am I right?

Answer 1
Strictly speaking, you can put the curly brackets around "heads" and "tails" to designate that they are singletons (i.e., sets with only one element). But in general it is not necessary to be so strict, since we can write

$${\rm heads} \in \mathcal F $$

or

$$\{ {\rm heads} \} \in \mathcal F $$

Writing style: Don't do a verbatim transcript of my notes; instead use the style of writing wiki articles as shown in the article Gradient_of_vector:_Two_tensor_conventions, i.e., the mediawiki article should read as an article, not lecture notes.