User talk:Hylon.chen/Probability concepts

Put your questions here.

= discussion on 2011.09.22 =

 Question 5: 

The most straightforward way to generate non-uniform random variables is to invert the distribution function. For example, an exponential random variable can be obtained directly by inverting the exponential distribution function. Given that the exponential density function is

$$\displaystyle f_{X}(x)=ae^{-ax}, a> 0$$

It's easy to get the distribution function as

$$\displaystyle F_{X}(x)=\int_{0}^{x }f_{X}(y)dy=\int_{0}^{x }ae^{-ay}dy=1-e^{-ax}$$.

Hence, inverting the distribution function, i.e., solving $$\displaystyle u=F_{X}(x)$$ for $$\displaystyle x$$, gives

$$\displaystyle F_{X}^{-1}(u)=-\frac{\log(1-u)}{a}$$
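The inversion can be sketched in a few lines of Python. This is only a minimal illustration: the function name <code>sample_exponential</code>, the rate $$\displaystyle a=2$$, the seed, and the sample size are all assumptions chosen for the example, not taken from Xiu 2010.

```python
import math
import random


def sample_exponential(a, rng):
    """Draw one exponential sample by inverting F(x) = 1 - exp(-a x)."""
    u = rng.random()               # uniform on [0, 1)
    return -math.log(1.0 - u) / a  # 1 - u lies in (0, 1], so the log is defined


rng = random.Random(0)  # fixed seed, used only for reproducibility
a = 2.0                 # illustrative rate parameter
samples = [sample_exponential(a, rng) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)
print(sample_mean)      # should be close to the exact mean 1/a
```

The sample mean is compared with $$\displaystyle 1/a$$ only as a quick sanity check on the sampler.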

In most cases, however, $$\displaystyle F_{X}^{-1} $$ isn't available in explicit form (the normal distribution is one example), so an approximation must be used instead.

My question is: can you explain in detail how an approximation method can be used to generate such random variables?
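One possible approach, not necessarily the one intended in Xiu 2010, is to invert the distribution function numerically. The sketch below assumes a standard normal distribution, writes its distribution function with <code>math.erf</code>, and solves $$\displaystyle F_{X}(x)=u$$ by bisection. The names <code>normal_cdf</code> and <code>normal_inverse_cdf</code>, the bracket $$\displaystyle [-10,10]$$, and the tolerance are illustrative assumptions.

```python
import math
import random


def normal_cdf(x):
    """Standard normal distribution function, written with math.erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_inverse_cdf(u, lo=-10.0, hi=10.0, tol=1e-10):
    """Approximate F^{-1}(u) by bisection; lo, hi and tol are illustrative."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if normal_cdf(mid) < u:
            lo = mid   # the solution lies to the right of mid
        else:
            hi = mid   # the solution lies at or to the left of mid
    return 0.5 * (lo + hi)


rng = random.Random(1)  # fixed seed, used only for reproducibility
samples = [normal_inverse_cdf(rng.random()) for _ in range(20_000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, var)        # should be close to 0 and 1
```

Bisection is slow but robust, since it only uses the fact that $$\displaystyle F_{X}$$ is increasing. Other approximations exist and may be preferable in practice.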

 Question 6: 

From Xiu 2010, p.19, we know that the conditional distribution function of a random variable $$\displaystyle X$$ given $$\displaystyle B$$, provided that $$\displaystyle P(B)>0$$, is

$$\displaystyle F_{X}(x\mid B)=\frac{P(X\leqslant x,B)}{P(B)}, x\in \mathbb{R}$$,

and also the conditional expectation of $$\displaystyle X$$ given $$\displaystyle B$$ is

$$\displaystyle E[X\mid B]=\frac{E[XI_{B}]}{P(B)}$$,

where $$\displaystyle I_{B}(\omega)=\left\{\begin{matrix} 1,\;\; \omega \in B,\\ 0,\;\;\omega \notin B, \end{matrix}\right. $$

is the indicator function of the event $$\displaystyle B$$.

My question is: how can we get these two expressions?

Note: The conditional probability of $$\displaystyle A$$ given $$\displaystyle B$$ is

$$\displaystyle P(A\mid B)=\frac{P(A\cap B)}{P(B)}$$.
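As a sketch of where the two expressions may come from, take $$\displaystyle A=\{X\leqslant x\}$$ in the conditional probability above. The second line additionally assumes that $$\displaystyle X$$ is discrete with values $$\displaystyle x_{k}$$; this assumption is made here only to keep the sums elementary and is not part of the question.

$$\displaystyle F_{X}(x\mid B)=P(X\leqslant x\mid B)=\frac{P(\{X\leqslant x\}\cap B)}{P(B)}=\frac{P(X\leqslant x,B)}{P(B)}$$

$$\displaystyle E[X\mid B]=\sum_{k}x_{k}P(X=x_{k}\mid B)=\frac{1}{P(B)}\sum_{k}x_{k}P(X=x_{k},B)=\frac{E[XI_{B}]}{P(B)}$$

The last equality holds because $$\displaystyle XI_{B}$$ equals $$\displaystyle x_{k}$$ on the event $$\displaystyle \{X=x_{k}\}\cap B$$ and zero outside $$\displaystyle B$$.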

= discussion on 2011.09.15 =

 Question 2: 

By your definition, the $$\sigma\,\;$$-field $$\mathcal F $$ is the collection of all subsets of $$\Omega$$. But in Xiu 2010, p.10, the $$\sigma\,\;$$-field $$\mathcal F $$ is defined as a collection of subsets of $$\Omega$$. So, which one is better?

 Answer 2: 

Clearly the definition of a $$\sigma\,\;$$-field $$\mathcal F $$ that includes all subsets of $$\Omega$$ is a stronger statement than the definition that only includes a collection of subsets of $$\Omega$$. The weaker definition, i.e., the one that only includes a collection of subsets of $$\Omega$$, is more general. We need to look up some references to verify, e.g., some mathematical statistics / probability books.

Consider a counterexample to the statement that any collection of subsets of $$\Omega$$ forms a sigma-field: let $$\Omega = \{1, 2, 3\}$$ and $$\mathcal F :=\{\emptyset, \{1\}, \{2\}, \Omega\}$$. Then $$\{1\}\cup \{2\}=\{1, 2\} \notin \mathcal F$$, so clearly $$\mathcal F$$ can't be a sigma-field. The point here is that you can't take an arbitrary collection of subsets of $$\Omega$$ to form a sigma-field; you need to take a collection of subsets of $$\Omega$$ that satisfies 3 conditions for the set $$\mathcal F$$ to be a sigma-field.

Three conditions that the sigma-field must satisfy:

$$\bullet$$ Not empty: $$\Omega \in \mathcal F$$ and $$\emptyset \in \mathcal F$$;

$$\bullet$$ Given $$A \in \mathcal F$$, then $$A^c \in \mathcal F$$;

$$\bullet$$ Given $$A_1$$, $$A_2$$,...$$\in \mathcal F$$, then

$$\bigcup_{i=1}^{\infty} A_{i} \in \mathcal F$$ and $$\bigcap_{i=1}^{\infty} A_{i} \in \mathcal F$$.
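For a finite $$\Omega$$ the three conditions can be checked mechanically. The following is a minimal sketch, assuming a finite sample space so that closure under pairwise unions is enough; the helper name <code>is_sigma_field</code> is hypothetical. Closure under intersections is not tested separately, since it follows from complements and unions by De Morgan's laws.

```python
from itertools import combinations


def is_sigma_field(omega, collection):
    """Check the sigma-field conditions for subsets of a finite omega."""
    omega = frozenset(omega)
    family = {frozenset(s) for s in collection}
    # Condition 1: omega and the empty set belong to the family
    if omega not in family or frozenset() not in family:
        return False
    # Condition 2: closed under complements
    for a in family:
        if omega - a not in family:
            return False
    # Condition 3: on a finite omega, pairwise unions suffice
    for a, b in combinations(family, 2):
        if a | b not in family:
            return False
    return True


omega = {1, 2, 3}
counterexample = [set(), {1}, {2}, omega]  # the collection from the counterexample
trivial = [set(), omega]                   # smallest sigma-field on omega
generated = [set(), {1}, {2, 3}, omega]    # contains {1} and its complement

print(is_sigma_field(omega, counterexample))  # False
print(is_sigma_field(omega, trivial))         # True
print(is_sigma_field(omega, generated))       # True
```

The collection <code>generated</code> shows that a sigma-field does not have to contain all subsets of $$\Omega$$.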

 Question 3: 

For a continuous distribution, the probability of any particular value is zero, $$\displaystyle P(X=x)=0, \forall x \in \mathbb R$$. Mathematically, I can understand this concept: since the distribution is continuous, we have

$$\displaystyle \lim_{\epsilon \rightarrow 0}F_{X}(x+\epsilon )=F_{X}{(x)}, \forall x$$.

But intuitively, I can't fully accept it. Why is the probability of any particular value zero?

 Answer 3: 

For discrete events, like throwing a die with 6 faces, you can talk about the probability of getting, say, the number 3, and that's 1/6.

But for continuous events, you need to think in terms of the area under the curve, similar to using the trapezoidal rule to integrate.

A better explanation would be that there are infinitely many real numbers; the set of real numbers, unlike the set of integers, is very dense. For example, given any two numbers that differ by only $$\displaystyle \epsilon$$, with $$\displaystyle \epsilon$$ being very small (as small as you can imagine, e.g., $$\displaystyle 10^{-36}$$), you can always find infinitely many real numbers in between these two numbers.

So for continuous events described by real numbers, the probability of throwing a "generalized die" to get a particular real number is zero!
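The same point can be made numerically: $$\displaystyle P(X=x)$$ is bounded above by the probability that $$\displaystyle X$$ falls in a small interval around $$\displaystyle x$$, namely $$\displaystyle F_{X}(x+\epsilon)-F_{X}(x-\epsilon)$$, and this bound shrinks with $$\displaystyle \epsilon$$. The sketch below assumes, purely for illustration, an exponential distribution with rate 1 and the point $$\displaystyle x=1$$.

```python
import math


def exponential_cdf(x, a=1.0):
    """F(x) = 1 - exp(-a x) for x > 0, and 0 otherwise."""
    return 1.0 - math.exp(-a * x) if x > 0 else 0.0


x = 1.0  # illustrative point
widths = [10.0 ** (-k) for k in range(1, 8)]
# Probability that X lands within eps of x, for shrinking eps
probs = [exponential_cdf(x + eps) - exponential_cdf(x - eps) for eps in widths]
for eps, p in zip(widths, probs):
    print(eps, p)  # p decreases towards zero together with eps
```

Every interval has positive probability, but the single point left in the limit has probability zero.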

 Question 4: 

In Xiu 2010, p.15, proposition 2.11 states: let $$\displaystyle F_{X}(x)=P(X\leqslant x)$$ be the distribution function of $$X$$. Then the following results hold.

$$\bullet$$ $$\displaystyle u\leq F_{X}(x)\Leftrightarrow F_{X}^{-1}(u)\leqslant x$$.

$$\bullet$$ If $$U$$ is uniform in $$\displaystyle \left ( 0,1 \right )$$, then $$\displaystyle F_{X}^{-1}(U)$$ has distribution function $$\displaystyle F_{X}$$.

$$\bullet$$ If $$\displaystyle F_{X}$$ is continuous, then $$\displaystyle F_{X}(X)$$ is uniform in $$\displaystyle \left ( 0,1 \right )$$.

Can you explain the above proposition?

 Answer 4: 



Install inkscape to plot the figures that I drew in the above [[media:Hailong.2011_09_15_10_31_23.pdf |Explanation]]. See Writing tools.
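Independently of the figures in the linked explanation, the first two statements of proposition 2.11 can be checked numerically on a simple case. The sketch below assumes a fair six-sided die and assumes that $$\displaystyle F_{X}^{-1}$$ denotes the generalized inverse $$\displaystyle F_{X}^{-1}(u)=\inf\{x: F_{X}(x)\geqslant u\}$$; that definition, and the names <code>die_cdf</code> and <code>die_quantile</code>, are assumptions made here and are not quoted from Xiu 2010.

```python
import math
import random
from collections import Counter
from fractions import Fraction


def die_cdf(x):
    """F(x) = P(X <= x) for a fair six-sided die, as an exact fraction."""
    return Fraction(min(max(math.floor(x), 0), 6), 6)


def die_quantile(u):
    """Generalized inverse inf{x : F(x) >= u}, for 0 <= u <= 1 (assumed definition)."""
    return next(k for k in range(1, 7) if die_cdf(k) >= u)


# First statement: u <= F(x) if and only if F^{-1}(u) <= x, on a grid
us = [Fraction(k, 60) for k in range(1, 61)]
xs = [k / 2 for k in range(-2, 17)]
equivalence_holds = all(
    (u <= die_cdf(x)) == (die_quantile(u) <= x) for u in us for x in xs
)
print(equivalence_holds)

# Second statement: F^{-1}(U) with U uniform should behave like the die
rng = random.Random(2)  # fixed seed, used only for reproducibility
n = 30_000
counts = Counter(die_quantile(rng.random()) for _ in range(n))
freqs = {k: counts[k] / n for k in range(1, 7)}
print(freqs)            # each frequency should be close to 1/6
```

The third statement does not apply to this example: the distribution function of a die is not continuous, and $$\displaystyle F_{X}(X)$$ takes only six values, so it cannot be uniform in $$\displaystyle \left ( 0,1 \right )$$.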

= discussion on 2011.09.08 =

<b> Question 1: </b>

$$\displaystyle \mathcal F = \{ \emptyset, { \rm heads}, { \rm tails}, \mathbf\Omega\}$$.

I think heads and tails should be subsets, so they should be written as $$\displaystyle \{ \rm heads \}$$ and $$\displaystyle \{ \rm tails\}$$. Am I right?

<b> Answer to Q1: </b> Strictly speaking, you can put the curly brackets around "heads" and "tails" to designate that they are singletons (i.e., sets with only one element). But in general it is not necessary to be so strict, since we can write

$${\rm heads} \in \mathcal F $$

or

$$\{ {\rm heads} \} \in \mathcal F $$

<b> -END Answer to Q1- </b>

Writing style: Don't do a verbatim transcript of my notes; instead use the style of writing wiki articles as shown in the article Gradient_of_vector:_Two_tensor_conventions, i.e., the mediawiki article should read as an article, not lecture notes.