When Are Two Random Variables Independent?
Intuitively, two random variables $X$ and $Y$ are independent if they have no effect on each other. They each “do their own thing”, behaving in their own individual way without being influenced by the other.
To formalise this mathematically in the case that $X$ and $Y$ are discrete, we look at their joint probability mass function (PMF). If they are independent, this will be the product of their individual probability mass functions.
That is, to find the probability of $X$ being equal to $x$ and $Y$ being equal to $y$, both at the same time, we multiply together their individual probabilities:
$$ \mathbb{P}(X=x,Y=y)=\mathbb{P}(X=x)\mathbb{P}(Y=y) $$
This formula must hold for all numbers $x$ and $y$.
If $X$ and $Y$ are instead continuous, the idea is the same, but we use their probability density functions (PDFs) rather than PMFs:
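As a rough illustration of the product rule, the sketch below checks it exactly for two small joint PMFs. The helper name `is_independent` and the two dice examples are assumptions made for this illustration, not part of the definition itself.

```python
from fractions import Fraction
from itertools import product

def is_independent(joint):
    """Return True if the joint PMF {(x, y): p} factorises into its marginals."""
    xs = sorted({x for x, _ in joint})
    ys = sorted({y for _, y in joint})
    # Marginal PMFs: sum the joint PMF over the other variable.
    p_x = {x: sum(joint.get((x, y), Fraction(0)) for y in ys) for x in xs}
    p_y = {y: sum(joint.get((x, y), Fraction(0)) for x in xs) for y in ys}
    # The product rule must hold for every pair (x, y),
    # including pairs whose joint probability is 0.
    return all(
        joint.get((x, y), Fraction(0)) == p_x[x] * p_y[y]
        for x, y in product(xs, ys)
    )

# Two fair dice rolled separately: every pair has probability 1/36.
two_dice = {(x, y): Fraction(1, 36) for x in range(1, 7) for y in range(1, 7)}

# X is a fair die and Y = X: the two outcomes are tied together.
copied_die = {(x, x): Fraction(1, 6) for x in range(1, 7)}

print(is_independent(two_dice))    # True
print(is_independent(copied_die))  # False
```

Exact fractions are used instead of floats so that the equality test is not affected by rounding.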
$$ f_{X,Y}(x,y) = f_{X}(x) f_{Y}(y) \ \ \text{ for all } \ x,y$$
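As a small worked example (the two densities below are chosen purely for illustration), take $f_{X,Y}(x,y) = 4xy$ on the unit square $0 \le x,y \le 1$. Integrating out the other variable gives the marginals, and their product recovers the joint PDF, so $X$ and $Y$ are independent:
$$ f_X(x) = \int_0^1 4xy \, dy = 2x, \qquad f_Y(y) = \int_0^1 4xy \, dx = 2y, \qquad f_X(x) f_Y(y) = 4xy = f_{X,Y}(x,y) $$
By contrast, if $f_{X,Y}(x,y) = x + y$ on the same square, the product of the marginals does not match the joint PDF, so $X$ and $Y$ are not independent:
$$ f_X(x) = x + \tfrac{1}{2}, \qquad f_Y(y) = y + \tfrac{1}{2}, \qquad f_X(x) f_Y(y) = xy + \tfrac{x}{2} + \tfrac{y}{2} + \tfrac{1}{4} \ne x + y $$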
In fact, we can handle both cases at once by using the cumulative distribution function (CDF). That is, $X$ and $Y$ are independent if:
$$ \mathbb{P}(X \le x,Y \le y)=\mathbb{P}(X \le x)\mathbb{P}(Y \le y) \ \ \text{ for all } \ x,y$$
We can also write this as:
$$ F_{X,Y}(x,y) = F_X(x)F_Y(y) \ \ \text{ for all } \ x,y$$
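The CDF condition can also be explored by simulation. The sketch below estimates both sides of the equation at a single point from random samples. The helper `cdf_gap`, the sample size, and the choice of Uniform(0, 1) variables are all assumptions for this illustration. A small gap at one point is only consistent with independence, since the definition requires the equality for all $x$ and $y$.

```python
import random

def cdf_gap(pairs, x, y):
    """Empirical |F_{X,Y}(x, y) - F_X(x) F_Y(y)| at one point (x, y)."""
    n = len(pairs)
    joint = sum(1 for a, b in pairs if a <= x and b <= y) / n
    f_x = sum(1 for a, _ in pairs if a <= x) / n
    f_y = sum(1 for _, b in pairs if b <= y) / n
    return abs(joint - f_x * f_y)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000

# Independent case: X and Y are drawn separately from Uniform(0, 1).
independent = [(rng.random(), rng.random()) for _ in range(n)]

# Dependent case: Y is just a copy of X.
dependent = [(u, u) for u in (rng.random() for _ in range(n))]

gap_independent = cdf_gap(independent, 0.5, 0.5)
gap_dependent = cdf_gap(dependent, 0.5, 0.5)
print(gap_independent)  # close to 0
print(gap_dependent)    # close to 0.25
```

In the dependent case the joint probability at $(0.5, 0.5)$ is about $0.5$, while the product of the marginals is about $0.25$, which is where the gap comes from.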
If we have more than two random variables, the definition extends naturally. In general, we say that the random variables $X_1, \dots, X_n$ are independent if:
$$ \mathbb{P}(X_1 \le x_1, \dots , X_n \le x_n)=\mathbb{P}(X_1 \le x_1) \dots \mathbb{P}(X_n \le x_n) \ \ \text{ for all } \ x_1, \dots, x_n $$
That is, to get the “joint” CDF, we can multiply the “marginal” ones.
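Note that this is a stronger condition than saying that each pair $X_i$ and $X_j$ is independent, for all $i \ne j$. Independence of the whole set implies that every pair is independent, but every pair being independent does not imply independence of the whole set.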
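A standard counterexample makes the difference concrete: let $X$ and $Y$ be fair coin flips taking values 0 or 1, and let $Z$ be 1 exactly when $X$ and $Y$ differ. The sketch below checks the product rule with PMFs, which is enough here because all three variables are discrete. The helper names `prob` and `factorises` are assumptions for this illustration.

```python
from fractions import Fraction
from itertools import combinations, product

# Joint PMF of (X, Y, Z): X and Y are fair coin flips, Z = X XOR Y.
joint = {(x, y, x ^ y): Fraction(1, 4) for x in (0, 1) for y in (0, 1)}

def prob(event):
    """Probability that every coordinate in `event` ({index: value}) occurs."""
    return sum(
        p for outcome, p in joint.items()
        if all(outcome[i] == v for i, v in event.items())
    )

def factorises(indices):
    """Check the product rule for the variables at `indices`, over all 0/1 values."""
    for values in product((0, 1), repeat=len(indices)):
        event = dict(zip(indices, values))
        expected = Fraction(1)
        for i, v in event.items():
            expected *= prob({i: v})  # multiply the marginal probabilities
        if prob(event) != expected:
            return False
    return True

pairwise = all(factorises(pair) for pair in combinations(range(3), 2))
mutual = factorises((0, 1, 2))
print(pairwise)  # True
print(mutual)    # False
```

Every pair factorises because each pair of values has probability $1/4$, but the triple does not: for example $\mathbb{P}(X=0,Y=0,Z=0) = 1/4$, whereas the product of the three marginals is $1/8$.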