What Is a Conditional Expectation?
The conditional expectation of a random variable $Y$ is our updated prediction about it after learning the value of a related random variable $X$.
That is, once we know the value of $X$, we can use it to form a new prediction of $Y$. The conditional expectation expresses our predicted value of $Y$ in terms of $X$.
For instance, consider the following experiment. Suppose I have a bag with three balls, labelled $1,2,3$. I also have a fair coin.
I first pick a ball at random from the bag, then read the number and flip the coin that many times.
Let $X$ be the number shown on the ball, and $Y$ the number of Heads obtained from flipping the coin $X$ times.
Once $X$ is known, we can then form a conditional prediction about $Y$ in light of this information. For instance, if $X$ is 1, this means we will flip the coin once. Since Heads and Tails are equally likely, the expected number of Heads is then $0.5$. We write this as:
$$\mathbb{E}(Y \mid X=1) = 0.5 $$
This is read as: “the expected value of $Y$, given that $X=1$, is $0.5$”.
Likewise, we have:
$$\mathbb{E}(Y \mid X=2) = 1 $$
$$\mathbb{E}(Y \mid X=3) = 1.5 $$
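To see where the value for $X=2$ comes from, note that two fair flips give $0$, $1$ or $2$ Heads with probabilities $\tfrac{1}{4}$, $\tfrac{1}{2}$ and $\tfrac{1}{4}$ respectively. Weighting each outcome by its probability gives:
$$\mathbb{E}(Y \mid X=2) = 0 \cdot \tfrac{1}{4} + 1 \cdot \tfrac{1}{2} + 2 \cdot \tfrac{1}{4} = 1 $$
The case $X=3$ works the same way, with probabilities $\tfrac{1}{8}$, $\tfrac{3}{8}$, $\tfrac{3}{8}$, $\tfrac{1}{8}$ for $0$, $1$, $2$, $3$ Heads:
$$\mathbb{E}(Y \mid X=3) = 0 \cdot \tfrac{1}{8} + 1 \cdot \tfrac{3}{8} + 2 \cdot \tfrac{3}{8} + 3 \cdot \tfrac{1}{8} = 1.5 $$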
Now let $x$ be any number that $X$ might take. The above results can be summarised in the following formula:
$$\mathbb{E}(Y \mid X=x) = 0.5x, \quad x = 1,2,3 $$
You can check that this gives the correct value in each case. The formula should also make intuitive sense: each flip contributes $0.5$ Heads on average, so the expected number of Heads is proportional to the number of flips.
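If you would rather check the formula by computer, the following is a minimal sketch that computes $\mathbb{E}(Y \mid X=x)$ exactly by summing over the possible numbers of Heads. It assumes that, given $X=x$, the number of Heads follows a binomial distribution with $x$ trials and success probability $\tfrac{1}{2}$. The function name `conditional_expectation` is our own choice for illustration.

```python
from fractions import Fraction
from math import comb


def conditional_expectation(x: int) -> Fraction:
    """Exact E(Y | X = x), where Y counts Heads in x fair coin flips."""
    p = Fraction(1, 2)  # probability of Heads on a single fair flip
    # Sum k * P(Y = k | X = x) over every possible number of Heads k.
    return sum(
        (k * comb(x, k) * p**k * (1 - p) ** (x - k) for k in range(x + 1)),
        Fraction(0),
    )


for x in (1, 2, 3):
    # Print the exact value next to the formula 0.5x for comparison.
    print(x, conditional_expectation(x), Fraction(x, 2))
```

Using `Fraction` keeps the arithmetic exact, so the two printed columns can be compared without worrying about rounding.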
Finally, we can replace $x$ with $X$ in this expression to write the following:
$$\mathbb{E}(Y \mid X) = 0.5X $$
We might think of this as a “random predictor”: it is a function of $X$ that represents our prediction about $Y$, but seen from the point of view where $X$ is still random, that is, before drawing a ball from the bag.
Note that, unlike $0.5x$, which is a fixed number for each $x$, the quantity $\mathbb{E}(Y \mid X) = 0.5X$ is itself a random variable: its value depends on which ball is drawn.
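One way to make this concrete is to simulate the experiment many times. The sketch below assumes each ball is equally likely to be drawn and uses Python's standard `random` module. The helper names `run_experiment` and `estimate_conditional_means` are hypothetical, chosen only for this illustration. It groups the simulated runs by the value of $X$ and averages $Y$ within each group, which should land close to $0.5X$ for each possible ball.

```python
import random
from collections import defaultdict


def run_experiment(rng: random.Random) -> tuple[int, int]:
    """Draw a ball (X), then flip a fair coin X times and count Heads (Y)."""
    x = rng.randint(1, 3)  # ball label, each of 1, 2, 3 equally likely
    y = sum(rng.random() < 0.5 for _ in range(x))  # number of Heads in x flips
    return x, y


def estimate_conditional_means(n_trials: int, seed: int = 0) -> dict[int, float]:
    """Average Y separately for each observed value of X."""
    rng = random.Random(seed)
    totals = defaultdict(int)  # sum of Y over runs with a given X
    counts = defaultdict(int)  # number of runs with a given X
    for _ in range(n_trials):
        x, y = run_experiment(rng)
        totals[x] += y
        counts[x] += 1
    return {x: totals[x] / counts[x] for x in sorted(counts)}


means = estimate_conditional_means(100_000)
for x, mean_y in means.items():
    # Compare the simulated average of Y with the prediction 0.5 * x.
    print(x, round(mean_y, 3), 0.5 * x)
```

Before a run, we do not know which of the three rows will apply. That uncertainty about which prediction we will end up using is exactly what makes $0.5X$ a random variable.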