Slide explaining estimators as random variables using a women’s height sample, showing the sample mean and its variance

Estimators as Random Variables

An estimator is a random variable which we use to estimate an unknown quantity. It must be a function of a random sample, such as

$$Y_1,Y_2,\dots ,Y_n$$

This sequence of random variables might represent, for example, the heights of $n$ women we are planning to select at random from the UK population. They are random because we have not yet selected them – like when we plan to roll a die but have not yet done so.

To say they are a “random sample” means they are independent and all have the same distribution, and hence the same mean $\mu$ and variance $\sigma^2$ (when these exist).

A function of $Y_1,Y_2,\dots ,Y_n$ is called a “statistic” – the same as the name of the subject itself!

One important example of an estimator is the sample mean:

$$\bar{Y}=\frac{Y_1+Y_2+\dots +Y_n}{n}$$

We use the sample mean estimator to estimate the population mean, $ \mu = \mathbb{E}(Y_i)$.

Now, since estimators like $\bar{Y}$ are themselves random variables, they also have expectations and variances.

For instance, the expected value of $\bar{Y}$ is:

$$\mathbb{E}(\bar{Y}) = \mathbb{E}\left(\frac{Y_1+Y_2+\dots +Y_n}{n}\right) = \frac{\mathbb{E}(Y_1)+\mathbb{E}(Y_2)+\dots +\mathbb{E}(Y_n)}{n}=\frac{\mu + \mu + \dots + \mu}{n}=\mu$$

Note that this is exactly the quantity $\mu$ that we are using $\bar{Y}$ to estimate, which is encouraging.
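As a rough numerical check of this result, the following Python sketch draws many random samples and averages their sample means. The normal distribution, the values `MU = 162.0` and `SIGMA = 7.0`, the sample size of 25 and the helper names are all assumptions made purely for illustration; they are not figures from any real height survey.

```python
import random
import statistics


def sample_mean(sample):
    """Return the sample mean of a list of observations."""
    return sum(sample) / len(sample)


def simulate_sample_means(mu, sigma, n, repetitions, seed=0):
    """Draw `repetitions` independent samples of size n; return their sample means."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [
        sample_mean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(repetitions)
    ]


# Hypothetical population mean and standard deviation (illustrative only).
MU, SIGMA = 162.0, 7.0

means = simulate_sample_means(MU, SIGMA, n=25, repetitions=20000)
average_of_means = statistics.fmean(means)

print(f"average of simulated sample means: {average_of_means:.2f} (mu = {MU})")
```

Each individual sample mean differs from $\mu$, but their average over many repetitions should sit very close to it, which is what $\mathbb{E}(\bar{Y})=\mu$ says.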

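Meanwhile, we can also find its variance using two rules: the variance of a sum of independent random variables is the sum of their variances, and a constant factor comes out squared, so that $\mathbb{V}\text{ar}(aX)=a^2\,\mathbb{V}\text{ar}(X)$ (here $a = 1/n$).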

$$\mathbb{V}\text{ar}(\bar{Y}) = \mathbb{V}\text{ar}\left(\frac{Y_1+Y_2+\dots +Y_n}{n}\right) = \frac{\mathbb{V}\text{ar}(Y_1)+\mathbb{V}\text{ar}(Y_2)+\dots +\mathbb{V}\text{ar}(Y_n)}{n^2}=\frac{\sigma^2 + \sigma^2 + \dots + \sigma^2}{n^2}=\frac{\sigma^2}{n}$$

So, the distribution of our estimator will be less “spread out” if we take a larger sample size; generally, this extra precision is also thought to be a good thing.
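The effect of the sample size can be illustrated with a small simulation. The sketch below estimates the variance of $\bar{Y}$ for a few sample sizes and compares it with $\sigma^2/n$. As before, the normal distribution, the values of `MU` and `SIGMA`, the chosen sample sizes and the function name are assumptions for illustration only.

```python
import random
import statistics


def variance_of_sample_mean(mu, sigma, n, repetitions, rng):
    """Estimate Var(Y-bar) by simulating many independent samples of size n."""
    means = [
        statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(repetitions)
    ]
    # Sample variance of the simulated sample means.
    return statistics.variance(means)


# Hypothetical population mean and standard deviation (illustrative only).
MU, SIGMA = 162.0, 7.0
rng = random.Random(1)  # fixed seed so the run is reproducible

results = {}
for n in (5, 20, 80):
    results[n] = variance_of_sample_mean(MU, SIGMA, n, repetitions=5000, rng=rng)
    theoretical = SIGMA**2 / n
    print(f"n = {n:3d}: simulated {results[n]:.3f}, sigma^2/n = {theoretical:.3f}")
```

Quadrupling the sample size should cut the simulated variance to roughly a quarter, in line with the $\sigma^2/n$ formula.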

