What is Sampling?
When we pick an individual from a population to make an observation about them, we think of their response as a random variable. We then think of population characteristics as features of this random variable.
Suppose, for instance, we want to know the average height of women in the UK. Let’s call this unknown number $\mu$.
Now let $Y$ represent the height of a random woman, yet to be chosen from the UK population. Then we think of $\mu$ as the expectation of this “population” random variable – that is, $\mu=\mathbb{E}(Y)$.
A fundamental principle of statistics is that we think of $Y$ in the same way as the outcome of a randomisation device, such as the outcome of a fair die roll, $X$.
However, with $X$ we know the probability of every outcome, so we can calculate the expected value directly:
$$\mathbb{E}(X) = 1\times \frac{1}{6}+2\times \frac{1}{6}+3\times \frac{1}{6}+4\times \frac{1}{6}+5\times \frac{1}{6}+6\times \frac{1}{6}=3.5$$
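The same sum can be checked numerically. The following is a minimal sketch that weights each face of the die by its probability; the variable names (`faces`, `prob`, `expected_value`) are illustrative choices, not part of the original text.

```python
from fractions import Fraction

# Faces of a fair six-sided die; each face has probability 1/6.
faces = range(1, 7)
prob = Fraction(1, 6)

# E(X) is the sum over all faces of (value * probability).
expected_value = sum(face * prob for face in faces)

print(expected_value)         # 7/2
print(float(expected_value))  # 3.5
```

Using `Fraction` keeps the arithmetic exact, so the result matches the hand calculation rather than a rounded floating-point value.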
But with the knowledge we have, there is no such procedure available for $\mathbb{E}(Y)$. For that, we’d have to know the full probability distribution (the PDF or PMF) of women’s heights in the UK – but that is a far more complex object than the single number we’re looking for!
We therefore resort to guessing; that is, we ask a sample of women their heights and use this data to estimate $\mu=\mathbb{E}(Y)$.
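To make the idea concrete, here is a small simulation sketch. It is built entirely on assumptions: the "population" is a made-up normal distribution whose mean and standard deviation (`HYPOTHETICAL_MU`, `HYPOTHETICAL_SIGMA`) are invented for illustration and are not real UK figures. In practice we would not know these values; the point is only to show the sample mean being used as an estimate of $\mu$.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# ASSUMPTION: an invented population, not real UK data.
HYPOTHETICAL_MU = 162.0    # "true" mean height in cm (made up)
HYPOTHETICAL_SIGMA = 7.0   # standard deviation in cm (made up)


def sample_heights(n):
    """Draw n heights (in cm) from the hypothetical population."""
    return [random.gauss(HYPOTHETICAL_MU, HYPOTHETICAL_SIGMA) for _ in range(n)]


# "Ask" 1000 randomly chosen women their heights.
sample = sample_heights(1000)

# The sample mean serves as our estimate of mu = E(Y).
estimate = statistics.mean(sample)

print(f"Estimate of mu from {len(sample)} observations: {estimate:.2f} cm")
```

Rerunning with a different seed or a different sample size gives a different estimate, which is exactly why the quality of such estimates needs to be studied.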