What is the Method of Moments?

The method of moments works in two steps: first, write the quantities of interest in terms of the moments of a distribution; then replace those moments with their standard sample estimators to obtain estimators of the quantities of interest.

Recall that if $Y$ is a random variable, its moments are the expected values of its powers:

$$\mathbb{E}(Y), \ \ \mathbb{E}(Y^2), \ \ \mathbb{E}(Y^3), \ \ \dots$$

Let’s write the $r^{th}$ moment $\mathbb{E}(Y^r)$ as $M_r$.

Now, suppose we have some data drawn from this distribution:

$$y_1, y_2, \dots , y_n$$

It is natural to estimate $M_r$ by the sample average of the $r^{th}$ powers of the data.

Let’s write $m_r$ for this estimate of $M_r$:

$$m_r = \frac{y_1^r+y_2^r+y_3^r+\dots + y_n^r}{n} = \frac{1}{n} \sum_{i=1}^n y_i^r $$

That is, we look at the average of the $r^{th}$ power of each data point.
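The formula for $m_r$ can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `sample_moment` and the data values are made up for the example.

```python
def sample_moment(data, r):
    """Return m_r, the average of the r-th powers of the data points."""
    if len(data) == 0:
        raise ValueError("need at least one observation")
    return sum(y ** r for y in data) / len(data)


# Hypothetical observations y_1, ..., y_n, chosen only for illustration.
data = [1.0, 2.0, 3.0, 4.0]

m1 = sample_moment(data, 1)  # (1 + 2 + 3 + 4) / 4
m2 = sample_moment(data, 2)  # (1 + 4 + 9 + 16) / 4
m3 = sample_moment(data, 3)  # (1 + 8 + 27 + 64) / 4
```

Note that $m_1$ is just the ordinary sample mean.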

Now, other quantities of interest can often be written in terms of the moments.

For instance, consider the skewness of a random variable:

$$ \gamma_1(Y) = \mathbb{E} \left( \left( \frac{Y-\mu}{\sigma} \right)^3 \right)$$

This can be written in terms of the moments as follows:

$$ \gamma_1(Y) = \frac { \mathbb{E} \left( \left(Y-\mu \right)^3 \right)}{\sigma^3} = \frac { \mathbb{E} (Y^3)-3\mu \ \mathbb{E}(Y^2)+3 \mu^2 \ \mathbb{E}(Y)-\mu^3}{(\mathbb{V}\text{ar}(Y))^{3/2}} = \frac { M_3-3M_1M_2+2M_1^3}{(M_2-M_1^2)^{3/2}} $$
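As a quick sanity check of the final expression, suppose $Y$ is exponential with rate $1$. Its moments are $M_r = r!$, so $M_1 = 1$, $M_2 = 2$ and $M_3 = 6$, and the formula gives:

$$ \gamma_1(Y) = \frac{6 - 3 \cdot 1 \cdot 2 + 2 \cdot 1^3}{(2 - 1^2)^{3/2}} = \frac{2}{1} = 2 $$

which agrees with the known skewness of the exponential distribution.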

Replacing the true moments with their sample estimators, and writing $Y_1, \dots, Y_n$ for the observations viewed as random variables, the numerator becomes:

$$ \left(\frac{1}{n}\sum_{i=1}^n Y_i^3 \right)-3 \left(\frac{1}{n} \sum_{i=1}^n Y_i \right) \left(\frac{1}{n} \sum_{i=1}^n Y_i^2 \right) +2 \left( \frac{1}{n} \sum_{i=1}^n Y_i \right)^3$$

and the denominator becomes:

$$\left( \left( \frac{1}{n} \sum_{i=1}^n Y_i^2 \right) - \left( \frac{1}{n} \sum_{i=1}^n Y_i \right) ^2 \right)^{3/2}$$

Taking the ratio of these two random variables gives the method of moments estimator of the unknown population skewness $\gamma_1(Y)$.
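In terms of the sample moments $m_r$ defined earlier, this estimator is $(m_3 - 3m_1m_2 + 2m_1^3)/(m_2 - m_1^2)^{3/2}$. The Python sketch below computes it directly from that expression. The names `sample_moment` and `mom_skewness` and both data sets are hypothetical, chosen only for illustration.

```python
def sample_moment(data, r):
    """Return m_r, the average of the r-th powers of the data points."""
    return sum(y ** r for y in data) / len(data)


def mom_skewness(data):
    """Method of moments estimate of skewness, built from m_1, m_2 and m_3."""
    m1 = sample_moment(data, 1)
    m2 = sample_moment(data, 2)
    m3 = sample_moment(data, 3)
    variance = m2 - m1 ** 2  # plug-in estimate of Var(Y), dividing by n
    if variance <= 0:
        # Happens when all observations are equal (up to rounding error).
        raise ValueError("skewness is undefined when the data show no spread")
    numerator = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    return numerator / variance ** 1.5


# Made-up data sets: one symmetric about its mean, one with a long right tail.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_skewed = [1.0, 1.0, 1.0, 1.0, 6.0]

skew_symmetric = mom_skewness(symmetric)   # zero, since the data are symmetric
skew_right = mom_skewness(right_skewed)    # positive, reflecting the right tail
```

One design caveat: computing the numerator from raw moments mirrors the derivation, but it subtracts quantities of similar size, so for data with a large mean relative to its spread it may lose floating-point precision. Centering the data first is a common alternative.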


Background: