Cumulative distribution function (CDF) explained using normal distribution, showing area under curve to the left of t and relation to PDF

What is a Cumulative Distribution Function (CDF)?

The CDF of a random variable $X$ at a value $t$ is defined as the probability that $X$ is less than or equal to $t$, that is, $\mathbb{P}(X \le t)$.

Recall that the PDF of a continuous random variable $X$ can be used to find the probability that $X$ lies within an interval.

To do this, we just look at the area under the PDF within that interval.

So, the CDF is the area under the PDF to the left of a certain value.

We write this probability as $F_X(t)$, which is defined for every real number $t$. Since it is a probability, we also have $0 \le F_X(t) \le 1$ for all $t$.

The CDF can be seen visually as the area to the left of a certain point $x=t$ on the PDF, as shown above.

Since we find areas by integrating, when $X$ is continuous with PDF $f_X(x)$ we have:

$$F_X(t) = \int_{-\infty}^{t} f_X(x)dx$$
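As a rough numerical check of this integral, the sketch below approximates $F_X(t)$ for a standard normal $X$ with the trapezoidal rule and compares it with the closed form based on the error function. The function names, the cutoff of $-10$ standing in for $-\infty$, and the step count are illustrative assumptions, not part of the definition.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """PDF of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def cdf_by_integration(t, lower=-10.0, steps=20_000):
    """Approximate F_X(t) as the area under the standard normal PDF from lower to t.

    Assumption: lower = -10 stands in for minus infinity, since the
    standard normal has negligible area to the left of -10.
    """
    if t <= lower:
        return 0.0
    h = (t - lower) / steps
    # Trapezoidal rule: endpoints get half weight, interior points full weight.
    total = 0.5 * (normal_pdf(lower) + normal_pdf(t))
    for i in range(1, steps):
        total += normal_pdf(lower + i * h)
    return total * h

def normal_cdf(t):
    """Closed-form standard normal CDF written with the error function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

if __name__ == "__main__":
    for t in (-1.0, 0.0, 1.0, 2.0):
        print(t, cdf_by_integration(t), normal_cdf(t))
```

The two columns should agree to several decimal places, which is what the area interpretation predicts.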

Notice that because $x$ is the variable of integration for the PDF, the argument of the CDF (which is also the upper limit of the integral) needs a different letter, such as $t$. Outside the integral there is no such clash, so the CDF is often written with $x$ as its argument, as $F_X(x)$.

Since differentiation “undoes” integration, at every point $x$ where $f_X$ is continuous we also have:

$$ f_X(x) = \frac{dF_X(x)}{dx} $$
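The relation between the PDF and the derivative of the CDF can also be checked numerically. The sketch below, again assuming a standard normal $X$, differentiates the CDF with a central difference and compares the result with the PDF. The helper name `derivative` and the step size `h` are illustrative choices.

```python
import math

def normal_pdf(x):
    """Standard normal PDF."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_cdf(x):
    """Standard normal CDF written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def derivative(f, x, h=1e-5):
    """Central-difference approximation of f'(x); h is an assumed step size."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

if __name__ == "__main__":
    for x in (-1.5, 0.0, 0.5, 2.0):
        # The numerical slope of the CDF should be close to the PDF value.
        print(x, derivative(normal_cdf, x), normal_pdf(x))
```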

The CDF is defined in exactly the same way for a discrete random variable, and indeed for any random variable: $F_X(x) = \mathbb{P}(X \le x)$.
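For a discrete random variable there is no PDF to integrate, so $\mathbb{P}(X \le x)$ is instead a sum of the probabilities of the values not exceeding $x$. The sketch below illustrates this for a fair six-sided die, which is an assumed example rather than one taken from the text above. The names `pmf` and `discrete_cdf` are hypothetical.

```python
from fractions import Fraction

# Assumed example: a fair six-sided die, each face with probability 1/6.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def discrete_cdf(x, pmf):
    """Return P(X <= x) by summing the probabilities of all values not exceeding x."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))

if __name__ == "__main__":
    for x in (0.5, 1, 3, 3.7, 6, 10):
        print(x, discrete_cdf(x, pmf))
```

The CDF here is a step function: it is defined for every real $x$, stays flat between the die faces, and jumps by $1/6$ at each face.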

Background: