Probability Distributions: Continuous Distributions (Introduction)

This lesson introduces continuous probability distributions, a crucial concept in data science. You'll learn the difference between discrete and continuous variables, explore key examples of continuous distributions like the uniform distribution, and understand how to calculate probabilities within these distributions.

Learning Objectives

  • Define and differentiate between discrete and continuous random variables.
  • Explain the concept of probability density function (PDF).
  • Describe the uniform distribution and calculate probabilities using it.
  • Recognize real-world scenarios where continuous distributions apply.

Lesson Content

Discrete vs. Continuous Random Variables

In statistics, a random variable is a variable whose value is a numerical outcome of a random phenomenon. There are two main types: discrete and continuous.

  • Discrete Random Variables: These take on a finite or countably infinite set of values, usually whole numbers. Examples include: the number of heads when flipping a coin 5 times (can be 0, 1, 2, 3, 4, or 5), the number of cars passing a point in an hour, or the number of students in a class. Each specific value has its own probability, and those probabilities sum to 1.

  • Continuous Random Variables: These can take on any value within a given range. Examples include: height, weight, temperature, or the time it takes to complete a task. For a continuous variable, the probability of any single exact value is 0, so we calculate the probability of the value falling within a range instead. Think of measuring height: you might say a person is 5 feet tall, but they could also be 5.01 feet tall, 5.001 feet tall, and so on. It's impossible to list every possible value.

Example:

  • Discrete: Number of siblings (0, 1, 2, 3, ...)
  • Continuous: Height of a person (e.g., 5.8 feet, 6.1 feet, etc.)
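To make the discrete case concrete, here is a minimal sketch that lists the probability of each possible number of heads in 5 coin flips. It assumes the coin is fair, which the lesson does not state, and the helper name heads_pmf is purely illustrative.

```python
from math import comb

def heads_pmf(k, n=5):
    """P(exactly k heads in n flips), assuming a fair coin (an assumption)."""
    return comb(n, k) / 2 ** n

# A discrete variable has one probability per possible value.
pmf = {k: heads_pmf(k) for k in range(6)}
for k, p in pmf.items():
    print(f"P(heads = {k}) = {p:.5f}")

# The probabilities of all possible values add up to 1.
print("total:", sum(pmf.values()))
```

Every possible outcome can be listed along with its probability. No such table can be written for a continuous variable like height, which is why the next section works with ranges of values.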

Probability Density Function (PDF)

For continuous random variables, we use a Probability Density Function (PDF), often denoted as f(x). The PDF describes the relative likelihood of a random variable taking on a given value. It's represented as a curve, and the area under the curve between two points represents the probability that the variable falls within that range.

  • The area under the entire PDF curve always equals 1 (representing 100% probability).
  • The y-axis of a PDF represents probability density, NOT probability. Probability is found by calculating the area under the curve for a specific range of values.
  • The probability of a continuous variable taking on any single exact value is 0, because a single point has zero width and area = width * height.
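The three properties above can be checked numerically. The sketch below uses a hypothetical density that is constant at 2 on the interval [0, 0.5] and 0 elsewhere, chosen only because its height is greater than 1; the function names and the midpoint-rule approximation are illustrative choices, not part of the lesson.

```python
def pdf(x, a=0.0, b=0.5):
    """A constant density on [a, b] (hypothetical example values)."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def area(f, lo, hi, steps=10_000):
    """Midpoint-rule approximation of the area under f between lo and hi."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width

# The height of the curve is a density, so it is allowed to exceed 1.
print("density at 0.25:", pdf(0.25))

# The total area under the curve is (approximately) 1.
print("total area:", area(pdf, 0.0, 0.5))

# As the range around a point shrinks, its probability shrinks toward 0.
for eps in (0.1, 0.01, 0.001):
    print("width", eps, "-> probability", area(pdf, 0.25 - eps / 2, 0.25 + eps / 2))
```

The density of 2 is not a probability. Only the areas are, and they never exceed 1.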

The Uniform Distribution

The uniform distribution is the simplest continuous distribution. It's characterized by a constant probability density across a defined range. This means that any two subintervals of the same length within that range are equally likely to contain the value. Imagine an ideal random number generator that produces numbers between 0 and 1; the uniform distribution on [0, 1] describes it.

  • Definition: A continuous random variable X is uniformly distributed on the interval [a, b], with a < b, if its PDF is:

    • f(x) = 1 / (b - a) for a ≤ x ≤ b
    • f(x) = 0 otherwise
  • Probability Calculation: To find the probability that X falls between two values, say c and d (where a ≤ c ≤ d ≤ b), you calculate the area under the PDF curve between c and d. This is just a rectangle, so the area (probability) is:

    P(c ≤ X ≤ d) = (d - c) / (b - a)
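The rectangle-area formula above translates directly into code. This is a minimal sketch; the function name uniform_prob and its input checks are illustrative assumptions rather than part of the lesson.

```python
def uniform_prob(c, d, a, b):
    """P(c <= X <= d) for X uniform on [a, b], using (d - c) / (b - a)."""
    if not a < b:
        raise ValueError("the interval [a, b] needs a < b")
    if not a <= c <= d <= b:
        raise ValueError("expected a <= c <= d <= b")
    # Rectangle area: width (d - c) times height 1 / (b - a).
    return (d - c) / (b - a)

print(uniform_prob(0.25, 0.75, 0.0, 1.0))
```

The checks mirror the condition a ≤ c ≤ d ≤ b stated in the formula; ranges that extend outside [a, b] would need to be clipped first, which this sketch does not attempt.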

Example:

Suppose a random variable X follows a uniform distribution between 0 and 10.
  • a = 0, b = 10
  • The PDF is f(x) = 1/10 for 0 ≤ x ≤ 10, and 0 otherwise.
  • What's the probability that X is between 2 and 5? P(2 ≤ X ≤ 5) = (5 - 2) / (10 - 0) = 3/10 = 0.3, or 30%.
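One way to sanity-check the 0.3 result is by simulation: draw many values uniformly between 0 and 10 and count how often they land between 2 and 5. The sketch below uses Python's standard random module; the function name, the number of trials, and the seed are arbitrary choices made for illustration.

```python
import random

def estimate_prob(c, d, a, b, trials=100_000, seed=0):
    """Estimate P(c <= X <= d) for X uniform on [a, b] by random sampling."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same estimate
    hits = sum(c <= rng.uniform(a, b) <= d for _ in range(trials))
    return hits / trials

estimate = estimate_prob(2, 5, 0, 10)
print("simulated:", estimate)
print("formula:  ", (5 - 2) / (10 - 0))
```

The simulated fraction should land close to the formula's value, and the gap typically narrows as the number of trials grows.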
