Unit 4: Probability, Random Variables, and Probability Distributions

This unit introduces the concept of probability and how it is used to model random phenomena. You will learn how to simulate chance processes, calculate exact probabilities, and understand key types of probability distributions such as binomial and geometric distributions.

Using Simulation to Estimate Probabilities

Simulation is a method used to model random events when theoretical probability is difficult or impossible to calculate directly.

Step 1: State the question of interest clearly.
Step 2: Describe how to model one trial (assign digits to outcomes).
Step 3: Simulate many trials using a random device (random number generator, random digit table, etc.).
Step 4: Record and analyze the results.
Step 5: Use the results to answer the original question.

Example: Estimate the probability of getting exactly 2 heads in 3 coin flips. Assign digits 0–4 = heads and 5–9 = tails. Generate 100 three-digit groups (one group per trial), count the heads in each group, and report the proportion of trials with exactly 2 heads.
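
The coin-flip example can be sketched in Python as follows. This is a minimal illustration, not a prescribed method: the function name simulate_two_heads and the seed parameter are assumptions made here, and Python's random module stands in for the random digit table.

```python
import random

def simulate_two_heads(num_trials=100, seed=None):
    """Estimate P(exactly 2 heads in 3 flips) using random digits."""
    rng = random.Random(seed)  # seed is optional; it only makes runs repeatable
    hits = 0
    for _ in range(num_trials):
        # One trial: three random digits, where 0-4 = heads and 5-9 = tails.
        digits = [rng.randint(0, 9) for _ in range(3)]
        heads = sum(1 for d in digits if d <= 4)
        if heads == 2:
            hits += 1
    return hits / num_trials

estimate = simulate_two_heads(num_trials=100, seed=1)
print(f"Estimated P(2 heads): {estimate:.2f}")
print(f"Exact value for comparison: {3 / 8}")
```

With only 100 trials the estimate can easily be off by several percentage points from the exact value of 3/8 = 0.375. Increasing num_trials should bring it closer.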

Calculating the Probability of a Random Event

Sample space: The set of all possible outcomes.
Event: A subset of the sample space. For example, the event "at least one six" in two dice rolls includes all outcomes with at least one 6.

Basic Rules:

$ 0 \leq P(A) \leq 1 $
$ P(S) = 1 $ where $ S $ is the sample space
$ P(A^c) = 1 - P(A) $
$ P(A \cup B) = P(A) + P(B) - P(A \cap B) $

If A and B are mutually exclusive: $ P(A \cap B) = 0 $, so $ P(A \cup B) = P(A) + P(B) $

If A and B are independent: $ P(A \cap B) = P(A) \cdot P(B) $

Tip: Independence means knowing one event occurred does not change the probability of the other. Disjoint (mutually exclusive) events with nonzero probabilities cannot be independent: if one occurs, the other cannot.
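
One way to check these rules is to list the full sample space for two dice and count outcomes. The sketch below assumes fair six-sided dice. The helper name prob is made up for this illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely (first die, second die) outcomes.
sample_space = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, given as a true/false test on an outcome."""
    favorable = [o for o in sample_space if event(o)]
    return Fraction(len(favorable), len(sample_space))

at_least_one_six = prob(lambda o: 6 in o)
no_six = prob(lambda o: 6 not in o)
first_six = prob(lambda o: o[0] == 6)
second_six = prob(lambda o: o[1] == 6)
both_six = prob(lambda o: o == (6, 6))

# Complement rule: P(A) = 1 - P(A^c)
print(at_least_one_six, 1 - no_six)
# General addition rule: P(A or B) = P(A) + P(B) - P(A and B)
print(first_six + second_six - both_six)
# Independence: P(A and B) = P(A) * P(B)
print(both_six == first_six * second_six)
```

Using Fraction keeps every probability exact, so the rules can be compared with == instead of worrying about rounding.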

Random Variables and Probability Distributions

A random variable assigns numerical values to outcomes of a random phenomenon.

Discrete Random Variable: Takes on a finite or countable number of values.

Continuous Random Variable: Takes on any value in an interval of numbers. Probabilities are calculated over intervals, not exact values.

Probability Distribution: A table or function that gives the probability for each value of a random variable $ X $.

All probabilities must be between 0 and 1 and sum to 1.

A cumulative distribution gives $ P(X \leq x) $ for each value $ x $.

Mean (Expected Value):

\[ \mu_X = E(X) = \sum x_i \cdot P(x_i) \]

Standard Deviation:

\[ \sigma_X = \sqrt{\sum (x_i - \mu_X)^2 \cdot P(x_i)} \]

Example: In a game, you win $2 with probability 0.3, win $0 with probability 0.6, and lose $1 with probability 0.1. The expected value is:
\[ E(X) = 2(0.3) + 0(0.6) + (-1)(0.1) = 0.6 - 0.1 = 0.5 \]

Tip: The mean tells you the long-run average value. It does not mean that you’ll ever actually win that amount on one play.
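
The mean and standard deviation formulas can be applied to the game example with a few lines of Python. This is a sketch. The dictionary layout and the helper name mean_and_sd are choices made here for illustration.

```python
import math

# Payoff distribution from the game example: value -> probability.
game = {2: 0.3, 0: 0.6, -1: 0.1}

def mean_and_sd(dist):
    """Return (mean, standard deviation) of a discrete distribution."""
    mean = sum(x * p for x, p in dist.items())
    variance = sum((x - mean) ** 2 * p for x, p in dist.items())
    return mean, math.sqrt(variance)

mu, sigma = mean_and_sd(game)
print(f"mean = {mu:.2f}, sd = {sigma:.4f}")
```

The variance works out to 1.05, so the standard deviation is a little over 1. That is large compared with the mean of 0.5, which reflects how much a single play can differ from the long-run average.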

Combining Random Variables

For constants $ a $, $ b $, and random variables $ X $ and $ Y $:

Mean of $ aX + bY $: \[ \mu = a\mu_X + b\mu_Y \]

If $ X $ and $ Y $ are independent: \[ \text{Var}(aX + bY) = a^2 \sigma_X^2 + b^2 \sigma_Y^2 \quad \Rightarrow \quad \sigma = \sqrt{a^2 \sigma_X^2 + b^2 \sigma_Y^2} \]

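Important: You add variances, NOT standard deviations. This holds for differences too: for independent $ X $ and $ Y $, $ \text{Var}(X - Y) = \sigma_X^2 + \sigma_Y^2 $.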
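
A short numeric check can make the "add variances" rule concrete. The means and standard deviations below are hypothetical values picked so the arithmetic is easy to follow. The function name combine is also an assumption for this sketch.

```python
import math

def combine(a, mu_x, sd_x, b, mu_y, sd_y):
    """Mean and SD of aX + bY, assuming X and Y are independent."""
    mean = a * mu_x + b * mu_y
    variance = a**2 * sd_x**2 + b**2 * sd_y**2
    return mean, math.sqrt(variance)

# Hypothetical values chosen for illustration only.
mu_x, sd_x = 10, 3
mu_y, sd_y = 20, 4

sum_mean, sum_sd = combine(1, mu_x, sd_x, 1, mu_y, sd_y)     # X + Y
diff_mean, diff_sd = combine(1, mu_x, sd_x, -1, mu_y, sd_y)  # X - Y

print(f"X + Y: mean = {sum_mean}, sd = {sum_sd}")
print(f"X - Y: mean = {diff_mean}, sd = {diff_sd}")
```

Both the sum and the difference have standard deviation 5 (the square root of 9 + 16), not 3 + 4 = 7 and not 3 - 4.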

Transforming Random Variables

For a linear transformation $ Y = a + bX $:

Mean: $ \mu_Y = a + b\mu_X $

Standard deviation: $ \sigma_Y = |b| \sigma_X $

Note: Adding or subtracting a constant shifts the distribution, which changes the mean but not the spread. Multiplying or dividing by a constant changes both the mean and the standard deviation.
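
The transformation rules can be verified by transforming every value of a distribution and recomputing the mean and standard deviation directly. The sketch below reuses the game payoffs from the expected value example. The constants a = 10 and b = -2 are hypothetical, with a negative b chosen to show why the absolute value matters.

```python
import math

# Game payoffs from the expected value example: value -> probability.
game = {2: 0.3, 0: 0.6, -1: 0.1}
a, b = 10, -2  # hypothetical constants for Y = a + bX

def mean_and_sd(dist):
    """Return (mean, standard deviation) of a discrete distribution."""
    mean = sum(x * p for x, p in dist.items())
    variance = sum((x - mean) ** 2 * p for x, p in dist.items())
    return mean, math.sqrt(variance)

# Transform each value of X; the probabilities stay the same.
transformed = {a + b * x: p for x, p in game.items()}

mu_x, sd_x = mean_and_sd(game)
mu_y, sd_y = mean_and_sd(transformed)

print(f"direct:  mean = {mu_y:.4f}, sd = {sd_y:.4f}")
print(f"formula: mean = {a + b * mu_x:.4f}, sd = {abs(b) * sd_x:.4f}")
```

The two printed lines should agree, with the standard deviation doubling even though b is negative.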

Binomial Distribution

A binomial setting involves repeated trials of the same chance process, with exactly two possible outcomes: success or failure.

Conditions (BINS):

B: Binary outcomes (success/failure)
I: Independent trials
N: Number of trials is fixed
S: Same probability of success for each trial

Probability Formula:

\[ P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k} \]

Where:

$ n $ = number of trials
$ k $ = number of successes
$ p $ = probability of success

Mean and Standard Deviation:

\[ \mu = np \quad \text{and} \quad \sigma = \sqrt{np(1 - p)} \]

Example: Suppose 80% of students pass an exam. What is the probability that exactly 4 out of 5 students pass?

\[ P(X = 4) = \binom{5}{4} (0.8)^4 (0.2)^1 = 5 \cdot 0.4096 \cdot 0.2 = 0.4096 \]
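
The binomial formula translates directly into Python using math.comb from the standard library (Python 3.8 or later). The function name binom_pmf is a label chosen for this sketch, not a built-in.

```python
import math

def binom_pmf(n, p, k):
    """P(X = k) for a binomial random variable with n trials, success prob p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 5, 0.8
exactly_4 = binom_pmf(n, p, 4)
mean = n * p
sd = math.sqrt(n * p * (1 - p))

print(f"P(X = 4) = {exactly_4:.4f}")
print(f"mean = {mean:.1f}, sd = {sd:.4f}")
```

The probabilities for k = 0 through n should sum to 1, which is a quick sanity check on any binomial calculation.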

Binomial Distribution Tips

Use binompdf(n, p, x) for exact probabilities: $ P(X = x) $

Use binomcdf(n, p, x) for cumulative: $ P(X \leq x) $

To find $ P(X \geq x) $, use: \[ 1 - \text{binomcdf}(n, p, x - 1) \]

To find $ P(a \leq X \leq b) $, use: \[ \text{binomcdf}(n, p, b) - \text{binomcdf}(n, p, a - 1) \]

Example: What's the probability of getting at least 3 successes in 7 trials if $ p = 0.6 $?
\[ P(X \geq 3) = 1 - P(X \leq 2) = 1 - \text{binomcdf}(7, 0.6, 2) \approx 0.9037 \]
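
For readers without a calculator at hand, the same cumulative calculations can be mirrored in Python. The names binom_pmf and binom_cdf are hypothetical helpers written to imitate the calculator functions. They are not part of any library.

```python
import math

def binom_pmf(n, p, k):
    """P(X = k), playing the role of binompdf(n, p, k)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(n, p, k):
    """P(X <= k), playing the role of binomcdf(n, p, k)."""
    return sum(binom_pmf(n, p, i) for i in range(0, k + 1))

# P(X >= 3) = 1 - P(X <= 2)
at_least_3 = 1 - binom_cdf(7, 0.6, 2)

# P(2 <= X <= 4) = P(X <= 4) - P(X <= 1)
between_2_and_4 = binom_cdf(7, 0.6, 4) - binom_cdf(7, 0.6, 1)

print(f"P(X >= 3) = {at_least_3:.4f}")
print(f"P(2 <= X <= 4) = {between_2_and_4:.4f}")
```

Note the "minus 1" in both subtractions: because the cumulative function includes its endpoint, the value just below the range of interest is the one to subtract.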

Geometric Distribution

A geometric distribution describes the number of trials needed to get the first success.

Conditions (BITS):

B: Binary outcomes
I: Independent trials
T: Trials continue until the first success
S: Same probability of success

Probability Formula:

\[ P(X = k) = (1 - p)^{k - 1} \cdot p \]

Mean and Standard Deviation:

\[ \mu = \frac{1}{p} \quad \text{and} \quad \sigma = \sqrt{\frac{1 - p}{p^2}} \]

Example: If the probability of success is 0.25, what is the probability that the first success is on the 3rd trial?

\[ P(X = 3) = (0.75)^2 \cdot 0.25 = (0.5625)(0.25) \approx 0.1406 \]

Tip: The geometric distribution is memoryless: no matter how many failures have already occurred, the probability of success on the next trial is still $ p $, and the expected number of additional trials needed is still $ \frac{1}{p} $.
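
The geometric formulas and the memoryless property can be checked numerically. In this sketch, geom_pmf and geom_tail are names made up for illustration, and the values s = 5 and t = 2 are arbitrary.

```python
import math

def geom_pmf(p, k):
    """P(X = k): k - 1 failures followed by the first success."""
    return (1 - p) ** (k - 1) * p

def geom_tail(p, k):
    """P(X > k): the first k trials are all failures."""
    return (1 - p) ** k

p = 0.25
first_on_third = geom_pmf(p, 3)
mean = 1 / p
sd = math.sqrt((1 - p) / p**2)

# Memoryless check: P(X > s + t | X > s) should equal P(X > t).
s, t = 5, 2  # arbitrary illustrative values
conditional = geom_tail(p, s + t) / geom_tail(p, s)

print(f"P(X = 3) = {first_on_third:.4f}")
print(f"mean = {mean}, sd = {sd:.4f}")
print(f"P(X > {s + t} | X > {s}) = {conditional:.4f}, P(X > {t}) = {geom_tail(p, t):.4f}")
```

The last line should show two matching values: after 5 failures, the chance of needing more than 2 additional trials is the same as it was at the start.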

Geometric Distribution Tips

Use geometpdf(p, x) for exact: $ P(X = x) $

Use geometcdf(p, x) for: $ P(X \leq x) $

To find $ P(X > x) $, use: \[ 1 - \text{geometcdf}(p, x) \]

Example: What’s the probability that the first success takes more than 4 trials if $ p = 0.3 $?
\[ P(X > 4) = 1 - \text{geometcdf}(0.3, 4) = (0.7)^4 = 0.2401 \]
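
The cumulative geometric probability has a simple closed form, since $ X \leq x $ fails only when the first $ x $ trials are all failures. The helper name geom_cdf below is an assumption for this sketch, intended to imitate geometcdf.

```python
def geom_cdf(p, x):
    """P(X <= x), playing the role of geometcdf(p, x)."""
    # Complement of "the first x trials are all failures".
    return 1 - (1 - p) ** x

# P(X > 4) = 1 - P(X <= 4)
more_than_4 = 1 - geom_cdf(0.3, 4)
print(f"P(X > 4) = {more_than_4:.4f}")
```

This is equivalent to computing $ (1 - p)^x $ directly, which is often the faster route by hand.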

Summary: Comparing Binomial and Geometric

Feature	Binomial	Geometric
Number of trials	Fixed	Until first success
Mean	$ \mu = np $	$ \mu = \frac{1}{p} $
Distribution shape	Symmetric if $ p = 0.5 $; skewed otherwise	Always right-skewed

Calculator Hacks for Probability Distributions (TI-84/CE)

Use your calculator efficiently to find binomial and geometric probabilities. The key functions are under:

2nd → VARS (this opens the DISTR menu)
Scroll down to access:
binompdf(n, p, x): Finds $ P(X = x) $
binomcdf(n, p, x): Finds $ P(X \leq x) $
geometpdf(p, x): Finds $ P(X = x) $
geometcdf(p, x): Finds $ P(X \leq x) $