Unit 9: Inference for Quantitative Data: Slopes

In this unit, you'll learn how to conduct inference on the slope of a population regression line using data from a linear regression model. You’ll construct confidence intervals, perform hypothesis tests, and select the appropriate procedure for bivariate quantitative data.

Confidence Intervals for the Slope of a Regression Model

We can estimate the true slope of the population regression line using a confidence interval.

Conditions for Regression Inference (LINE)

Linear: The relationship between the variables is linear.
Independent: The observations are independent (check the design or residual plot).
Normal: The residuals are approximately normally distributed (check a histogram or normal probability plot of the residuals).
Equal Variance: The residuals show no clear pattern and have constant spread (check residual plot).

Confidence Interval Formula

\[ b \pm t^* \cdot SE_b \] where:

\( b \) = sample slope
\( t^* \) = critical value from the t-distribution with \( df = n - 2 \)
\( SE_b \) = standard error of the slope (from computer or calculator output)

Example

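A linear regression model describes the relationship between temperature and ice cream sales. The slope of the least-squares line is 2.3 with standard error 0.5 and \( df = 18 \) (so \( n = 20 \)). For 95% confidence with \( df = 18 \), \( t^* = 2.101 \), and the interval is:

\[ 2.3 \pm t^* \cdot 0.5 = 2.3 \pm 2.101 \cdot 0.5 = (1.25, 3.35) \]

We are 95% confident that the slope of the population regression line relating ice cream sales to temperature is between 1.25 and 3.35.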
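
The arithmetic in this example can be checked with a short Python sketch. The function name slope_confidence_interval is hypothetical (it is not a library function), and the critical value \( t^* = 2.101 \) is assumed to have been read from a t table rather than computed.

```python
def slope_confidence_interval(b, se_b, t_star):
    """Return (lower, upper) for the interval b ± t* · SE_b.

    b      : sample slope
    se_b   : standard error of the slope
    t_star : critical value for df = n - 2 (assumed to come from a t table)
    """
    margin = t_star * se_b
    return b - margin, b + margin


# Values from the ice cream example: b = 2.3, SE_b = 0.5,
# t* = 2.101 (df = 18, 95% confidence).
lower, upper = slope_confidence_interval(2.3, 0.5, 2.101)
print(f"95% CI for the slope: ({lower:.2f}, {upper:.2f})")
```

Keeping \( t^* \) as an input mirrors the by-hand procedure: look up the critical value first, then compute the margin of error.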

Calculator Tip

- Run LinRegTInt on your calculator.
- Enter the x- and y-lists, confidence level, and choose "Calculate".
- It returns the confidence interval for the population slope \( \beta \), centered at the sample slope \( b \).

Hypothesis Test for the Slope of a Regression Model

We can test whether there is a significant linear relationship between two quantitative variables by testing whether the slope of the population regression line is zero.

Hypotheses

\( H_0: \beta = 0 \) (no linear relationship)
\( H_a: \beta \ne 0 \), \( \beta > 0 \), or \( \beta < 0 \) (depending on context)

Test Statistic

\[ t = \frac{b - \beta_0}{SE_b} \] where:

\( b \) = sample slope
\( \beta_0 \) = hypothesized slope (usually 0)
\( SE_b \) = standard error of slope

Degrees of Freedom

\( df = n - 2 \)

Example

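A regression of exam score on hours studied yields \( b = 4.1 \), \( SE_b = 1.3 \), and \( n = 16 \). Is there evidence of a positive association? Because the question asks about a positive association, the alternative hypothesis is \( H_a: \beta > 0 \).

\[ t = \frac{4.1 - 0}{1.3} = 3.15, \quad df = 16 - 2 = 14 \] Using a t-distribution table or calculator, find the p-value and compare it to α. From a t table with \( df = 14 \), the one-sided p-value falls between 0.0025 and 0.005.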

Conclusion

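If the p-value is less than the significance level (e.g., 0.05), reject \( H_0 \) and conclude that there is convincing evidence of a linear relationship between the two variables. Otherwise, fail to reject \( H_0 \): the data do not provide convincing evidence of a linear relationship.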
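
The study-hours example can be checked numerically. The sketch below uses only the Python standard library and approximates the t-distribution tail area with Simpson's rule. The function names (t_pdf, t_upper_tail, slope_t_test) are hypothetical helpers written for this illustration, and the numeric integration stands in for the t table or calculator; a statistics library would normally supply the tail area directly.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T >= t) using Simpson's rule on [0, |t|] (steps is even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_a = total * h / 3
    tail = 0.5 - area_0_to_a          # the density is symmetric about 0
    return tail if t >= 0 else 1 - tail


def slope_t_test(b, se_b, n, beta_0=0.0, alternative="two-sided"):
    """Return (t statistic, df, p-value) for a test about the slope."""
    t_stat = (b - beta_0) / se_b
    df = n - 2
    if alternative == "greater":      # H_a: beta > beta_0
        p_value = t_upper_tail(t_stat, df)
    elif alternative == "less":       # H_a: beta < beta_0
        p_value = t_upper_tail(-t_stat, df)
    elif alternative == "two-sided":  # H_a: beta != beta_0
        p_value = 2 * t_upper_tail(abs(t_stat), df)
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return t_stat, df, p_value


# Values from the example: b = 4.1, SE_b = 1.3, n = 16, H_a: beta > 0.
t_stat, df, p_value = slope_t_test(4.1, 1.3, 16, alternative="greater")
print(f"t = {t_stat:.2f}, df = {df}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < 0.05 else "Fail to reject H0")
```

The alternative argument corresponds to the three forms of \( H_a \) listed under Hypotheses, so the same function covers one-sided and two-sided tests.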

Calculator Tip

- Use LinRegTTest from the STAT → TESTS menu.
- Enter the x- and y-lists, hypothesis for \( \beta \), and press Calculate.
- You’ll get the t-statistic, p-value, and regression info.

Interpreting Computer Output

Regression output usually includes:

Slope (\( b \)): coefficient for the explanatory variable
Standard error (SE): for the slope
t: test statistic for \( H_0: \beta = 0 \)
p-value: used to make decisions about the null hypothesis
\( R^2 \): proportion of the variation in the response variable explained by the regression (often reported as a percent)
s: standard deviation of the residuals

Tip: Always use the row for the explanatory variable (not the constant/intercept row) when conducting inference for slope.
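
To see where the numbers in the explanatory-variable row come from, here is a minimal Python sketch that computes them from raw data using the standard least-squares formulas \( b = S_{xy} / S_{xx} \), \( s = \sqrt{SSE / (n - 2)} \), and \( SE_b = s / \sqrt{S_{xx}} \). The function name slope_inference_summary and the five data points are made up for illustration only; they do not come from any real study.

```python
import math


def slope_inference_summary(x, y):
    """Compute the slope-row quantities that regression output reports."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    syy = sum((yi - y_bar) ** 2 for yi in y)

    b = sxy / sxx                      # sample slope
    a = y_bar - b * x_bar              # sample intercept
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))       # standard deviation of the residuals
    se_b = s / math.sqrt(sxx)          # standard error of the slope
    return {
        "slope": b,
        "intercept": a,
        "SE_b": se_b,
        "t": b / se_b,                 # test statistic for H0: beta = 0
        "df": n - 2,
        "R2": 1 - sse / syy,
        "s": s,
    }


# Made-up illustration data (hypothetical, not from the text).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
for name, value in slope_inference_summary(x, y).items():
    print(f"{name}: {value:.4f}")
```

The "t" entry is the slope divided by its standard error, which is the same ratio that appears in the explanatory-variable row of computer output.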

Selecting an Appropriate Inference Procedure

Use regression inference procedures when:

You have two quantitative variables (explanatory and response)
You are using a linear regression model to describe the relationship
You want to estimate or test the slope of the population regression line

How to Choose:

Use a confidence interval if the question asks to estimate the slope.
Use a significance test if the question asks whether there is convincing evidence of a relationship.

Note: Don’t use a chi-square test or a one- or two-proportion z procedure here; those procedures are for categorical data.

Summary Table

Procedure | Calculator | Formula | df
CI for slope | LinRegTInt | \( b \pm t^* \cdot SE_b \) | \( n - 2 \)
Test for slope | LinRegTTest | \( t = \frac{b - 0}{SE_b} \) | \( n - 2 \)

Final Tips

  • Check residual plots to verify LINE conditions.
  • Use the regression output row corresponding to the explanatory variable.
  • Don’t interpret a significant slope as causation unless it’s from a randomized experiment.
  • Always write conclusions in context of the variables and data.