11th Grade · AP Statistics

Unit 7: Inference for Means

Confidence Intervals & Significance Tests · One-Sample & Two-Sample t-Procedures

📊

Unit Overview

Unit 7 covers inference procedures for population means using the t-distribution. You'll learn to construct confidence intervals and conduct hypothesis tests — both for a single population mean and for comparing two population means (either paired or independent samples).

The Four Major Procedures

Procedure | Goal | When to Use
One-Sample t-Interval | Estimate μ | One group; estimate population mean
One-Sample t-Test | Test claim about μ | One group; test vs. a specific value μ₀
Paired t-Interval | Estimate μ_d (mean difference) | Two measurements on same subject or matched pairs
Two-Sample t-Test / CI | Compare μ₁ and μ₂ | Two independent groups

Decision Tree: Which Procedure?

How many groups?
One group → One-sample t-procedure  |  Two groups → Continue below
Are subjects paired?
Yes (matched pairs / before & after) → Paired t-procedure (treat differences as one sample)  |  No (independent) → Two-sample t-procedure
CI or test?
Estimating → Confidence Interval  |  Testing a claim → Significance Test
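The decision tree can also be read as a small function. This is only an illustrative sketch: the name `choose_procedure` and its arguments are made up for this example and are not part of the course materials.

```python
def choose_procedure(groups: int, paired: bool = False, goal: str = "estimate") -> str:
    """Return the t-procedure suggested by the decision tree.

    groups: 1 or 2; paired: only matters when groups == 2;
    goal: "estimate" (confidence interval) or "test" (significance test).
    """
    if groups not in (1, 2):
        raise ValueError("this unit covers one or two groups only")
    if goal not in ("estimate", "test"):
        raise ValueError("goal must be 'estimate' or 'test'")

    if groups == 1:
        kind = "One-sample"
    elif paired:
        kind = "Paired"        # differences are treated as one sample
    else:
        kind = "Two-sample"

    suffix = "t-interval" if goal == "estimate" else "t-test"
    return f"{kind} {suffix}"


print(choose_procedure(1, goal="test"))                 # One-sample t-test
print(choose_procedure(2, paired=True))                 # Paired t-interval
print(choose_procedure(2, paired=False, goal="test"))   # Two-sample t-test
```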
📐

Confidence Intervals for a Population Mean

A confidence interval gives a range of plausible values for a population mean μ. A 95% CI means: if we repeated sampling many times, 95% of the intervals we construct would contain the true μ.
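The "95% of intervals" idea can be checked with a small simulation. The sketch below assumes a hypothetical Normal population (mean 50, SD 10) and samples of size 30, chosen only for illustration; t* = 2.045 is the table value for 95% confidence with df = 29.

```python
import random
import statistics
from math import sqrt

random.seed(7)  # fixed seed so the run is repeatable

TRUE_MEAN, TRUE_SD = 50.0, 10.0   # hypothetical population, for illustration only
N = 30                            # sample size
T_STAR = 2.045                    # t-table value: 95% confidence, df = 29
TRIALS = 2000                     # number of repeated samples

hits = 0
for _ in range(TRIALS):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)              # sample SD (divides by n - 1)
    margin = T_STAR * s / sqrt(N)
    if x_bar - margin <= TRUE_MEAN <= x_bar + margin:
        hits += 1                             # this interval captured the true mean

coverage = hits / TRIALS
print(f"{coverage:.3f} of the intervals captured the true mean")
```

The printed proportion should land close to 0.95. Any single interval either contains μ or it does not; the 95% describes the method over many samples.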

Conditions (RNI)

🎲 Random

Data comes from a random sample or randomized experiment.

🔔 Normal

n ≥ 30, OR population is Normal, OR histogram shows no strong skew/outliers (small n).

🔗 Independent

Individual observations are independent. If sampling without replacement: n ≤ 10% of population.

Formula

One-Sample t-Interval for μ:  x̄ ± t* · (s / √n)    df = n − 1
x̄ = sample mean
s = sample standard deviation
n = sample size
t* = critical value from t-table with df = n−1 and desired confidence level
Margin of Error = t* · (s/√n)

Steps

  1. State: Define the parameter. "Let μ = the true mean [context]."
  2. Plan: Name the procedure (one-sample t-interval). Check RNI conditions.
  3. Calculate: Find x̄, s, n. Look up t* (df = n−1). Compute interval: x̄ ± t*(s/√n).
  4. Conclude: "We are [C]% confident the true mean [context] is between [L] and [U]."

Worked Example — Houston Rockets

Example from Unit 7 Review

During the 2016–2017 NBA season, the Houston Rockets averaged 40.317 three-point attempts per game with a standard deviation of 7.078 over 82 games. Construct a 95% confidence interval for the Rockets' true mean three-point attempts per game.

Conditions: Random ✓ (treat games as random sample of performance); Normal ✓ (n = 82 ≥ 30); Independent ✓ (82 games < 10% of all NBA games played).

Calculate: df = 81, t* ≈ 1.990
ME = 1.990 × (7.078/√82) = 1.990 × 0.782 = 1.556
CI = 40.317 ± 1.556 = (38.76, 41.87)

Conclude: We are 95% confident the Rockets' true mean three-point attempts per game during that season was between 38.76 and 41.87.
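The Calculate step can be reproduced in a few lines of Python. This is a minimal sketch, assuming the critical value is read from a t-table and passed in; the function name `one_sample_t_interval` is made up for illustration.

```python
from math import sqrt

def one_sample_t_interval(x_bar, s, n, t_star):
    """Return (lower, upper) for x_bar ± t* · s/√n."""
    standard_error = s / sqrt(n)
    margin = t_star * standard_error
    return x_bar - margin, x_bar + margin

# Rockets summary statistics from the example; t* = 1.990 is the table value used there.
lower, upper = one_sample_t_interval(x_bar=40.317, s=7.078, n=82, t_star=1.990)
print(f"({lower:.2f}, {upper:.2f})")   # (38.76, 41.87)
```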

Key Facts

Wider interval when:
  • Higher confidence level (larger t*)
  • More variability (larger s)
  • Smaller sample size (smaller n)
Narrower interval when:
  • Lower confidence level
  • Less variability
  • Larger sample size
🔬

Significance Tests for a Population Mean

A significance test assesses the evidence against a null hypothesis. We ask: assuming H₀ is true, how likely is our sample result? If the probability (p-value) is small, we have evidence against H₀.

Hypotheses

Null Hypothesis (H₀): μ = μ₀
The population mean equals a specific value. This is what we assume is true.
Alternative Hypothesis (Hₐ):
μ > μ₀  (right-tailed)
μ < μ₀  (left-tailed)
μ ≠ μ₀  (two-tailed)

Formula

One-Sample t-Statistic:  t = (x̄ − μ₀) / (s / √n)    df = n − 1

Steps (SPCC)

  1. State: H₀ and Hₐ in context; significance level α (usually 0.05).
  2. Plan: Name the test (one-sample t-test). Check RNI conditions.
  3. Calculate: Compute t-statistic. Find p-value from t-table (df = n−1).
  4. Conclude: Compare p-value to α. State decision and interpretation in context.

Conclusion Templates

Reject H₀: "Since p = [value] < α = 0.05, we reject H₀. There is convincing evidence that [Hₐ in context]."
Fail to Reject H₀: "Since p = [value] ≥ α = 0.05, we fail to reject H₀. There is not convincing evidence that [Hₐ in context]."

Type I and Type II Errors

Decision | H₀ is actually TRUE | H₀ is actually FALSE
Reject H₀ | Type I Error (α), false positive | Correct ✓ (Power)
Fail to Reject H₀ | Correct ✓ | Type II Error (β), false negative
Power = 1 − β = probability of correctly rejecting a false H₀. Power increases with larger sample size, larger effect size, and higher α.
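To see how sample size, effect size, and α each move power, here is a rough sketch. It uses the Normal (z) approximation with an assumed population standard deviation, so it is only a guide and not an exact t-test power calculation. The function name `approx_power` and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(effect, sigma, n, alpha=0.05):
    """Approximate power of a one-sided test of H0: mu = mu0 vs Ha: mu > mu0.

    effect = true mean minus mu0; sigma = assumed population SD.
    Normal approximation only; a t-based calculation would differ slightly.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)          # rejection cutoff in z units
    shift = effect / (sigma / sqrt(n))      # true effect measured in standard errors
    return 1 - z.cdf(z_alpha - shift)

base = approx_power(effect=2, sigma=10, n=50)
print(f"baseline:      {base:.3f}")
print(f"larger n:      {approx_power(2, 10, 200):.3f}")
print(f"larger effect: {approx_power(4, 10, 50):.3f}")
print(f"higher alpha:  {approx_power(2, 10, 50, alpha=0.10):.3f}")
```

Each of the last three lines should print a larger value than the baseline, matching the three factors listed above.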

Worked Example — Houston Rockets

Example from Unit 7 Review

The NBA average was 27 three-point attempts per game. Did the Rockets attempt significantly more? (x̄ = 40.317, s = 7.078, n = 82)

H₀: μ = 27    Hₐ: μ > 27    α = 0.05

Conditions: Random ✓, Normal ✓ (n = 82 ≥ 30), Independent ✓

Calculate: t = (40.317 − 27) / (7.078/√82) = 13.317 / 0.782 ≈ 17.03
df = 81, p-value ≈ 0 (essentially 0)

Conclude: Since p ≈ 0 < 0.05, we reject H₀. There is convincing evidence the Rockets averaged more than 27 three-point attempts per game.

Potential Error: Since we rejected H₀, a Type I error is possible — concluding the Rockets attempted more than average when they actually didn't. Consequence: overestimating the team's three-point strategy impact.
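For readers who want to check the test statistic and p-value without a calculator's built-in test, the sketch below uses only the standard library. The tail area is found by numerically integrating the t density with Simpson's rule, which is an approach chosen for this illustration and not the method used in the review. The helper names are made up. Carrying the unrounded standard error through gives t of about 17.04 instead of the 17.03 obtained from the rounded 0.782; the conclusion is unchanged.

```python
from math import exp, lgamma, log, pi, sqrt

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_const = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_const - (df + 1) / 2 * log(1 + x * x / df))

def upper_tail(t, df, steps=2000):
    """P(T >= t) using Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3              # area between 0 and |t|
    tail = max(0.0, 0.5 - area)       # area beyond |t|
    return tail if t >= 0 else 1 - tail

def one_sample_t(x_bar, mu0, s, n):
    """Return (t, df) for H0: mu = mu0."""
    return (x_bar - mu0) / (s / sqrt(n)), n - 1

t, df = one_sample_t(x_bar=40.317, mu0=27, s=7.078, n=82)
p = upper_tail(t, df)                 # Ha: mu > 27, so use the right tail
print(f"t = {t:.2f}, df = {df}, p-value below 0.001: {p < 0.001}")
```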

⚖️

Confidence Intervals for a Difference in Population Means

When comparing two groups, we estimate μ₁ − μ₂ (or μ_d for paired data). The key question: Are the groups paired (matched) or independent?

Paired (Matched Pairs):
  • Same subject measured twice (before/after)
  • Subjects deliberately matched (age groups, siblings)
  • Calculate differences d = x₁ − x₂ for each pair
  • Treat differences as one sample
Two Independent Samples:
  • Two separate, unrelated groups
  • Random assignment to two treatments
  • No natural pairing between observations

Paired t-Interval Formula

Paired t-Interval for μ_d:  d̄ ± t* · (s_d / √n)    df = n − 1    (n = number of pairs)

Two-Sample t-Interval Formula

Two-Sample t-Interval for μ₁ − μ₂:  (x̄₁ − x̄₂) ± t* · √(s₁²/n₁ + s₂²/n₂)    df ≈ min(n₁−1, n₂−1)

Worked Example — Supermarket Employees (Paired)

Example from Unit 7 Review

n = 34 matched pairs, d̄ = −0.7 years, s_d = 4.82. Construct a 95% CI for the mean difference in experience (M − WF).

df = 33, t* ≈ 2.035
SE = 4.82/√34 = 0.827
CI = −0.7 ± 2.035(0.827) = −0.7 ± 1.683 = (−2.38, 0.98)

Conclude: We are 95% confident the mean difference in experience is between −2.38 and 0.98 years. Since 0 is in the interval, there is no significant difference at the 5% level.

Worked Example — SAT Prep Course (Paired)

Example from Unit 7 Review

12 students. Score differences (after − before): 43, 13, −7, 47, 61, 39, 14, 37, 17, 47, 49, 9.
d̄ = 369/12 = 30.75, s_d ≈ 20.71

95% CI: df = 11, t* ≈ 2.201
SE = 20.71/√12 = 5.98
CI = 30.75 ± 2.201(5.98) = 30.75 ± 13.16 = (17.59, 43.91)

Conclude: We are 95% confident mean improvement is between 17.59 and 43.91 points. Since 30 is inside this interval, we cannot say at 95% confidence that improvement exceeds 30 points.
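Because a paired interval is a one-sample interval on the differences, the SAT example can be recomputed directly from the listed differences. A minimal sketch, assuming t* = 2.201 is taken from the t-table (95% confidence, df = 11):

```python
import statistics
from math import sqrt

# Score differences (after - before) from the example.
diffs = [43, 13, -7, 47, 61, 39, 14, 37, 17, 47, 49, 9]

n = len(diffs)                      # number of pairs
d_bar = statistics.mean(diffs)      # mean difference
s_d = statistics.stdev(diffs)       # sample SD of the differences (divides by n - 1)
t_star = 2.201                      # table value: 95% confidence, df = n - 1 = 11

margin = t_star * s_d / sqrt(n)
lower, upper = d_bar - margin, d_bar + margin
print(f"d-bar = {d_bar:.2f}, s_d = {s_d:.2f}")   # d-bar = 30.75, s_d = 20.71
print(f"95% CI: ({lower:.2f}, {upper:.2f})")     # 95% CI: (17.59, 43.91)
```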

🧪

Significance Tests for a Difference in Population Means

Paired t-Test

Paired t-Statistic:  t = d̄ / (s_d / √n)    H₀: μ_d = 0    df = n − 1

Two-Sample t-Test

Two-Sample t-Statistic:  t = (x̄₁ − x̄₂) / √(s₁²/n₁ + s₂²/n₂)    H₀: μ₁ = μ₂    df ≈ min(n₁−1, n₂−1)

Conditions for Two-Sample Tests

  • Random: Both samples are random, or subjects were randomly assigned to groups.
  • Normal: Both groups satisfy the Normal condition (n ≥ 30 each, or approximately Normal distribution).
  • Independent: The two groups are independent of each other (this is what distinguishes two-sample from paired). Each sample satisfies the 10% condition.

Worked Example — Cholesterol Drug (Two-Sample)

Example from Unit 7 Review

Standard drug: n₁ = 50, x̄₁ = 10 mg/dl, s₁ = 8  |  New drug: n₂ = 50, x̄₂ = 18 mg/dl, s₂ = 12

H₀: μ_new = μ_std    Hₐ: μ_new > μ_std    α = 0.05

Conditions: Random ✓ (randomly assigned), Normal ✓ (n = 50 ≥ 30 each), Independent ✓ (separate groups).

Calculate:
t = (18 − 10) / √(12²/50 + 8²/50) = 8 / √(144/50 + 64/50) = 8 / √4.16 = 8 / 2.039 ≈ 3.92
df ≈ min(49, 49) = 49, p-value < 0.001

Conclude: Since p < 0.001 < 0.05, we reject H₀. There is convincing evidence the new drug reduces cholesterol more than the standard treatment.

Interpret p-value: If the two drugs were equally effective, there is less than a 0.1% chance of observing a difference this large or larger purely by chance.
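The two-sample calculation can be sketched as follows. The function name `two_sample_t` is made up for illustration. Alongside the conservative df used in the example, the sketch also computes the Welch-Satterthwaite df that calculators typically report; that formula is standard but is not written out in this review, so treat it as an addition.

```python
from math import sqrt

def two_sample_t(x1, s1, n1, x2, s2, n2):
    """Return (t, conservative df, Welch df) for H0: mu1 = mu2, using x1 - x2."""
    v1, v2 = s1**2 / n1, s2**2 / n2          # squared standard error of each mean
    t = (x1 - x2) / sqrt(v1 + v2)
    df_conservative = min(n1 - 1, n2 - 1)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df_welch = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df_conservative, df_welch

# New drug listed first so t is positive when the new drug reduces cholesterol more.
t, df_cons, df_welch = two_sample_t(18, 12, 50, 10, 8, 50)
print(f"t = {t:.2f}, conservative df = {df_cons}, Welch df = {df_welch:.1f}")
```

The Welch df comes out larger than 49, which is why the conservative choice gives slightly larger p-values and wider intervals.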

Worked Example — Cereal Toy (Paired)

Example from Unit 7 Review

14 age-group pairs. Each pair: one got toy coupon, one didn't. Differences (toy − no toy): 2, 4, 0, −1, −1, 5, 6, 6, 2, 0, 1, 1, 0, 1.
d̄ = 26/14 ≈ 1.857, s_d ≈ 2.445

H₀: μ_d = 0    Hₐ: μ_d > 0    α = 0.05

Calculate: t = 1.857 / (2.445/√14) = 1.857 / 0.653 ≈ 2.84, df = 13, p ≈ 0.007

Conclude: Since p ≈ 0.007 < 0.05, we reject H₀. There is convincing evidence that including a toy coupon increases cereal purchases.
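The cereal test can be recomputed from the listed differences. As a sketch only, the p-value below is found by numerically integrating the t density with Simpson's rule using the standard library; this is an approach chosen for illustration and the helper names are made up.

```python
import statistics
from math import exp, lgamma, log, pi, sqrt

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_const = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_const - (df + 1) / 2 * log(1 + x * x / df))

def upper_tail(t, df, steps=2000):
    """P(T >= t) using Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    tail = max(0.0, 0.5 - total * h / 3)   # area beyond |t|
    return tail if t >= 0 else 1 - tail

# Differences (toy - no toy) from the example.
diffs = [2, 4, 0, -1, -1, 5, 6, 6, 2, 0, 1, 1, 0, 1]

n = len(diffs)
d_bar = statistics.mean(diffs)
s_d = statistics.stdev(diffs)              # sample SD of the differences
t = d_bar / (s_d / sqrt(n))                # H0: mu_d = 0
df = n - 1
p = upper_tail(t, df)                      # Ha: mu_d > 0, so use the right tail
print(f"t = {t:.2f}, df = {df}, p = {p:.4f}")
```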

📋

Formula Reference Sheet

Procedure | Point Estimate | Standard Error | Test Statistic | df
One-Sample t-CI/Test | x̄ | s/√n | t = (x̄ − μ₀)/(s/√n) | n − 1
Paired t-CI/Test | d̄ | s_d/√n | t = d̄/(s_d/√n) | n − 1
Two-Sample t-CI/Test | x̄₁ − x̄₂ | √(s₁²/n₁ + s₂²/n₂) | t = (x̄₁−x̄₂)/SE | min(n₁−1, n₂−1)
Confidence Interval Structure:
Point Estimate ± t* × Standard Error
t* depends on confidence level and df
Common t* Values:
90% CI: t* ≈ 1.645 (large n)
95% CI: t* ≈ 1.960 (large n)
99% CI: t* ≈ 2.576 (large n)
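For very large n, the t* critical values approach the standard Normal critical values z*. A quick check with Python's standard library, offered as a sketch and not as a replacement for the t-table when n is small:

```python
from statistics import NormalDist

z = NormalDist()   # standard Normal: mean 0, SD 1
z_stars = {}
for level in (0.90, 0.95, 0.99):
    tail = (1 - level) / 2                  # area left in each tail
    z_stars[level] = z.inv_cdf(1 - tail)    # critical value leaving that area above it
    print(f"{level:.0%} confidence: z* = {z_stars[level]:.3f}")
```

For small df the t* value is noticeably larger than these limits; for example, the table value for 95% confidence with df = 11 is 2.201.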
🃏

Flashcards

Confidence Interval
A range of plausible values for a population parameter. A C% CI means: if we took many samples, C% of the intervals would capture the true parameter.
t* (Critical Value)
The value from the t-distribution such that C% of the distribution falls between −t* and +t*. Found using df = n−1 and desired confidence level. Larger for higher confidence or fewer df.
Degrees of Freedom
For a one-sample t-procedure: df = n − 1. Controls the shape of the t-distribution. More df → t-distribution looks more like the Normal distribution.
Margin of Error
ME = t* × (s/√n). Half the width of the confidence interval. Smaller when n is larger, s is smaller, or confidence level is lower.
Standard Error
SE = s/√n. Estimates the standard deviation of the sampling distribution of x̄. Measures how much x̄ typically varies from sample to sample.
Normal Condition
For t-procedures: (1) n ≥ 30, OR (2) population is known to be Normal, OR (3) for n < 30, a histogram/plot shows no strong skew or outliers. When n is large, the CLT ensures the sampling distribution of x̄ is approximately Normal.
t-Distribution vs. Normal
The t-distribution has heavier tails than Normal (more spread), accounting for uncertainty in estimating σ with s. As df → ∞, t-distribution → standard Normal.
CI Interpretation Template
"We are [C]% confident that the true mean [context] is between [lower] and [upper] [units]." Must reference the population parameter and context, NOT the sample.
Null Hypothesis (H₀)
The "no effect / no difference" claim. For one-sample t-test: H₀: μ = μ₀. We assume H₀ is true when computing the p-value. We never "prove" H₀ — we only fail to reject it.
P-value
The probability of getting a test statistic as extreme as (or more extreme than) the observed value, assuming H₀ is true. Small p-value = strong evidence against H₀. NOT the probability that H₀ is true.
Type I Error
Rejecting H₀ when it is actually true (false positive). Probability = α (significance level). Example: concluding a drug works when it actually doesn't.
Type II Error
Failing to reject H₀ when it is actually false (false negative). Probability = β. Example: concluding a drug doesn't work when it actually does.
Power of a Test
Power = 1 − β = probability of correctly rejecting a false H₀. Increases with: larger n, larger effect size, higher α, less variability.
Significance Level (α)
The threshold for the p-value at which we reject H₀. Common: α = 0.05. Setting α = 0.01 is more conservative (harder to reject); α = 0.10 is less conservative.
Conclusion Template (Reject)
"Since p = [value] < α = 0.05, we reject H₀. There is convincing evidence that [Hₐ in context]." Always interpret in context of the problem.
Conclusion Template (FTR)
"Since p = [value] ≥ α = 0.05, we fail to reject H₀. There is not convincing evidence that [Hₐ in context]." We never say "accept H₀" or "prove H₀."
Paired vs. Two-Sample
Paired: same subject measured twice, or subjects deliberately matched. Two-sample: two independent, unrelated groups. Using the wrong procedure is a serious error.
Matched Pairs Design
An experimental design where subjects are paired based on relevant characteristics (age, gender) OR each subject serves as their own control (before/after). Reduces variability and increases power.
μ_d (Mean Difference)
The population mean of all individual differences (d = x₁ − x₂) for paired data. H₀ in a paired test: μ_d = 0 (no average difference). Estimate with d̄.
Paired t-Interval
d̄ ± t* · (s_d / √n), where n = number of pairs, df = n − 1. By reducing paired data to differences, this is just a one-sample t-interval on the differences.
Two-Sample t-Interval
(x̄₁ − x̄₂) ± t* · √(s₁²/n₁ + s₂²/n₂). Estimates μ₁ − μ₂. If 0 is NOT in the CI, the difference is statistically significant at that confidence level.
CI Contains Zero
If a CI for μ₁ − μ₂ (or μ_d) contains 0, the difference is NOT statistically significant. Zero is a plausible value for the difference, meaning we cannot rule out no difference.
Advantage of Paired Design
Pairing controls for lurking variables (like age in the cereal study). By blocking on the confounding variable, variability decreases, the CI is narrower, and the test has more power.
Paired t-Test H₀
H₀: μ_d = 0 (the mean difference is zero — no effect). Hₐ can be: μ_d > 0, μ_d < 0, or μ_d ≠ 0 depending on the research question.
Two-Sample t-Test H₀
H₀: μ₁ = μ₂ (equivalently, μ₁ − μ₂ = 0). The two populations have equal means. Hₐ: μ₁ > μ₂, μ₁ < μ₂, or μ₁ ≠ μ₂.
Independence Condition (Two-Sample)
The two samples must be independent of each other — observations in group 1 have no relationship to observations in group 2. If subjects are matched, use paired procedures instead.
df for Two-Sample t
Conservative: df = min(n₁−1, n₂−1). More precise: Welch-Satterthwaite formula (given by calculator). Using the conservative df gives slightly wider intervals and larger p-values.
Interpret P-value (Two-Sample)
"If μ₁ = μ₂, there is a [p-value] probability of observing a difference of x̄₁ − x̄₂ = [value] or more extreme purely by chance." Small p → evidence groups differ.
Effect of Larger n on Power
Larger sample size increases power: (1) SE decreases, making the test stat larger; (2) the CI narrows; (3) real differences are easier to detect. More data = harder for a real effect to hide.
Statistically Significant ≠ Practically Important
With large n, even tiny differences can be statistically significant (p < 0.05) but may not matter in real life. Always consider the effect size (how big is the actual difference?) alongside the p-value.
p-value Interpretation Template
"Assuming H₀ is true (μ₁ = μ₂), there is a [p-value] probability of observing a difference as extreme as [observed difference] by chance alone." DO NOT say "probability H₀ is true."
📝

40-Question Practice Quiz

One-Sample Confidence Intervals (Q1–Q10)

QUESTION 1
Which of the following correctly interprets a 95% confidence interval of (38.76, 41.87) for μ?
  A. There is a 95% probability that μ is between 38.76 and 41.87.
  B. 95% of the sample data falls between 38.76 and 41.87.
  C. We are 95% confident the true population mean is between 38.76 and 41.87.
  D. If we repeat the study, μ will be in this interval 95% of the time.
Answer:
C. A confidence interval gives a range of plausible values for the population parameter. We are confident the method produces intervals containing μ 95% of the time — but for any one interval, μ either is or isn't in it.
QUESTION 2
A random sample of n = 16 has x̄ = 52 and s = 8. What is the degrees of freedom for a t-interval?
  A. 8
  B. 15
  C. 16
  D. 17
Answer:
B — 15. df = n − 1 = 16 − 1 = 15 for a one-sample t-procedure.
QUESTION 3
All else equal, increasing the sample size will:
  A. Widen the confidence interval
  B. Narrow the confidence interval
  C. Have no effect on the confidence interval
  D. Change the confidence level
Answer:
B. Larger n → smaller SE = s/√n → smaller margin of error → narrower interval.
QUESTION 4
Which condition is NOT required for a one-sample t-interval?
  A. The data come from a random sample
  B. The population standard deviation σ is known
  C. The sampling distribution of x̄ is approximately Normal
  D. Observations are independent
Answer:
B. t-procedures use the sample standard deviation s — we do NOT need to know σ. That's the whole point of using t instead of z.
QUESTION 5
A 99% confidence interval will be _________ a 95% confidence interval, for the same data.
  A. Narrower than
  B. The same width as
  C. Wider than
  D. Cannot be determined
Answer:
C — Wider than. Higher confidence requires a larger t*, which increases the margin of error.
QUESTION 6
For a one-sample t-interval with n = 25, x̄ = 100, s = 10, what is the standard error?
  A. 10
  B. 4
  C. 2
  D. 0.4
Answer:
C — 2. SE = s/√n = 10/√25 = 10/5 = 2.
QUESTION 7
A student with n = 10 (skewed right distribution) wants to construct a t-interval. Which condition is most at risk?
  A. Random
  B. Normal
  C. Independent
  D. None; all conditions are met
Answer:
B — Normal. With n = 10 (small) and a skewed distribution, the sampling distribution of x̄ may not be approximately Normal. We'd need to verify with a graph that there are no strong outliers or skew.
QUESTION 8
A 95% CI for the mean height of students is (64.2, 68.8) inches. What is the sample mean?
  A. 64.2
  B. 68.8
  C. 66.5
  D. 4.6
Answer:
C — 66.5. x̄ = (lower + upper)/2 = (64.2 + 68.8)/2 = 133/2 = 66.5.
QUESTION 9
From the same CI (64.2, 68.8), what is the margin of error?
  A. 4.6
  B. 2.3
  C. 66.5
  D. 1.15
Answer:
B — 2.3. ME = (upper − lower)/2 = (68.8 − 64.2)/2 = 4.6/2 = 2.3.
QUESTION 10
The Houston Rockets' 95% CI for mean 3-point attempts is (38.76, 41.87). Based on this, can we conclude the true mean differs from 40?
  A. Yes, because 40 is not the midpoint
  B. No, because 40 falls inside the interval
  C. Yes, because the interval is entirely above 27
  D. No, because we need a hypothesis test for that
Answer:
B. Since 40 is inside the CI, it is a plausible value for μ. We cannot conclude the mean differs from 40.

One-Sample Significance Tests (Q11–Q20)

QUESTION 11
A researcher tests H₀: μ = 50 vs. Hₐ: μ > 50. The p-value is 0.03. At α = 0.05, the conclusion is:
  A. Fail to reject H₀; no evidence μ > 50
  B. Reject H₀; convincing evidence μ > 50
  C. Accept H₀; μ = 50
  D. The test is inconclusive
Answer:
B. p = 0.03 < α = 0.05, so we reject H₀. There is convincing evidence that μ > 50.
QUESTION 12
For a two-sided test (Hₐ: μ ≠ μ₀), the p-value is found by:
  A. The area in one tail beyond the test statistic
  B. Doubling the area in one tail beyond |t|
  C. Subtracting the t-statistic from 1
  D. Using the area between −t and +t
Answer:
B. For a two-sided test, p-value = 2 × P(T > |t|), accounting for both tails.
QUESTION 13
A Type I error in the Houston Rockets example would be:
  A. Concluding the Rockets averaged ≤ 27 attempts when they actually averaged more
  B. Concluding the Rockets averaged > 27 attempts when they actually didn't
  C. Using the wrong significance level
  D. Using a two-sided test instead of one-sided
Answer:
B. Type I error = rejecting H₀ when H₀ is true = concluding μ > 27 when actually μ = 27 (false positive).
QUESTION 14
A test statistic of t = −2.5 with df = 20 (one-sided left test). The p-value is:
  A. Greater than 0.05
  B. About 0.5
  C. Less than 0.05
  D. Equal to 0.025
Answer:
C. For df = 20, the critical value at α = 0.05 is about −1.725. Since −2.5 < −1.725, the p-value < 0.05.
QUESTION 15
Which of the following statements about the p-value is TRUE?
  A. A small p-value proves H₀ is false
  B. The p-value is the probability H₀ is true
  C. A large p-value proves H₀ is true
  D. The p-value is calculated assuming H₀ is true
Answer:
D. The p-value is computed assuming H₀ is true. It measures how unlikely the observed result is if H₀ were true — but it does not prove or disprove H₀.
QUESTION 16
Decreasing α from 0.05 to 0.01 will:
  A. Increase the chance of a Type I error
  B. Decrease the chance of a Type I error but increase the chance of a Type II error
  C. Decrease both Type I and Type II errors
  D. Increase the power of the test
Answer:
B. Lower α → harder to reject H₀ → fewer false positives (Type I ↓) but more false negatives (Type II ↑, power ↓).
QUESTION 17
For the Rockets test (t = 17.03, df = 81, one-sided), which statement is most accurate?
  A. The result is marginally significant
  B. The result is extremely statistically significant with p ≈ 0
  C. We fail to reject H₀ because the sample size is too large
  D. The result would only be significant with a two-sided test
Answer:
B. A t-statistic of 17.03 is enormous. The p-value is essentially 0, providing overwhelming evidence against H₀.
QUESTION 18
Which phrasing correctly concludes a significance test when we fail to reject H₀?
  A. "We accept H₀; the null hypothesis is true."
  B. "We prove there is no effect."
  C. "We fail to reject H₀; there is not convincing evidence of [Hₐ]."
  D. "The data supports H₀ with 95% confidence."
Answer:
C. We never "accept" or "prove" H₀. "Fail to reject" means only that evidence against H₀ was insufficient, not that H₀ is true.
QUESTION 19
A Type II error in a drug study would mean:
  A. Approving an ineffective drug
  B. Rejecting an effective drug
  C. Using the wrong sample size
  D. Misinterpreting the confidence interval
Answer:
B. Type II error = failing to reject H₀ when it is false = concluding the drug doesn't work when it actually does (false negative). Very costly in medicine.
QUESTION 20
To increase the power of a one-sample t-test, a researcher should:
  A. Decrease the sample size
  B. Use a smaller significance level (α)
  C. Increase the sample size
  D. Use a two-sided test instead of one-sided
Answer:
C. Larger n → smaller SE → larger test statistic for the same true effect → higher power (more likely to detect a real effect).

Two-Sample & Paired Confidence Intervals (Q21–Q30)

QUESTION 21
In the cereal toy study, subjects were matched by age group. The appropriate procedure is:
  A. Two-sample t-test, because there are two groups
  B. Paired t-procedure, because subjects are matched by age
  C. One-sample t-test, because each age group is one unit
  D. Chi-square test, because the data are counts
Answer:
B. The design is matched pairs (one from each age group, one gets toy coupon). The appropriate procedure is the paired t-procedure.
QUESTION 22
For the supermarket study (paired, n=34, d̄=−0.7, s_d=4.82), the 95% CI is (−2.38, 0.98). What does this tell us?
  A. There is a significant difference in experience at the 5% level
  B. There is no significant difference because 0 is in the interval
  C. WF employees have significantly more experience
  D. M employees have significantly more experience
Answer:
B. Since 0 is inside the interval (−2.38, 0.98), a difference of 0 is plausible. We cannot conclude there is a significant difference.
QUESTION 23
For paired data with n = 20 pairs, the degrees of freedom are:
  A. 38
  B. 19
  C. 20
  D. 18
Answer:
B — 19. Paired data is treated as a one-sample procedure on the differences. df = n − 1 = 20 − 1 = 19.
QUESTION 24
Two independent groups: n₁ = 30, n₂ = 25. Using the conservative df for a two-sample t-procedure:
  A. df = 53
  B. df = 55
  C. df = 24
  D. df = 29
Answer:
C — 24. Conservative df = min(n₁−1, n₂−1) = min(29, 24) = 24.
QUESTION 25
A 95% CI for μ₁ − μ₂ is (2.1, 8.3). We can conclude:
  A. μ₁ = μ₂ is plausible
  B. μ₁ is significantly greater than μ₂ at the 5% level
  C. μ₂ is significantly greater than μ₁
  D. The difference is not significant
Answer:
B. Since the entire interval is above 0, both bounds are positive. Zero is not plausible, so μ₁ > μ₂ at the 5% significance level.
QUESTION 26
Why did the cereal study match subjects by age group rather than use two independent groups?
  A. To make the study cheaper
  B. Age is unrelated to cereal purchases
  C. To control for age as a confounding variable, reducing variability
  D. The study required an even number of subjects
Answer:
C. Age likely affects how much cereal people buy. Matching by age removes this source of variability, making the test more sensitive to the actual toy effect.
QUESTION 27
The SAT prep course 95% CI for mean improvement is (17.59, 43.91). Does this support the claim of >30 point improvement?
  A. Yes, because the mean is 30.75 > 30
  B. Yes, because the lower bound (17.59) is positive
  C. No, because 30 is inside the interval
  D. No, because the upper bound exceeds 30
Answer:
C. Since 30 is inside the CI, a true mean improvement of 30 is plausible. We cannot rule out that μ_d = 30, so we have no evidence the improvement exceeds 30 at the 95% level.
QUESTION 28
In a two-sample t-interval, the standard error is √(s₁²/n₁ + s₂²/n₂). This formula comes from:
  A. Adding the two sample standard deviations
  B. The variance of a difference of independent random variables
  C. The pooled standard deviation formula
  D. The Central Limit Theorem only
Answer:
B. Var(X̄₁ − X̄₂) = Var(X̄₁) + Var(X̄₂) = σ₁²/n₁ + σ₂²/n₂ (since samples are independent). We estimate σ with s.
QUESTION 29
Which scenario requires a paired t-procedure (not two-sample)?
  A. Comparing cholesterol levels of 50 patients on Drug A vs. 50 different patients on Drug B
  B. Comparing SAT scores before and after a prep course for the same students
  C. Comparing average salaries at two different companies
  D. Comparing heights of students at two different schools
Answer:
B. Before-and-after measurements on the same subjects = paired data. The other scenarios involve independent groups.
QUESTION 30
A paired CI is preferred over a two-sample CI when:
  A. The sample sizes are large
  B. There is a natural pairing that reduces variability between pairs
  C. The two groups have equal standard deviations
  D. The researcher prefers a wider interval
Answer:
B. Pairing is beneficial when the pairing variable (like age) is associated with the response. Working with within-pair differences removes the variability due to that variable and yields a more precise estimate.

Two-Sample & Paired Significance Tests (Q31–Q40)

QUESTION 31
In the cholesterol study, the null hypothesis H₀: μ_new = μ_std means:
  A. The new drug increases cholesterol more
  B. Both drugs reduce cholesterol by the same average amount
  C. The new drug is superior
  D. Cholesterol levels are the same before treatment
Answer:
B. H₀ states there is no difference in mean cholesterol reduction between the two drugs — both produce the same average effect.
QUESTION 32
The cholesterol study t-statistic is t ≈ 3.92 with df = 49 (one-sided). The appropriate conclusion (α = 0.05) is:
  A. Fail to reject H₀; no evidence the new drug is better
  B. Reject H₀; convincing evidence the new drug reduces cholesterol more
  C. Accept H₀; the drugs are equally effective
  D. The test is inconclusive without more data
Answer:
B. t = 3.92 gives p < 0.001 < 0.05. We reject H₀ and conclude the new drug is significantly more effective.
QUESTION 33
In the cereal toy paired test (d̄ = 1.857, s_d = 2.445, n = 14), what is the test statistic?
  A. t = 0.76
  B. t = 1.86
  C. t = 2.84
  D. t = 4.12
Answer:
C — t ≈ 2.84. t = d̄ / (s_d/√n) = 1.857 / (2.445/√14) = 1.857/0.653 ≈ 2.84.
QUESTION 34
The cereal test p-value is 0.007. How do you interpret this?
  A. There is a 0.7% chance people buy more cereal with a toy coupon
  B. If the toy has no effect, there is a 0.7% chance of seeing a mean difference this large or larger by chance
  C. The toy coupon increases sales by 0.7%
  D. H₀ is true with 99.3% probability
Answer:
B. The p-value is the probability of the observed result (or more extreme) assuming H₀ is true (no effect). It is NOT the probability H₀ is true.
QUESTION 35
Why can't you use a two-sample t-test for the SAT prep data?
  A. The sample size is too small
  B. The scores are not Normal
  C. The "before" and "after" scores come from the same students, so they're paired, not independent
  D. SAT scores can't be used in t-tests
Answer:
C. The same students were measured before and after. The scores are not independent — the paired t-test is required to properly account for the within-student correlation.
QUESTION 36
If the two-sample t-test for cholesterol gave p = 0.0002, what is the correct interpretation?
  A. The new drug works 99.98% of the time
  B. If both drugs were equally effective, there is a 0.02% chance of observing this large a difference by chance
  C. H₀ is false with probability 0.9998
  D. The new drug reduces cholesterol by 0.02% more
Answer:
B. p-value interpretation: assuming H₀ is true (equal means), the probability of observing a difference this extreme is 0.0002 — extremely unlikely, strong evidence against H₀.
QUESTION 37
A Type I error in the cereal toy study would mean:
  A. Concluding the toy increases sales when it actually doesn't
  B. Concluding the toy has no effect when it actually does
  C. Using paired instead of two-sample procedure
  D. Having too few subjects in the study
Answer:
A. Type I error = rejecting H₀ when it is true = concluding the toy increases sales (Hₐ) when in reality μ_d = 0 (H₀ true).
QUESTION 38
For a two-sample test with n₁ = n₂ = 50, which change would most increase power?
  A. Decrease α from 0.05 to 0.01
  B. Use a two-sided instead of one-sided test
  C. Increase each sample to n = 100
  D. Use the conservative df instead of Welch df
Answer:
C. Doubling the sample size reduces the SE, making the test statistic larger and power higher. The other options either decrease power or have minimal effect.
QUESTION 39
In a two-sample t-test, the Random condition requires:
  A. Only one sample to be randomly selected
  B. Both samples to be random, or subjects randomly assigned to treatments
  C. The samples to be the same size
  D. The populations to be Normally distributed
Answer:
B. Both groups must come from random samples or a randomized experiment. Random assignment (as in the cholesterol study) satisfies this condition.
QUESTION 40
A researcher finds p = 0.048 with a very large sample (n = 10,000). The best conclusion is:
  A. The effect is large and practically important since p < 0.05
  B. The result is statistically significant but the effect size may be trivially small
  C. The result is not significant at α = 0.05
  D. A larger sample is needed to confirm
Answer:
B. With huge samples, even tiny differences become statistically significant. Always consider effect size alongside p-value. Statistical significance ≠ practical importance.

Rapid Review Sheet

All t-Interval Formulas

  • One-sample: x̄ ± t*(s/√n)
  • Paired: d̄ ± t*(s_d/√n)
  • Two-sample: (x̄₁−x̄₂) ± t*√(s₁²/n₁+s₂²/n₂)
  • df (one/paired): n − 1
  • df (two-sample): min(n₁−1, n₂−1)

All t-Test Statistics

  • One-sample: t = (x̄ − μ₀)/(s/√n)
  • Paired: t = d̄/(s_d/√n)
  • Two-sample: t = (x̄₁−x̄₂)/√(s₁²/n₁+s₂²/n₂)
  • One-sample H₀: μ = μ₀
  • Paired H₀: μ_d = 0
  • Two-sample H₀: μ₁ = μ₂

RNI Conditions Checklist

  • Random sample or random assignment
  • Normal: n ≥ 30, or pop. Normal, or no strong skew/outliers
  • Independent: obs. independent; n ≤ 10% of pop.
  • Two-sample: also need groups independent of each other

Conclusion Templates

  • Reject H₀: "Since p < α, we reject H₀. There is convincing evidence that [Hₐ in context]."
  • FTR H₀: "Since p ≥ α, we fail to reject H₀. There is not convincing evidence that [Hₐ in context]."
  • CI: "We are [C]% confident the true mean [context] is between [L] and [U]."

Error Types

  • Type I: Reject H₀ when true (false +). Prob = α.
  • Type II: FTR H₀ when false (false −). Prob = β.
  • Power: 1 − β, the probability of correctly rejecting a false H₀.
  • n↑ → Power↑, β↓, interval width↓
  • α↓ → Type I↓ but Type II↑ (power↓)

Paired vs. Two-Sample

  • Paired: Same subject twice, OR subjects matched. Compute differences d = x₁ − x₂.
  • Two-sample: Two unrelated, independent groups.
  • Key signal: "Before/after," "matched by [variable]" → paired
  • CI contains 0 → not significant at that level
  • CI entirely above 0 → group 1 significantly higher