A confidence interval gives a range of plausible values for a population mean μ. A 95% CI means: if we repeated sampling many times, 95% of the intervals we construct would contain the true μ.
Conditions (RNI)
🎲 Random
Data comes from a random sample or randomized experiment.
🔔 Normal
n ≥ 30, OR population is Normal, OR histogram shows no strong skew/outliers (small n).
🔗 Independent
Individual observations are independent. If sampling without replacement: n ≤ 10% of population.
Formula
One-Sample t-Interval for μ
x̄ ± t* · (s / √n), df = n − 1
x̄ = sample mean
s = sample standard deviation
n = sample size
t* = critical value from t-table with df = n−1 and desired confidence level
Margin of Error = t* · (s/√n)
Steps
- State: Define the parameter. "Let μ = the true mean [context]."
- Plan: Name the procedure (one-sample t-interval). Check RNI conditions.
- Calculate: Find x̄, s, n. Look up t* (df = n−1). Compute interval: x̄ ± t*(s/√n).
- Conclude: "We are [C]% confident the true mean [context] is between [L] and [U]."
Worked Example — Houston Rockets
Example from Unit 7 Review
During the 2016–2017 NBA season, the Houston Rockets averaged 40.317 three-point attempts per game with a standard deviation of 7.078 over 82 games. Construct a 95% confidence interval for the Rockets' true mean three-point attempts per game.
Conditions: Random ✓ (treat games as random sample of performance); Normal ✓ (n = 82 ≥ 30); Independent ✓ (82 games < 10% of all NBA games played).
Calculate: df = 81, t* ≈ 1.990
ME = 1.990 × (7.078/√82) = 1.990 × 0.782 = 1.556
CI = 40.317 ± 1.556 = (38.76, 41.87)
Conclude: We are 95% confident the Rockets' true mean three-point attempts per game during that season was between 38.76 and 41.87.
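The Calculate step above can be checked with a few lines of Python. This is a minimal sketch, not part of the original review: the function name `one_sample_t_interval` is made up for illustration, and t* = 1.990 is taken from the example rather than computed.

```python
import math

def one_sample_t_interval(xbar, s, n, t_star):
    """Return (lower, upper, margin_of_error) for xbar ± t* · (s / sqrt(n))."""
    standard_error = s / math.sqrt(n)
    margin = t_star * standard_error
    return xbar - margin, xbar + margin, margin

# Rockets summary statistics from the example; t* = 1.990 is the table value used above.
lower, upper, margin = one_sample_t_interval(xbar=40.317, s=7.078, n=82, t_star=1.990)
print(f"ME = {margin:.3f}, CI = ({lower:.2f}, {upper:.2f})")
```

Carrying full precision gives a margin of error of about 1.555 rather than 1.556, because the worked example rounds the standard error to 0.782 first. The interval still rounds to (38.76, 41.87).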
Key Facts
Wider interval when:
- Higher confidence level (larger t*)
- More variability (larger s)
- Smaller sample size (smaller n)
Narrower interval when:
- Lower confidence level
- Less variability
- Larger sample size
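A short sketch can make these width rules concrete. The sample below is hypothetical (n = 25, s = 10), and the t* values are standard t-table entries supplied by hand, not computed by the code.

```python
import math

def margin_of_error(t_star, s, n):
    """Margin of error t* · (s / sqrt(n)) for a one-sample t-interval."""
    return t_star * s / math.sqrt(n)

# Standard t-table critical values for df = 24 (n = 25), keyed by confidence level.
T_STAR_DF24 = {90: 1.711, 95: 2.064, 99: 2.797}

baseline = margin_of_error(T_STAR_DF24[95], s=10, n=25)           # 95%, s = 10, n = 25
higher_confidence = margin_of_error(T_STAR_DF24[99], s=10, n=25)  # larger t*
more_variability = margin_of_error(T_STAR_DF24[95], s=20, n=25)   # larger s
# With n = 100 the df becomes 99, where the 95% table value is about 1.984.
larger_sample = margin_of_error(1.984, s=10, n=100)               # larger n

for label, value in [("baseline", baseline),
                     ("99% confidence", higher_confidence),
                     ("s doubled", more_variability),
                     ("n = 100", larger_sample)]:
    print(f"{label:>15}: ME = {value:.3f}")
```

Raising the confidence level or the standard deviation increases the margin of error, while quadrupling the sample size roughly halves it.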
A significance test assesses the evidence against a null hypothesis. We ask: assuming H₀ is true, how likely is our sample result? If the probability (p-value) is small, we have evidence against H₀.
Hypotheses
Null Hypothesis (H₀): μ = μ₀
The population mean equals a specific value. This is what we assume is true.
Alternative Hypothesis (Hₐ):
μ > μ₀ (right-tailed)
μ < μ₀ (left-tailed)
μ ≠ μ₀ (two-tailed)
Formula
One-Sample t-Statistic
t = (x̄ − μ₀) / (s / √n), df = n − 1
Steps (SPCC)
- State: H₀ and Hₐ in context; significance level α (usually 0.05).
- Plan: Name the test (one-sample t-test). Check RNI conditions.
- Calculate: Compute t-statistic. Find p-value from t-table (df = n−1).
- Conclude: Compare p-value to α. State decision and interpretation in context.
Conclusion Templates
Reject H₀: "Since p = [value] < α = 0.05, we reject H₀. There is convincing evidence that [Hₐ in context]."
Fail to Reject H₀: "Since p = [value] ≥ α = 0.05, we fail to reject H₀. There is not convincing evidence that [Hₐ in context]."
Type I and Type II Errors
| | H₀ is actually TRUE | H₀ is actually FALSE |
|---|---|---|
| Reject H₀ | Type I Error (α): false positive | Correct ✓ (Power) |
| Fail to Reject H₀ | Correct ✓ | Type II Error (β): false negative |
Power = 1 − β = probability of correctly rejecting a false H₀. Power increases with larger sample size, larger effect size, and higher α.
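One way to see what α and power mean is to simulate many samples and count how often the test rejects. The sketch below is illustrative only: the population (Normal with σ = 10), the means 50 and 55, and the helper name `rejection_rate` are all assumptions, and the critical value 1.711 is the one-sided α = 0.05 table entry for df = 24.

```python
import math
import random
import statistics

def rejection_rate(true_mean, mu0, sigma, n, t_crit, trials, rng):
    """Fraction of simulated samples where a right-tailed t-test rejects H0: mu = mu0."""
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        xbar = statistics.mean(sample)
        s = statistics.stdev(sample)  # sample SD (n - 1 denominator)
        t_stat = (xbar - mu0) / (s / math.sqrt(n))
        if t_stat > t_crit:
            rejections += 1
    return rejections / trials

rng = random.Random(0)   # fixed seed so the run is repeatable
T_CRIT_DF24 = 1.711      # one-sided alpha = 0.05, df = 24 (table value)

# H0 true (mu = 50): every rejection is a Type I error.
type_i_rate = rejection_rate(50, 50, sigma=10, n=25, t_crit=T_CRIT_DF24, trials=4000, rng=rng)
# H0 false (mu = 55): every rejection is a correct decision, so this estimates power.
power = rejection_rate(55, 50, sigma=10, n=25, t_crit=T_CRIT_DF24, trials=4000, rng=rng)

print(f"Estimated Type I error rate: {type_i_rate:.3f}")
print(f"Estimated power: {power:.3f}")
```

The estimated Type I error rate should land near α = 0.05, and the estimated power should be much higher because the true mean really does exceed μ₀.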
Worked Example — Houston Rockets
Example from Unit 7 Review
The NBA average was 27 three-point attempts per game. Did the Rockets attempt significantly more? (x̄ = 40.317, s = 7.078, n = 82)
H₀: μ = 27, Hₐ: μ > 27, α = 0.05
Conditions: Random ✓, Normal ✓ (n = 82 ≥ 30), Independent ✓
Calculate: t = (40.317 − 27) / (7.078/√82) = 13.317 / 0.782 ≈ 17.03
df = 81, p-value ≈ 0 (far smaller than 0.001)
Conclude: Since p ≈ 0 < 0.05, we reject H₀. There is convincing evidence the Rockets averaged more than 27 three-point attempts per game.
Potential Error: Since we rejected H₀, a Type I error is possible — concluding the Rockets attempted more than average when they actually didn't. Consequence: overestimating the team's three-point strategy impact.
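The test statistic in this example can be reproduced as follows. This is a sketch under stated assumptions: the function name is hypothetical, and the critical value 1.664 is an approximate one-sided α = 0.05 table entry for df near 81 (tables usually list df = 80), supplied by hand.

```python
import math

def one_sample_t_statistic(xbar, mu0, s, n):
    """t = (xbar - mu0) / (s / sqrt(n)), with df = n - 1."""
    return (xbar - mu0) / (s / math.sqrt(n))

t_stat = one_sample_t_statistic(xbar=40.317, mu0=27, s=7.078, n=82)

# Approximate right-tail critical value for alpha = 0.05, df ≈ 80 (assumed table value).
T_CRIT = 1.664
reject_h0 = t_stat > T_CRIT

print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```

At full precision the statistic is about 17.04. The worked example reports 17.03 because it rounds the standard error to 0.782 before dividing. Either way the statistic is far beyond the critical value.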
When comparing two groups, we estimate μ₁ − μ₂ (or μ_d for paired data). The key question: Are the groups paired (matched) or independent?
Paired (Matched Pairs):
- Same subject measured twice (before/after)
- Subjects deliberately matched (age groups, siblings)
- Calculate differences d = x₁ − x₂ for each pair
- Treat differences as one sample
Two Independent Samples:
- Two separate, unrelated groups
- Random assignment to two treatments
- No natural pairing between observations
Paired t-Interval Formula
Paired t-Interval for μ_d
d̄ ± t* · (s_d / √n), df = n − 1 (n = number of pairs)
Two-Sample t-Interval Formula
Two-Sample t-Interval for μ₁ − μ₂
(x̄₁ − x̄₂) ± t* · √(s₁²/n₁ + s₂²/n₂), df ≈ min(n₁−1, n₂−1)
Worked Example — Supermarket Employees (Paired)
Example from Unit 7 Review
n = 34 matched pairs, d̄ = −0.7 years, s_d = 4.82. Construct a 95% CI for the mean difference in experience (M − WF).
df = 33, t* ≈ 2.035
SE = 4.82/√34 = 0.827
CI = −0.7 ± 2.035(0.827) = −0.7 ± 1.683 = (−2.38, 0.98)
Conclude: We are 95% confident the mean difference in experience is between −2.38 and 0.98 years. Since 0 is in the interval, there is no significant difference at the 5% level.
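The link between "0 is in the 95% interval" and "not significant at the 5% level" can be checked directly: a two-sided t-test of H₀: μ_d = 0 at α = 0.05 rejects exactly when 0 falls outside the 95% t-interval. The sketch below uses the summary statistics and t* from this example; the variable names are illustrative.

```python
import math

# Summary statistics and critical value from the supermarket example.
n, d_bar, s_d = 34, -0.7, 4.82
t_star = 2.035  # 95% confidence, df = 33

standard_error = s_d / math.sqrt(n)
lower = d_bar - t_star * standard_error
upper = d_bar + t_star * standard_error
zero_in_interval = lower <= 0 <= upper

# Equivalent two-sided test of H0: mu_d = 0 at alpha = 0.05.
t_stat = d_bar / standard_error
significant_two_sided = abs(t_stat) > t_star

print(f"CI = ({lower:.2f}, {upper:.2f}), contains 0: {zero_in_interval}")
print(f"t = {t_stat:.3f}, significant at 5%: {significant_two_sided}")
```

Both checks agree: the interval contains 0 and |t| is well below t*, so the data give no convincing evidence of a difference.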
Worked Example — SAT Prep Course (Paired)
Example from Unit 7 Review
12 students. Score differences (after − before): 43, 13, −7, 47, 61, 39, 14, 37, 17, 47, 49, 9.
d̄ = 369/12 = 30.75, s_d ≈ 20.71
95% CI: df = 11, t* ≈ 2.201
SE = 20.71/√12 = 5.98
CI = 30.75 ± 2.201(5.98) = 30.75 ± 13.16 = (17.59, 43.91)
Conclude: We are 95% confident mean improvement is between 17.59 and 43.91 points. Since 30 is inside this interval, we cannot say at 95% confidence that improvement exceeds 30 points.
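Starting from the raw differences, the same interval can be rebuilt with Python's `statistics` module. This is a minimal sketch; t* = 2.201 is copied from the example rather than computed.

```python
import math
import statistics

# Score differences (after - before) for the 12 students in the example.
differences = [43, 13, -7, 47, 61, 39, 14, 37, 17, 47, 49, 9]

n = len(differences)
d_bar = statistics.mean(differences)   # mean difference
s_d = statistics.stdev(differences)    # sample SD of the differences (n - 1 denominator)

t_star = 2.201                         # 95% confidence, df = 11 (from the example)
margin = t_star * s_d / math.sqrt(n)
lower, upper = d_bar - margin, d_bar + margin

# The claim "mean improvement exceeds 30" is supported only if the whole interval is above 30.
claim_supported = lower > 30

print(f"d_bar = {d_bar}, s_d = {s_d:.2f}")
print(f"95% CI = ({lower:.2f}, {upper:.2f}), supports > 30: {claim_supported}")
```

Treating the differences as a single sample is what makes this a paired procedure: no step in the calculation uses the separate before and after scores.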
Paired t-Test
Paired t-Statistic
t = d̄ / (s_d / √n), H₀: μ_d = 0, df = n − 1
Two-Sample t-Test
Two-Sample t-Statistic
t = (x̄₁ − x̄₂) / √(s₁²/n₁ + s₂²/n₂), H₀: μ₁ = μ₂, df ≈ min(n₁−1, n₂−1)
Conditions for Two-Sample Tests
- Random: Both samples are random, or subjects were randomly assigned to groups.
- Normal: Both groups satisfy the Normal condition (n ≥ 30 each, or approximately Normal distribution).
- Independent: The two groups are independent of each other (this is what distinguishes two-sample from paired). Each sample satisfies the 10% condition.
Worked Example — Cholesterol Drug (Two-Sample)
Example from Unit 7 Review
Standard drug: n₁ = 50, x̄₁ = 10 mg/dl, s₁ = 8 | New drug: n₂ = 50, x̄₂ = 18 mg/dl, s₂ = 12
H₀: μ_new = μ_std, Hₐ: μ_new > μ_std, α = 0.05
Conditions: Random ✓ (randomly assigned), Normal ✓ (n = 50 ≥ 30 each), Independent ✓ (separate groups).
Calculate:
t = (18 − 10) / √(12²/50 + 8²/50) = 8 / √(144/50 + 64/50) = 8 / √4.16 = 8 / 2.039 ≈ 3.92
df ≈ min(49, 49) = 49, p-value < 0.001
Conclude: Since p < 0.001 < 0.05, we reject H₀. There is convincing evidence the new drug reduces cholesterol more than the standard treatment.
Interpret p-value: If the two drugs were equally effective, there is less than a 0.1% chance of observing a difference this large or larger purely by chance.
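The two-sample calculation can be sketched as below. The function name is hypothetical. Alongside the conservative df used in this review, the code also computes the Welch-Satterthwaite df that calculators and software typically report; that formula is an addition for comparison, not something the worked example uses.

```python
import math

def two_sample_t(xbar1, s1, n1, xbar2, s2, n2):
    """Return (t, conservative_df, welch_df) for H0: mu1 = mu2 without pooling variances."""
    v1, v2 = s1**2 / n1, s2**2 / n2           # estimated variance of each sample mean
    standard_error = math.sqrt(v1 + v2)
    t_stat = (xbar1 - xbar2) / standard_error
    conservative_df = min(n1 - 1, n2 - 1)
    # Welch-Satterthwaite approximation (not used in the worked example).
    welch_df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t_stat, conservative_df, welch_df

# New drug listed first so that a positive t supports Ha: mu_new > mu_std.
t_stat, conservative_df, welch_df = two_sample_t(18, 12, 50, 10, 8, 50)
print(f"t = {t_stat:.2f}, conservative df = {conservative_df}, Welch df = {welch_df:.1f}")
```

The conservative df (49) is smaller than the Welch df (about 85), so it gives a slightly larger p-value. With t near 3.92 the conclusion is the same under either choice.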
Worked Example — Cereal Toy (Paired)
Example from Unit 7 Review
14 age-group pairs. Each pair: one got toy coupon, one didn't. Differences (toy − no toy): 2, 4, 0, −1, −1, 5, 6, 6, 2, 0, 1, 1, 0, 1.
d̄ = 26/14 ≈ 1.857, s_d ≈ 2.445
H₀: μ_d = 0, Hₐ: μ_d > 0, α = 0.05
Calculate: t = 1.857 / (2.445/√14) = 1.857 / 0.653 ≈ 2.84, df = 13, p ≈ 0.007
Conclude: Since p ≈ 0.007 < 0.05, we reject H₀. There is convincing evidence that including a toy coupon increases cereal purchases.
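For readers without a t-table or calculator at hand, the p-value can be approximated by integrating the t density numerically. The sketch below uses only the standard library; the helper names `t_pdf` and `upper_tail_p` are made up for illustration, and Simpson's rule is one reasonable choice of method, not the approach used in the original review.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coeff = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_coeff - (df + 1) / 2 * math.log1p(x * x / df))

def upper_tail_p(t_stat, df, steps=2000):
    """Approximate P(T > t_stat) for t_stat >= 0 via Simpson's rule on [0, t_stat]."""
    if t_stat < 0:
        raise ValueError("this sketch only handles t_stat >= 0")
    if t_stat == 0:
        return 0.5
    h = t_stat / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t_stat, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    return max(0.0, 0.5 - area_zero_to_t)  # the density is symmetric about 0

# Differences (toy - no toy) for the 14 age-group pairs.
differences = [2, 4, 0, -1, -1, 5, 6, 6, 2, 0, 1, 1, 0, 1]
n = len(differences)
d_bar = statistics.mean(differences)
s_d = statistics.stdev(differences)
t_stat = d_bar / (s_d / math.sqrt(n))
p_value = upper_tail_p(t_stat, df=n - 1)

print(f"t = {t_stat:.2f}, df = {n - 1}, one-sided p ≈ {p_value:.4f}")
```

The result should agree with the example: t is about 2.84 and the one-sided p-value falls between 0.005 and 0.01.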
One-Sample Confidence Intervals (Q1–Q10)
QUESTION 1
Which of the following correctly interprets a 95% confidence interval of (38.76, 41.87) for μ?
- A. There is a 95% probability that μ is between 38.76 and 41.87.
- B. 95% of the sample data falls between 38.76 and 41.87.
- C. We are 95% confident the true population mean is between 38.76 and 41.87.
- D. If we repeat the study, μ will be in this interval 95% of the time.
Answer: C. A confidence interval gives a range of plausible values for the population parameter. The 95% describes the method: about 95% of intervals built this way contain μ, but for any one interval, μ either is or isn't in it.
QUESTION 2
A random sample of n = 16 has x̄ = 52 and s = 8. What is the degrees of freedom for a t-interval?
- A. 8
- B. 15
- C. 16
- D. 17
Answer: B (15). df = n − 1 = 16 − 1 = 15 for a one-sample t-procedure.
QUESTION 3
All else equal, increasing the sample size will:
- A. Widen the confidence interval
- B. Narrow the confidence interval
- C. Have no effect on the confidence interval
- D. Change the confidence level
Answer: B. Larger n → smaller SE = s/√n → smaller margin of error → narrower interval.
QUESTION 4
Which condition is NOT required for a one-sample t-interval?
- A. The data come from a random sample
- B. The population standard deviation σ is known
- C. The sampling distribution of x̄ is approximately Normal
- D. Observations are independent
Answer: B. t-procedures use the sample standard deviation s, so we do NOT need to know σ. That's the whole point of using t instead of z.
QUESTION 5
A 99% confidence interval will be _________ a 95% confidence interval, for the same data.
- A. Narrower than
- B. The same width as
- C. Wider than
- D. Cannot be determined
Answer: C (wider than). Higher confidence requires a larger t*, which increases the margin of error.
QUESTION 6
For a one-sample t-interval with n = 25, x̄ = 100, s = 10, what is the standard error?
- A. 10
- B. 4
- C. 2
- D. 0.4
Answer: C (2). SE = s/√n = 10/√25 = 10/5 = 2.
QUESTION 7
A student with n = 10 (skewed right distribution) wants to construct a t-interval. Which condition is most at risk?
- A. Random
- B. Normal
- C. Independent
- D. None: all conditions are met
Answer: B (Normal). With n = 10 (small) and a skewed distribution, the sampling distribution of x̄ may not be approximately Normal. A graph of the data would need to show no strong skew or outliers before a t-interval could be trusted.
QUESTION 8
A 95% CI for the mean height of students is (64.2, 68.8) inches. What is the sample mean?
- A. 64.2
- B. 68.8
- C. 66.5
- D. 4.6
Answer: C (66.5). x̄ = (lower + upper)/2 = (64.2 + 68.8)/2 = 133/2 = 66.5.
QUESTION 9
From the same CI (64.2, 68.8), what is the margin of error?
- A. 4.6
- B. 2.3
- C. 66.5
- D. 1.15
Answer: B (2.3). ME = (upper − lower)/2 = (68.8 − 64.2)/2 = 4.6/2 = 2.3.
QUESTION 10
The Houston Rockets' 95% CI for mean 3-point attempts is (38.76, 41.87). Based on this, can we conclude the true mean differs from 40?
- A. Yes, because 40 is not the midpoint
- B. No, because 40 falls inside the interval
- C. Yes, because the interval is entirely above 27
- D. No, because we need a hypothesis test for that
Answer: B. Since 40 is inside the CI, it is a plausible value for μ. We cannot conclude the mean differs from 40.
One-Sample Significance Tests (Q11–Q20)
QUESTION 11
A researcher tests H₀: μ = 50 vs. Hₐ: μ > 50. The p-value is 0.03. At α = 0.05, the conclusion is:
- A. Fail to reject H₀; no evidence μ > 50
- B. Reject H₀; convincing evidence μ > 50
- C. Accept H₀; μ = 50
- D. The test is inconclusive
Answer: B. p = 0.03 < α = 0.05, so we reject H₀. There is convincing evidence that μ > 50.
QUESTION 12
For a two-sided test (Hₐ: μ ≠ μ₀), the p-value is found by:
- A. The area in one tail beyond the test statistic
- B. Doubling the area in one tail beyond |t|
- C. Subtracting the t-statistic from 1
- D. Using the area between −t and +t
Answer: B. For a two-sided test, p-value = 2 × P(T > |t|), accounting for both tails.
QUESTION 13
A Type I error in the Houston Rockets example would be:
- A. Concluding the Rockets averaged ≤ 27 attempts when they actually averaged more
- B. Concluding the Rockets averaged > 27 attempts when they actually didn't
- C. Using the wrong significance level
- D. Using a two-sided test instead of one-sided
Answer: B. Type I error = rejecting H₀ when H₀ is true = concluding μ > 27 when actually μ = 27 (false positive).
QUESTION 14
A test statistic of t = −2.5 with df = 20 (one-sided left test). The p-value is:
- A. Greater than 0.05
- B. About 0.5
- C. Less than 0.05
- D. Equal to 0.025
Answer: C. For df = 20, the left-tail critical value at α = 0.05 is about −1.725. Since −2.5 < −1.725, the test statistic lies farther into the tail, so the p-value is less than 0.05.
QUESTION 15
Which of the following statements about the p-value is TRUE?
- A. A small p-value proves H₀ is false
- B. The p-value is the probability H₀ is true
- C. A large p-value proves H₀ is true
- D. The p-value is calculated assuming H₀ is true
Answer: D. The p-value is computed assuming H₀ is true. It measures how unlikely the observed result is if H₀ were true, but it does not prove or disprove H₀.
QUESTION 16
Decreasing α from 0.05 to 0.01 will:
- A. Increase the chance of a Type I error
- B. Decrease the chance of a Type I error but increase the chance of a Type II error
- C. Decrease both Type I and Type II errors
- D. Increase the power of the test
Answer: B. Lower α → harder to reject H₀ → fewer false positives (Type I ↓) but more false negatives (Type II ↑, power ↓).
QUESTION 17
For the Rockets test (t = 17.03, df = 81, one-sided), which statement is most accurate?
- A. The result is marginally significant
- B. The result is extremely statistically significant with p ≈ 0
- C. We fail to reject H₀ because the sample size is too large
- D. The result would only be significant with a two-sided test
Answer: B. A t-statistic of 17.03 is enormous. The p-value is essentially 0, providing overwhelming evidence against H₀.
QUESTION 18
Which phrasing correctly concludes a significance test when we fail to reject H₀?
- A. "We accept H₀; the null hypothesis is true."
- B. "We prove there is no effect."
- C. "We fail to reject H₀; there is not convincing evidence of [Hₐ]."
- D. "The data supports H₀ with 95% confidence."
Answer: C. We never "accept" or "prove" H₀. "Fail to reject" means only that evidence against H₀ was insufficient, not that H₀ is true.
QUESTION 19
A Type II error in a drug study would mean:
- A. Approving an ineffective drug
- B. Rejecting an effective drug
- C. Using the wrong sample size
- D. Misinterpreting the confidence interval
Answer: B. Here H₀ is that the drug has no effect. Type II error = failing to reject H₀ when it is false = concluding the drug doesn't work when it actually does (false negative). Very costly in medicine.
QUESTION 20
To increase the power of a one-sample t-test, a researcher should:
- A. Decrease the sample size
- B. Use a smaller significance level (α)
- C. Increase the sample size
- D. Use a two-sided test instead of one-sided
Answer: C. Larger n → smaller SE → larger test statistic for the same true effect → higher power (more likely to detect a real effect).
Two-Sample & Paired Confidence Intervals (Q21–Q30)
QUESTION 21
In the cereal toy study, subjects were matched by age group. The appropriate procedure is:
- A. Two-sample t-test, because there are two groups
- B. Paired t-procedure, because subjects are matched by age
- C. One-sample t-test, because each age group is one unit
- D. Chi-square test, because the data are counts
Answer: B. The design is matched pairs: within each age-group pair, one subject gets the toy coupon and the other doesn't. The appropriate procedure is the paired t-procedure.
QUESTION 22
For the supermarket study (paired, n=34, d̄=−0.7, s_d=4.82), the 95% CI is (−2.38, 0.98). What does this tell us?
- A. There is a significant difference in experience at the 5% level
- B. There is no significant difference because 0 is in the interval
- C. WF employees have significantly more experience
- D. M employees have significantly more experience
Answer: B. Since 0 is inside the interval (−2.38, 0.98), a difference of 0 is plausible. We cannot conclude there is a significant difference.
QUESTION 23
For paired data with n = 20 pairs, the degrees of freedom are:
- A. 38
- B. 19
- C. 20
- D. 18
Answer: B (19). Paired data is treated as a one-sample procedure on the differences. df = n − 1 = 20 − 1 = 19.
QUESTION 24
Two independent groups: n₁ = 30, n₂ = 25. Using the conservative df for a two-sample t-procedure:
- A. df = 53
- B. df = 55
- C. df = 24
- D. df = 29
Answer: C (df = 24). Conservative df = min(n₁−1, n₂−1) = min(29, 24) = 24.
QUESTION 25
A 95% CI for μ₁ − μ₂ is (2.1, 8.3). We can conclude:
- A. μ₁ = μ₂ is plausible
- B. μ₁ is significantly greater than μ₂ at the 5% level
- C. μ₂ is significantly greater than μ₁
- D. The difference is not significant
Answer: B. The entire interval lies above 0, so a difference of zero is not plausible. We conclude μ₁ > μ₂ at the 5% significance level.
QUESTION 26
Why did the cereal study match subjects by age group rather than use two independent groups?
- A. To make the study cheaper
- B. Age is unrelated to cereal purchases
- C. To control for age as a confounding variable, reducing variability
- D. The study required an even number of subjects
Answer: C. Age likely affects how much cereal people buy. Matching by age removes this source of variability, making the test more sensitive to the actual toy effect.
QUESTION 27
The SAT prep course 95% CI for mean improvement is (17.59, 43.91). Does this support the claim of >30 point improvement?
- A. Yes, because the mean is 30.75 > 30
- B. Yes, because the lower bound (17.59) is positive
- C. No, because 30 is inside the interval
- D. No, because the upper bound exceeds 30
Answer: C. Since 30 is inside the CI, a true mean improvement of 30 (or less) is plausible. We cannot rule out μ_d = 30, so there is no convincing evidence at the 95% level that the improvement exceeds 30.
QUESTION 28
In a two-sample t-interval, the standard error is √(s₁²/n₁ + s₂²/n₂). This formula comes from:
- A. Adding the two sample standard deviations
- B. The variance of a difference of independent random variables
- C. The pooled standard deviation formula
- D. The Central Limit Theorem only
Answer: B. Var(X̄₁ − X̄₂) = Var(X̄₁) + Var(X̄₂) = σ₁²/n₁ + σ₂²/n₂ (since samples are independent). We estimate each σ with the corresponding s.
QUESTION 29
Which scenario requires a paired t-procedure (not two-sample)?
- A. Comparing cholesterol levels of 50 patients on Drug A vs. 50 different patients on Drug B
- B. Comparing SAT scores before and after a prep course for the same students
- C. Comparing average salaries at two different companies
- D. Comparing heights of students at two different schools
Answer: B. Before-and-after measurements on the same subjects = paired data. The other scenarios involve independent groups.
QUESTION 30
A paired CI is preferred over a two-sample CI when:
- A. The sample sizes are large
- B. There is a natural pairing that reduces variability between pairs
- C. The two groups have equal standard deviations
- D. The researcher prefers a wider interval
Answer: B. Pairing is beneficial when the pairing variable (like age) is associated with the response. Taking differences within pairs removes that source of variability and yields a more precise estimate.
Two-Sample & Paired Significance Tests (Q31–Q40)
QUESTION 31
In the cholesterol study, the null hypothesis H₀: μ_new = μ_std means:
- A. The new drug increases cholesterol more
- B. Both drugs reduce cholesterol by the same average amount
- C. The new drug is superior
- D. Cholesterol levels are the same before treatment
Answer: B. H₀ states there is no difference in mean cholesterol reduction between the two drugs: both produce the same average effect.
QUESTION 32
The cholesterol study t-statistic is t ≈ 3.92 with df = 49 (one-sided). The appropriate conclusion (α = 0.05) is:
- A. Fail to reject H₀; no evidence the new drug is better
- B. Reject H₀; convincing evidence the new drug reduces cholesterol more
- C. Accept H₀; the drugs are equally effective
- D. The test is inconclusive without more data
Answer: B. t = 3.92 gives p < 0.001 < 0.05. We reject H₀ and conclude the new drug is significantly more effective.
QUESTION 33
In the cereal toy paired test (d̄ = 1.857, s_d = 2.445, n = 14), what is the test statistic?
- A. t = 0.76
- B. t = 1.86
- C. t = 2.84
- D. t = 4.12
Answer: C (t ≈ 2.84). t = d̄ / (s_d/√n) = 1.857 / (2.445/√14) = 1.857/0.653 ≈ 2.84.
QUESTION 34
The cereal test p-value is 0.007. How do you interpret this?
- A. There is a 0.7% chance people buy more cereal with a toy coupon
- B. If the toy has no effect, there is a 0.7% chance of seeing a mean difference this large or larger by chance
- C. The toy coupon increases sales by 0.7%
- D. H₀ is true with 99.3% probability
Answer: B. The p-value is the probability of the observed result (or more extreme) assuming H₀ is true (no effect). It is NOT the probability H₀ is true.
QUESTION 35
Why can't you use a two-sample t-test for the SAT prep data?
- A. The sample size is too small
- B. The scores are not Normal
- C. The "before" and "after" scores come from the same students, so they're paired, not independent
- D. SAT scores can't be used in t-tests
Answer: C. The same students were measured before and after. The scores are not independent; the paired t-test is required to properly account for the within-student correlation.
QUESTION 36
If the two-sample t-test for cholesterol gave p = 0.0002, what is the correct interpretation?
- A. The new drug works 99.98% of the time
- B. If both drugs were equally effective, there is a 0.02% chance of observing this large a difference by chance
- C. H₀ is false with probability 0.9998
- D. The new drug reduces cholesterol by 0.02% more
Answer: B. p-value interpretation: assuming H₀ is true (equal means), the probability of observing a difference this extreme is 0.0002. That is extremely unlikely, so the data give strong evidence against H₀.
QUESTION 37
A Type I error in the cereal toy study would mean:
- A. Concluding the toy increases sales when it actually doesn't
- B. Concluding the toy has no effect when it actually does
- C. Using paired instead of two-sample procedure
- D. Having too few subjects in the study
Answer: A. Type I error = rejecting H₀ when it is true = concluding the toy increases sales (Hₐ) when in reality μ_d = 0 (H₀ true).
QUESTION 38
For a two-sample test with n₁ = n₂ = 50, which change would most increase power?
- A. Decrease α from 0.05 to 0.01
- B. Use a two-sided instead of one-sided test
- C. Increase each sample to n = 100
- D. Use the conservative df instead of Welch df
Answer: C. Doubling each sample size reduces the SE, making the test statistic larger for the same true difference and raising power. The other options either decrease power or have minimal effect.
QUESTION 39
In a two-sample t-test, the Random condition requires:
- A. Only one sample to be randomly selected
- B. Both samples to be random, or subjects randomly assigned to treatments
- C. The samples to be the same size
- D. The populations to be Normally distributed
Answer: B. Both groups must come from random samples or a randomized experiment. Random assignment (as in the cholesterol study) satisfies this condition.
QUESTION 40
A researcher finds p = 0.048 with a very large sample (n = 10,000). The best conclusion is:
- A. The effect is large and practically important since p < 0.05
- B. The result is statistically significant but the effect size may be trivially small
- C. The result is not significant at α = 0.05
- D. A larger sample is needed to confirm
Answer: B. With huge samples, even tiny differences become statistically significant. Always consider effect size alongside p-value. Statistical significance ≠ practical importance.