Lab 3

Published

February 3, 2026

Objectives

  1. Review the relationship between hypothesis tests and confidence intervals

  2. Practice conducting a hypothesis test

  3. Practice interpreting confidence intervals

Review

Recall that one of our objectives in this course is to help you learn how to think statistically. Consider with your instructor the following questions:

One reason that rigor is so important is to prevent harm to an individual or to society as a whole. Scientific learning is messy by nature as we try to understand the world around us, but taking greater care to ensure our conclusions are reasonable provides a safeguard for everyone.

“the probability of obtaining results as extreme or more extreme than the one observed in the sample, given that the null hypothesis is true.” -Lecture Slides Hypothesis Tests slide 6

Remember: p-values do not assess the design of a study
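
To make the quoted definition concrete, here is a minimal sketch that computes an exact two-sided p-value for a coin-flip experiment. The scenario (60 heads in 100 flips, null hypothesis that the coin is fair) and the function name are illustrative assumptions, not taken from the lecture slides.

```python
from math import comb


def binomial_two_sided_p_value(successes: int, n: int, p_null: float = 0.5) -> float:
    """Exact two-sided p-value for a binomial count under H0: p = p_null.

    Adds up the null probabilities of every outcome that is at least as
    unlikely as the observed one ("as extreme or more extreme").
    """
    def pmf(k: int) -> float:
        return comb(n, k) * p_null**k * (1 - p_null) ** (n - k)

    observed = pmf(successes)
    # Small tolerance so outcomes tied with the observed one are counted.
    return sum(pmf(k) for k in range(n + 1) if pmf(k) <= observed * (1 + 1e-9))


# Hypothetical data: 60 heads in 100 flips, H0: the coin is fair.
p_value = binomial_two_sided_p_value(60, 100)
print(f"p-value = {p_value:.4f}")
```

The result is a statement about the data assuming the null hypothesis is true. It is not the probability that the null hypothesis is true, and it says nothing about how well the flips were recorded or how the coin was chosen.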

Here are some ideas presented in lecture:

  • insisting on a 5% cutoff

  • misinterpreting the p-value

  • concluding that a high p-value means that the Null hypothesis is probably true

  • reading too far into the term “statistically significant”

  • Type I: Rejecting the Null hypothesis in a situation where it is true.

  • Type II: Failing to reject the Null Hypothesis in a situation where it is false.

-Lecture Slides Hypothesis Tests slide 15
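
The two error types can be illustrated with a small simulation. This is a sketch under simplifying assumptions that are not part of the lecture material: the data are normal, the standard deviation is known (so a z-test is used), and the values of `true_mu`, `n`, and `trials` are arbitrary choices.

```python
import random
from statistics import NormalDist, mean


def rejects_null(sample, mu_null, sigma, alpha=0.05):
    """Two-sided z-test with known sigma: True if H0: mu = mu_null is rejected."""
    z = (mean(sample) - mu_null) / (sigma / len(sample) ** 0.5)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha


def rejection_rate(true_mu, mu_null=0.0, sigma=1.0, n=25, trials=4000, seed=1):
    """Fraction of simulated studies in which H0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        rejections += rejects_null(sample, mu_null, sigma)
    return rejections / trials


# H0 is true here, so every rejection is a Type I error.
type_i_rate = rejection_rate(true_mu=0.0)
# H0 is false here, so every failure to reject is a Type II error.
type_ii_rate = 1 - rejection_rate(true_mu=0.3)

print(f"Type I error rate:  {type_i_rate:.3f}")
print(f"Type II error rate: {type_ii_rate:.3f}")
```

The Type I error rate should land close to the chosen significance level of 0.05. The Type II error rate depends on how far the true mean is from the null value and on the sample size.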

Hypothesis Testing and Confidence Intervals

There is a close relationship between confidence intervals and hypothesis testing. All values within a constructed 95% interval are considered “plausible” values for the parameter that we are estimating. Values outside the interval are rejected as implausible.

Confidence Intervals

If you were to repeat the process of creating a confidence interval an infinite number of times, 95% of the interval estimates for the parameter of interest would contain the true parameter value. In this class, we are almost always interested in the true mean value, \(\mu\). We treat the population mean \(\mu\) as being fixed. Any particular interval may or may not contain the true population mean \(\mu\).

  • We say that we are “95% confident” that the interval contains the true population mean \(\mu\) because the procedure used to construct this interval produces a correct interval estimate 95% of the time.

  • We DO NOT say there is a 95% probability that \(\mu\) lies between these two values. (\(\mu\) is fixed)
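
The "repeat the process many times" idea can be checked by simulation. The sketch below assumes normal data with a known standard deviation (a z-interval) so that it needs only the Python standard library. The population values `true_mu` and `sigma` are made up for illustration.

```python
import random
from statistics import NormalDist, mean


def z_interval(sample, sigma, confidence=0.95):
    """Confidence interval for the mean when sigma is known."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / len(sample) ** 0.5
    center = mean(sample)
    return center - margin, center + margin


def coverage(true_mu=10.0, sigma=2.0, n=30, trials=4000, seed=3):
    """Share of simulated intervals that contain the fixed true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        low, high = z_interval(sample, sigma)
        hits += low <= true_mu <= high
    return hits / trials


print(f"Share of intervals containing mu: {coverage():.3f}")
```

In the simulation, \(\mu\) never moves. The intervals change from sample to sample, and roughly 95% of them capture it, which is why the 95% describes the procedure and not any single interval.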

Hypothesis Testing

A hypothesis is simply a statement about the truth. Since we don’t know what the truth is (that’s why we are doing a study in the first place), the statement is essentially meaningless–or null–until we have evidence against it.

In class, you learned that there are a lot of wrong ways to think about the hypothesis testing process. The courtroom is a helpful example that illustrates the correct usage of p-values and hypothesis tests. Let’s look at it in terms of “innocent until proven guilty”: As the person analyzing data, you are the judge. The hypothesis test is the trial, and the null hypothesis is the defendant.

If the evidence presented doesn’t prove the defendant is guilty beyond a reasonable doubt, you still have not proved that the defendant is innocent. (We never say that we accept the null hypothesis.)

So how would that verdict be announced? It enters the court record as “Not guilty.” That phrase is perfect: “Not guilty” doesn’t mean the defendant is innocent, because that has not been proven. It just means the prosecution couldn’t prove its case to the necessary, “beyond a reasonable doubt” standard. It failed to convince the judge to abandon the assumption of innocence.

If you follow that rationale, then you can see that “failure to reject the null” is just the statistical equivalent of “not guilty.” In a trial, the burden of proof falls to the prosecution. When analyzing data, the entire burden of proof falls to the sample data you’ve collected. This is why our sampling procedure is so important. Just as “not guilty” is not the same thing as “innocent,” neither is “failing to reject” the same as “accepting” the null hypothesis.

This method of thinking about hypothesis tests will come in handy when we start formally testing our own hypotheses.

Source: http://blog.minitab.com/blog/understanding-statistics/things-statisticians-say-failure-to-reject-the-null-hypothesis

Relationship between Confidence Intervals & Hypothesis Testing

If the value of the parameter specified by the null hypothesis (for instance, \(H_0: \mu = 0\)) is contained within the 95% interval, then the null hypothesis cannot be rejected at the 0.05 level. If the value specified by the null hypothesis is not in the interval, then the null hypothesis can be rejected at the 0.05 level. Likewise, for a 99% confidence interval, if the value specified by the null hypothesis is in the interval, then the null hypothesis cannot be rejected at the 0.01 level.
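
The agreement between the two approaches can be seen by computing both from the same summary statistics. This sketch assumes a two-sided z-test with a known standard deviation, which may differ from the procedure used in lecture. The sample mean, `sigma`, and `n` are hypothetical numbers.

```python
from statistics import NormalDist


def z_test_and_interval(sample_mean, sigma, n, mu_null, alpha=0.05):
    """Return the two-sided p-value and the matching (1 - alpha) interval."""
    se = sigma / n ** 0.5
    z = (sample_mean - mu_null) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    interval = (sample_mean - z_star * se, sample_mean + z_star * se)
    return p_value, interval


# Same data, three different null values.
for mu_null in (0.0, 0.5, 1.0):
    p_value, (low, high) = z_test_and_interval(
        sample_mean=0.9, sigma=2.0, n=40, mu_null=mu_null
    )
    inside = low <= mu_null <= high
    print(
        f"mu_null={mu_null}: p={p_value:.3f}, "
        f"interval=({low:.2f}, {high:.2f}), "
        f"inside={inside}, reject={p_value < 0.05}"
    )
```

The interval does not depend on the null value, only the p-value does. For each null value, the test rejects at the 0.05 level exactly when that value falls outside the 95% interval.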

Important

While we reach the same conclusion about a study whether we test a hypothesis or construct a confidence interval, you will still need to know how to do both for future homework assignments and quizzes.

Practice Problems

In lab last week we worked with the titanic data set. Today we want to know whether sex played a significant role in the survival rates of the passengers on board. Therefore, we want to compare survival rates between males and females.

  1. Define the null hypothesis for this study on the ‘titanic’ data set.

  2. Suppose, for example, that we have the null hypothesis \(H_0: \mu_{female} = 0.5\) and obtain a 95% confidence interval of (0.415, 0.481). Remember that the interpretation of this confidence interval is that we are 95% confident that the true population mean \(\mu\) lies within this interval. Would we reject or retain the null hypothesis?

True or False?

  1. Suppose we conduct a hypothesis test and calculate a p-value of 0.03. The 95% confidence interval would contain the value specified by the null hypothesis.

  2. Suppose we conduct a hypothesis test and calculate a p-value of 0.56. The 99% confidence interval would contain the value specified by the null hypothesis.

  3. Suppose a 95% confidence interval does not contain the value specified by the null hypothesis. The p-value would then be above 0.05.

Multiple Choice

Choose the interpretation that best fits each scenario. For problems 7 and 8 write the null hypothesis. 1

  1. A researcher investigates whether a new biodegradable filter reduces microplastic concentrations in seawater compared to the standard mesh filter. The standard filter averages 50 particles per liter.

    \(H_0\) : The new filter does not reduce microplastic concentration (\(\mu \ge 50\)).

    Results: After 30 trials, the new filter had a mean of 42 particles per liter with a p-value of 0.03.

    A. There is a 3% chance that the null hypothesis is true, so we should reject it and adopt the new filter.

    B. If the new filter actually performs the same as the old one, there is only a 3% chance of seeing a result this low (or lower) due to random sampling error. We have sufficient evidence to support the new filter’s effectiveness.

    C. The p-value of 0.03 proves that the new filter is 97% more effective at removing plastics than the standard mesh filter.

  2. A study examines whether “Productivity App X” changes the average time it takes for students to complete a standardized task. A researcher conducts a study in which 50 students do not use the app to complete the task and 50 students do use the app to complete the task. The researcher is interested in whether the average time it takes to complete the task differs between the groups.

    Results: The non-app users took an average of 23.45 minutes to complete the task and the app users took an average of 25.13 minutes. The p-value calculated from the test was 0.45.

    A. Because the p-value (0.45) is greater than 0.05, there is not enough evidence to conclude that the app affects completion time.

    B. The high p-value proves that the app has no effect on students and that the null hypothesis is definitely true.

    C. There is a 45% chance that the app actually works, but we need a larger sample size to make the p-value smaller.

  3. A marine biologist studies whether boat traffic affects the average dive duration of narwhals. The researcher collects data from 50 narwhals in low-traffic waters and 50 narwhals in high-traffic waters. The researcher is interested in whether the average dive time differs between the two groups.

    Results:

    • Low-traffic group mean dive time: 21.6 minutes

    • High-traffic group mean dive time: 22.1 minutes

    • p-value = 0.45

    A. There is a 45% chance that boat traffic affects dive time, but a larger sample size would make the p-value smaller.

    B. The high p-value proves that boat traffic has no effect on narwhal dive duration.

    C. Because the p-value (0.45) is greater than 0.05, there is not enough evidence to conclude that boat traffic affects dive time.

    D. Since the means are different, boat traffic must affect dive time even though the p-value is large.

  4. A sports scientist investigates whether Alex Honnold’s free-solo training method changes the average climbing completion time compared to the historical average of 75 minutes.

    Results: A sample of 25 climbers trained using Honnold’s method produced a 95% confidence interval for the mean completion time of: (72.8, 77.4).

    A. There is a 95% probability that the true mean completion time lies between 72.8 and 77.4 minutes.

    B. Because 75 minutes is inside the interval, there is not enough evidence to conclude that the training method changes completion time.

    C. The interval proves that the training method reduces completion time by at least 2.2 minutes.

    D. There is a 95% chance that every climber trained with this method finishes between 72.8 and 77.4 minutes.

Solutions

Problem 1

\(H_0\): average survival rate for females = average survival rate for males

Problem 2

Reject the null hypothesis

Problem 3

False

Problem 4

True

Problem 5

False

Problem 6

Answer: B. Discuss with your neighbor why this is the correct answer.

Problem 7

Answer: A. Discuss with your neighbor why this is the correct answer.

\(H_0: \mu_{1} = \mu_{2}\)

Problem 8

Answer: C. Discuss with your neighbor why this is the correct answer.

\(H_0: \mu_{high} = \mu_{low}\)

Problem 9

Answer: B. Discuss with your neighbor why this is the correct answer.

Footnotes

  1. Examples generated using Google’s Gemini 3 AI engine and modified to align with course expectations.