Lab 10

Published

March 31, 2026


Objectives

  • Introduction to Quarto

  • Calculating Power or Sample Size

Quarto

Why Quarto?

  1. Your Assessment

    • We will provide the .qmd starter file; you just fill in the blanks.

    • Note that submitting the .qmd is not required. We will grade whatever format you submit, but using the .qmd may be the simplest way to submit all your work.

    • Great option to keep your code and your writing in a single, organized document.

  2. Simplified Workflow

    • There is no need to code in R and then manually paste plots into Word. Quarto embeds them automatically.

    • Automatic Updates: If you fix a mistake in your code, your tables and plots update instantly when you “Render.”

      • Without Quarto, you would have to manually copy and paste everything all over again or, worse, forget to re-copy a piece of code or a plot and end up with a final product that contains mistakes.
    • Professional Math: Easily format biostatistical formulas like \(P(A \cup B)\) or \(\bar{x}\).

  3. Research & Career Skills

    • Reproducibility: Anyone with access to your document could reproduce what you did (including your future self).

    • Professional Grade: Create clean reports that look great for lab rotations or job applications.

Practice

Please download the practice.qmd file and follow along with your instructor.


Quarto Highlights

  1. YAML Header

    Title, author, format, date

  2. Markdown Formatting

    Headers/subheaders: help keep you organized

    Bullet points

    Basic text formatting: bold, italics

  3. Code Chunks

    Code

    Plots

Power and Sample Size

Power: the probability of correctly rejecting the null hypothesis (that is, rejecting it when it is false). Power depends on three things:

  1. Sample Size

  2. Variability

  3. Effect Size
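To see how each of these three factors moves power, here is a minimal sketch using power.t.test for a paired design. All of the numbers (20 pairs, a difference of 2, a standard deviation of 4) are made up for illustration and do not come from a real study.

```r
# Hypothetical baseline: 20 pairs, mean difference of 2, sd of differences 4
baseline <- power.t.test(n = 20, delta = 2, sd = 4, type = "paired")$power

# 1. Sample size: doubling the number of pairs raises power
larger_n <- power.t.test(n = 40, delta = 2, sd = 4, type = "paired")$power

# 2. Variability: doubling the standard deviation lowers power
larger_sd <- power.t.test(n = 20, delta = 2, sd = 8, type = "paired")$power

# 3. Effect size: doubling the expected difference raises power
larger_delta <- power.t.test(n = 20, delta = 4, sd = 4, type = "paired")$power

round(c(baseline = baseline, larger_n = larger_n,
        larger_sd = larger_sd, larger_delta = larger_delta), 3)
```

Changing one input at a time while holding the others fixed makes the direction of each effect easy to read off.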

We can calculate power, sample size, and effect size in R.

Tip: With your instructor, take a look at the Functions document for how to use power.t.test

Exercise 1: Find n

We are designing a crossover study to test a new treatment on rhinos with sleep apnea. We expect the treatment to improve sleep by 6 hours on average, with the standard deviation of sleep being 4 hours, and we want a power of 0.99.

How many rhinos would we need in our study to achieve a power of 0.99?

  • crossover study means this is paired data

  • delta is the expected amount of improvement

# n is not specified, so power.t.test() solves for it
power.t.test(delta = 6,
             power = .99, 
             sd = 4,
             type = "paired")

     Paired t test power calculation 

              n = 10.34334
          delta = 6
             sd = 4
      sig.level = 0.05
          power = 0.99
    alternative = two.sided

NOTE: n is number of *pairs*, sd is std.dev. of *differences* within pairs

The result is the estimated number of pairs of observations (one pair per rhino in a crossover study) needed to achieve a power of at least 0.99, assuming the parameters are accurate. Because n = 10.34 must be rounded up, we need at least 11 pairs.
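The rounding step can also be done in R. This is a small sketch, not part of the original lab code: it stores the power.t.test result, rounds n up with ceiling(), and then checks the power actually achieved at that whole number of pairs.

```r
# Same inputs as Exercise 1; n is left out so it is solved for
result <- power.t.test(delta = 6, power = 0.99, sd = 4, type = "paired")

# Round up: a fractional pair is not possible, and rounding down
# would leave power slightly below the 0.99 target
n_pairs <- ceiling(result$n)
n_pairs

# Power actually achieved with the rounded-up number of pairs
achieved_power <- power.t.test(n = n_pairs, delta = 6, sd = 4, type = "paired")$power
achieved_power
```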

Why do you think it only takes 11 pairs to achieve a power of 0.99?

Answer

This is because the expected difference (delta = 6) is very large relative to the variability (sd = 4): it is 1.5 standard deviations away from 0.
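One way to make this concrete is to express delta in standard deviation units and compare against a smaller effect. The sketch below is illustrative only; the alternative delta of 2 hours is a hypothetical value chosen for contrast, not something from the exercise.

```r
delta <- 6    # expected improvement from Exercise 1
sd_diff <- 4  # standard deviation from Exercise 1

# Expected difference in standard deviation units
delta / sd_diff

# Hypothetical comparison: a smaller expected difference of 2 hours
# (0.5 standard deviations) with the same sd and target power
n_large_effect <- power.t.test(delta = 6, power = 0.99, sd = 4, type = "paired")$n
n_small_effect <- power.t.test(delta = 2, power = 0.99, sd = 4, type = "paired")$n

# Required pairs, rounded up, for each scenario
ceiling(c(large_effect = n_large_effect, small_effect = n_small_effect))
```

The smaller effect needs many more pairs to reach the same power.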

Exercise 2: Find power

Now let's say we have a sample size of 20 rhinos in this study. What would the power be?

Change the code from before slightly by removing power and adding n = 20 in its place.

power.t.test(delta = 6,
             n = 20, 
             sd = 4, 
             type = "paired")

     Paired t test power calculation 

              n = 20
          delta = 6
             sd = 4
      sig.level = 0.05
          power = 0.9999941
    alternative = two.sided

NOTE: n is number of *pairs*, sd is std.dev. of *differences* within pairs
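To see how quickly power climbs toward 1 in this scenario, the calculation can be repeated over several sample sizes. This is a sketch under the exercise's assumptions (delta = 6, sd = 4); the particular values in n_values are arbitrary choices for illustration.

```r
# Candidate numbers of pairs (hypothetical choices for illustration)
n_values <- c(4, 6, 8, 10, 15, 20)

# Power at each n, keeping delta = 6 and sd = 4 from the exercise
powers <- sapply(n_values, function(n) {
  power.t.test(n = n, delta = 6, sd = 4, type = "paired")$power
})

data.frame(n = n_values, power = round(powers, 4))
```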

Practice

1 Power

Find the power of a paired study where the expected difference in the means is 3, the expected standard deviation of this difference is 5, and the sample size is 15.

Answer
power.t.test(delta = 3,
             n = 15,
             sd = 5, 
             type = "paired")

     Paired t test power calculation 

              n = 15
          delta = 3
             sd = 5
      sig.level = 0.05
          power = 0.5804097
    alternative = two.sided

NOTE: n is number of *pairs*, sd is std.dev. of *differences* within pairs

The power is approximately 58%.
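power.t.test can also solve for the effect size: leave out delta and supply power instead. In the sketch below, the target power of 0.80 is an assumed, commonly used value and is not part of the practice problem.

```r
# Smallest mean difference detectable with 15 pairs and sd = 5,
# assuming a target power of 0.80 (an illustrative choice)
result <- power.t.test(n = 15, sd = 5, power = 0.80, type = "paired")
result$delta

# Plugging that delta back in should recover a power of about 0.80
check_power <- power.t.test(n = 15, delta = result$delta, sd = 5, type = "paired")$power
check_power
```

Since a difference of 3 only gives about 58% power here, the difference needed for 80% power has to be larger than 3.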

2 One sample t-test

In the lipids dataset, the variable TRG is continuous. Suppose the national average for TRG is said to be 120. Is the average TRG in our data significantly different from the national average?

Answer
lipids <- read.delim('https://raw.githubusercontent.com/IowaBiostat/data-sets/main/lipids/lipids.txt')

t.test(lipids$TRG, mu = 120)

    One Sample t-test

data:  lipids$TRG
t = -2.4733, df = 3025, p-value = 0.01344
alternative hypothesis: true mean is not equal to 120
95 percent confidence interval:
 114.5234 119.3669
sample estimates:
mean of x 
 116.9451 

  • What is the p-value here? What does it indicate?

  • What is the confidence interval for our estimate?

  • How do the p-value and confidence interval lead us to the same conclusion?

Sample interpretation:

There is moderately strong evidence that the true average triglyceride level differs from the reported national average of 120 (\(p = 0.013\)). Based on these data, plausible values for the true average triglyceride level are between 114.5 and 119.4 (the 95% confidence interval), all of which fall below 120.
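The link between the p-value and the confidence interval can be checked directly from the object that t.test returns. The sketch below uses simulated stand-in data (not the lipids data) so that it runs without a download; the sample size, mean, and standard deviation passed to rnorm are arbitrary assumptions.

```r
set.seed(1)
# Simulated stand-in data (NOT the lipids data): 50 values centered near 115
x <- rnorm(50, mean = 115, sd = 20)

res <- t.test(x, mu = 120)

# Decision from the p-value at the 0.05 level
reject_by_p <- res$p.value < 0.05

# Decision from the 95% confidence interval: is 120 outside it?
ci <- res$conf.int
reject_by_ci <- 120 < ci[1] || 120 > ci[2]

c(p_value = res$p.value, lower = ci[1], upper = ci[2])

# The two decisions agree: p < 0.05 exactly when 120 is outside the 95% CI
reject_by_p == reject_by_ci
```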

3 Paired Sample t-test

Using the anorexia dataset, determine whether patients’ weights increased from pre-treatment to post-treatment.

Here’s some code to get you started with one preliminary step:

anorexia_alltreat <- read.delim("https://raw.githubusercontent.com/IowaBiostat/data-sets/main/anorexia/anorexia.txt")
# Subset to keep only the patients who received family treatment (FT).
anorexia <- anorexia_alltreat[anorexia_alltreat$Treat == "FT",]
Answer

Manual Calculation

First, we want to calculate the difference in weight for each patient.

# create a new column
anorexia$diff <- anorexia$Postwt-anorexia$Prewt

Take a look at the new column “diff”. Are most of the values positive or negative? What does this indicate?

Now complete the calculation.

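n <- nrow(anorexia)                 # number of pairs (17 patients in the FT group)
SE <- sd(anorexia$diff) / sqrt(n)   # standard error of the mean difference

# named t_stat so that we do not mask base R's t() function
t_stat <- (mean(anorexia$diff) - 0) / SE

# two-sided p-value; abs() keeps this correct if t_stat is negative
2 * pt(abs(t_stat), df = n - 1, lower.tail = FALSE)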
[1] 0.0007002531

A “paired t-test” does the same thing as the one-sample t-test above, except that it is applied to the differences between the paired observations. To run a paired t-test in R, use the code below:

Using t.test

## Test for Post - Pre
t.test(anorexia$Postwt, anorexia$Prewt, paired = TRUE)

    Paired t-test

data:  anorexia$Postwt and anorexia$Prewt
t = 4.1849, df = 16, p-value = 0.0007003
alternative hypothesis: true mean difference is not equal to 0
95 percent confidence interval:
  3.58470 10.94471
sample estimates:
mean difference 
       7.264706 

4 Concept

What is different between the procedures for a paired t-test and a one-sample t-test?

Answer

There is no difference in the procedures.

A “paired t-test” does the same thing as the one-sample t-test above. The two are distinguished only by where the data come from: with paired data we first take the within-pair differences and then test those, whereas one-sample data have no pairs to difference.
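This equivalence can be checked in R by running both versions on the same data. The sketch below uses made-up before/after values for six hypothetical subjects (not the anorexia data); the vector names pre and post are illustrative.

```r
# Hypothetical before/after measurements for 6 subjects (made-up numbers)
pre  <- c(80.5, 84.9, 81.5, 82.6, 79.9, 88.7)
post <- c(82.2, 85.6, 81.4, 81.9, 76.4, 103.6)

# Paired t-test on the two columns
paired_result <- t.test(post, pre, paired = TRUE)

# One-sample t-test on the within-pair differences
one_sample_result <- t.test(post - pre, mu = 0)

# Same t statistic and same p-value from both approaches
c(paired = unname(paired_result$statistic),
  one_sample = unname(one_sample_result$statistic))
c(paired = paired_result$p.value,
  one_sample = one_sample_result$p.value)
```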