We will provide the .qmd starter file; you just fill in the blanks.
Note that submitting the .qmd is not required. We will grade whatever format you submit, but using the .qmd may be the simplest way to submit all your work.
It is a great option for keeping your code and your writing in a single, organized document.
Simplified Workflow
There is no need to code in R and then manually paste plots into Word. Quarto embeds them automatically.
Automatic Updates: If you fix a mistake in your code, your tables and plots update instantly when you “Render.”
Without Quarto, you would have to manually copy and paste your code all over again. Worse, you might forget to re-copy either the code or a plot and end up with a final product that contains mistakes.
Professional Math: Easily format biostatistical formulas like \(P(A \cup B)\) or \(\bar{x}\).
Research & Career Skills
Reproducibility: Anyone with access to your document could reproduce what you did (including your future self).
Professional Grade: Create clean reports that look great for lab rotations or job applications.
Practice
Please download the practice.qmd file and follow along with your instructor.
Power: the probability of correctly rejecting the null hypothesis (that is, rejecting it when it is actually false)
Important: What three things does the power of a study depend on?
Sample Size
Variability
Effect Size
We can calculate power, sample size, and effect size in R.
Tip: With your instructor, take a look at the Functions document for how to use power.t.test
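To see how each of the three ingredients moves power, here is a minimal sketch using power.t.test. The baseline values (n = 15, delta = 3, sd = 5) and the helper name get_power are illustrative assumptions, chosen only to make the comparison visible.

```r
# Helper (hypothetical name): power of a paired t-test for given inputs.
get_power <- function(n, delta, sd) {
  power.t.test(n = n, delta = delta, sd = sd, type = "paired")$power
}

# Change one ingredient at a time, holding the others at the baseline.
baseline      <- get_power(n = 15, delta = 3, sd = 5)
larger_n      <- get_power(n = 30, delta = 3, sd = 5)  # more pairs
smaller_sd    <- get_power(n = 15, delta = 3, sd = 3)  # less variability
larger_effect <- get_power(n = 15, delta = 5, sd = 5)  # bigger effect

round(c(baseline = baseline, larger_n = larger_n,
        smaller_sd = smaller_sd, larger_effect = larger_effect), 3)
```

Each of the three changes should give a power above the baseline: a larger sample, less variability, and a larger effect all make a true difference easier to detect.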
Exercise 1: Find n
We are designing a crossover study to test a new treatment on rhinos with sleep apnea. We expect the treatment to improve sleep by 6 hours on average, with standard deviation of sleep being 4 hours, and we want a power of .99.
How many patients would we need in our study to achieve a power of .99?
Answer
crossover study means this is paired data
delta is the expected amount of improvement
# n is not specified
power.t.test(delta = 6, power = .99, sd = 4, type = "paired")
Paired t test power calculation
n = 10.34334
delta = 6
sd = 4
sig.level = 0.05
power = 0.99
alternative = two.sided
NOTE: n is number of *pairs*, sd is std.dev. of *differences* within pairs
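The results give us the estimated number of pairs of measurements (one pair per rhino in a crossover design) needed to achieve a power of at least .99, given that the parameters are accurate. Because n = 10.34 is not a whole number, we round up: we need at least 11 pairs.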
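The rounding step can also be done in code. This is a small sketch, assuming we store the result of power.t.test in an object (the names result, n_pairs, and achieved are our own) and pull out the n component.

```r
# Solve for n (left unspecified), using the values from Exercise 1.
result <- power.t.test(delta = 6, power = 0.99, sd = 4, type = "paired")

result$n                       # fractional number of pairs
n_pairs <- ceiling(result$n)   # always round UP to a whole number of pairs
n_pairs

# Check: the power actually achieved with the rounded-up sample size.
achieved <- power.t.test(n = n_pairs, delta = 6, sd = 4, type = "paired")$power
achieved
```

We round up rather than to the nearest integer because rounding down would leave the study slightly below the target power.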
Why do you think it only takes 11 pairs to achieve a power of 0.99?
Answer
This is because the expected mean difference (delta = 6) is very large relative to the variability (sd = 4): it is 1.5 standard deviations away from 0.
Exercise 2: Find power
Now let's say we have a sample size of 20 rhinos in this study. What would the power be?
Answer
Change the code from before slightly by excluding power and replacing it with n=20.
power.t.test(delta = 6, n = 20, sd = 4, type = "paired")
Paired t test power calculation
n = 20
delta = 6
sd = 4
sig.level = 0.05
power = 0.9999941
alternative = two.sided
NOTE: n is number of *pairs*, sd is std.dev. of *differences* within pairs
Practice
1 Power
Find the power of a paired study where the expected difference in the means is 3, the expected standard deviation of this difference is 5, and the sample size is 15.
Answer
power.t.test(delta = 3, n = 15, sd = 5, type = "paired")
Paired t test power calculation
n = 15
delta = 3
sd = 5
sig.level = 0.05
power = 0.5804097
alternative = two.sided
NOTE: n is number of *pairs*, sd is std.dev. of *differences* within pairs
The power is approximately 58%.
2 One sample t-test
In the lipids dataset, the variable TRG is continuous. Suppose the national average for TRG is said to be 120. Is the mean TRG in our data significantly different from the national average?
Answer
lipids <- read.delim('https://raw.githubusercontent.com/IowaBiostat/data-sets/main/lipids/lipids.txt')
t.test(lipids$TRG, mu = 120)
One Sample t-test
data: lipids$TRG
t = -2.4733, df = 3025, p-value = 0.01344
alternative hypothesis: true mean is not equal to 120
95 percent confidence interval:
114.5234 119.3669
sample estimates:
mean of x
116.9451
What is the p-value here? What does it indicate?
What is the confidence interval for our estimate?
How do the p-value and confidence interval lead us to the same conclusion?
Sample interpretation:
There is moderately strong evidence that the true average triglyceride level differs from the reported national average (\(p=0.01344\)). Based on these data, plausible values for the true average triglyceride level are between 114.5 and 119.4.
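The agreement between the p-value and the confidence interval can be checked directly from the object that t.test returns. The sketch below uses a small vector of made-up values (not the lipids data), and the variable names are our own.

```r
# Hypothetical measurements, invented for illustration (NOT the lipids data).
x   <- c(112, 118, 109, 121, 115, 117, 110, 114, 119, 113)
mu0 <- 120   # hypothesized mean

res <- t.test(x, mu = mu0)

# Decision based on the p-value (two-sided test, 5% level).
p_below_alpha <- res$p.value < 0.05

# Decision based on the 95% confidence interval: is mu0 outside it?
mu0_outside_ci <- mu0 < res$conf.int[1] || mu0 > res$conf.int[2]

# The two decisions agree.
p_below_alpha == mu0_outside_ci
```

For a two-sided test at the 5% level, the p-value falls below 0.05 exactly when the hypothesized mean lies outside the 95% confidence interval, which is why both lead to the same conclusion.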
3 Paired Sample t-test
Using the anorexia dataset, determine whether the patients’ weights improved post-treatment compared to pre-treatment.
Here’s some code to get you started with one preliminary step:
anorexia_alltreat <- read.delim("https://raw.githubusercontent.com/IowaBiostat/data-sets/main/anorexia/anorexia.txt")
# We need to index to just get the weights for those who received family treatment.
anorexia <- anorexia_alltreat[anorexia_alltreat$Treat == "FT", ]
Answer
Manual Calculation
First, we want to calculate the difference in weight for each patient.
# create a new column
anorexia$diff <- anorexia$Postwt - anorexia$Prewt
Take a look at the new column “diff”. Are most of the values positive or negative? What does this indicate?
Now complete the calculation.
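# standard error of the mean difference (17 pairs)
SE <- sd(anorexia$diff) / sqrt(17)
# test statistic under the null hypothesis of a mean difference of 0
t <- (mean(anorexia$diff) - 0) / SE
# two-sided p-value with df = 17 - 1 (this form works because t is positive here)
2 * (1 - pt(t, df = 16))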
[1] 0.0007002531
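Written as formulas, the manual calculation above takes the following form. The symbols are our own notation: \(\bar{d}\) is the mean of the differences, \(s_d\) is their standard deviation, \(n = 17\) is the number of pairs, and \(F_{n-1}\) is the cumulative distribution function of the t distribution with \(n-1\) degrees of freedom (what pt computes).

```latex
\[
SE = \frac{s_d}{\sqrt{n}}, \qquad
t = \frac{\bar{d} - 0}{SE}, \qquad
p = 2\,\bigl(1 - F_{n-1}(|t|)\bigr)
\]
```

The absolute value makes the two-sided p-value formula work for negative test statistics as well. The R code omits it only because t is positive in this example.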
A “paired t-test” does the same thing as the one sample t-test above; however, it does it on the differences between the paired observations. To run a paired t-test in R, we can use the code below:
Using t.test
## Test for Post - Prior
t.test(anorexia$Postwt, anorexia$Prewt, paired = TRUE)
Paired t-test
data: anorexia$Postwt and anorexia$Prewt
t = 4.1849, df = 16, p-value = 0.0007003
alternative hypothesis: true mean difference is not equal to 0
95 percent confidence interval:
3.58470 10.94471
sample estimates:
mean difference
7.264706
4 Concept
What is different between the procedures for a paired t-test and a one sample t-test?
Answer
There is no difference in the procedures.
A “paired t-test” does the same thing as the one sample t-test above. The two are distinguished only by the origin of the data: a paired test is applied to the within-pair differences, while a “one sample” test is applied to a single set of observations that has no pairs.
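This equivalence can be verified in R. The sketch below uses a small set of made-up before/after values (not the anorexia data) and runs the test both ways; all variable names are our own.

```r
# Hypothetical before/after measurements, invented for illustration.
before <- c(70, 72, 68, 75, 71, 69)
after  <- c(74, 73, 71, 80, 70, 75)

# Paired t-test on the two columns.
paired <- t.test(after, before, paired = TRUE)

# One sample t-test on the within-pair differences, testing against 0.
one_sample <- t.test(after - before, mu = 0)

# Same test statistic and same p-value.
all.equal(unname(paired$statistic), unname(one_sample$statistic))
all.equal(paired$p.value, one_sample$p.value)
```

Both comparisons return TRUE: once the differences are computed, the paired test is a one sample test on those differences.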