Load Survival Package
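# automatically install the package if it doesn't exist on your machine,
# then load the library
if (!require(survival)) {
  install.packages("survival")
  library(survival)
} else {
  library(survival)
}

Define \(n(t)\) as the number of subjects at risk for the event in the study at time \(t\) and \(d(t)\) as the number of events that occur at time \(t\). The Kaplan-Meier estimate of the survival function at time \(t\) is the product, over all event times \(t_i\) up to \(t\), of the fraction of at-risk subjects who survive each event time: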
\[ \hat S(t) = \prod_{i:\, t_i \le t} \frac{n(t_i) - d(t_i)}{n(t_i)} \]
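The product formula can be sketched directly in base R. This is only an illustration: `km_by_hand` is a hypothetical helper (not part of the survival package), and the six-subject data set is made up to show how censored observations leave the risk set without counting as events.

```r
# Hypothetical helper: Kaplan-Meier estimate computed from the product formula.
# time  = follow-up time for each subject
# event = TRUE if the event was observed, FALSE if the subject was censored
km_by_hand <- function(time, event) {
  event_times <- sort(unique(time[event]))                        # distinct event times t_i
  n_at_risk <- sapply(event_times, function(t) sum(time >= t))    # n(t_i)
  n_events  <- sapply(event_times, function(t) sum(time == t & event))  # d(t_i)
  data.frame(time = event_times,
             n    = n_at_risk,
             d    = n_events,
             surv = cumprod((n_at_risk - n_events) / n_at_risk))  # running product
}

# Made-up toy data: 6 subjects, two of them censored (at times 3 and 7)
toy_time  <- c(2, 3, 3, 5, 7, 8)
toy_event <- c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE)

km <- km_by_hand(toy_time, toy_event)
print(km)
```

With the survival package loaded, `summary(survfit(Surv(toy_time, toy_event) ~ 1))` should report the same survival column, which makes a convenient cross-check.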
First we will fit a model with no grouping variable to estimate the overall survival function. We do this by:
1. Calculating the response with the Surv function
2. Fitting survfit(S ~ 1), where the 1 indicates that we want to fit without a grouping variable. This calculates the survival curve that we learned how to compute in class.
anemia <- read.delim('https://raw.githubusercontent.com/IowaBiostat/data-sets/main/anemia/anemia.txt')
S <- with(anemia, Surv(Time, Status != 0)) # get response; any nonzero Status counts as an event
fit <- survfit(S ~ 1)

Recall that if we know the event times and the number of subjects at risk at each one, we can calculate the survival probability. For example, here is the conditional probability of surviving each of the first five event times, along with the cumulative product used to estimate the survival curve:
| time | n(t) | d(t) | [n(t)-d(t)]/n(t) | cumulative product |
|---|---|---|---|---|
| 3 | 46 | 1 | 0.9783 | 0.9783 |
| 12 | 45 | 1 | 0.9778 | 0.9565 |
| 25 | 44 | 1 | 0.9773 | 0.9348 |
| 30 | 43 | 1 | 0.9767 | 0.9130 |
| 44 | 42 | 1 | 0.9762 | 0.8913 |
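The arithmetic in the table can be reproduced in a few lines of base R. This sketch simply retypes the n(t) and d(t) columns from the table rather than reading them from the anemia data, so the variable names are illustrative only.

```r
# Values copied from the table above (first five event times)
event_time <- c(3, 12, 25, 30, 44)
n_at_risk  <- c(46, 45, 44, 43, 42)   # n(t)
n_events   <- c(1, 1, 1, 1, 1)        # d(t)

cond_surv <- (n_at_risk - n_events) / n_at_risk   # [n(t) - d(t)] / n(t)
km_surv   <- cumprod(cond_surv)                   # cumulative product

print(round(data.frame(event_time, n_at_risk, n_events, cond_surv, km_surv), 4))
```

Because no subject is censored between these five event times, the cumulative product telescopes: after the fifth event it equals 41/46, the fraction of the original 46 subjects still event-free.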
To plot the entire estimated survival curve, use:
plot(fit, ylab = "Probability", xlab = "Time")

This is the Kaplan-Meier estimate of the survival function, ignoring the different treatment groups.
Now stratifying by group:
fit2 <- with(anemia, survfit(S ~ Trt))
plot(fit2, ylab = "Overall Survival",
     xlab = "Time",
     col = c("red", "blue"))
legend("bottomleft", c("MTX", "MTX + CSP"),
       text.col = c("red", "blue"), bty = "n")

\(H_0:\) The survival curves are equal.
# note: fit2 is reused here and now holds a model formula, not a survfit object
fit2 <- with(anemia, Surv(Time, Status != 0) ~ Trt)
survdiff(fit2)

Call:
survdiff(formula = fit2)

             N Observed Expected (O-E)^2/E (O-E)^2/V
Trt=MTX     24        9     6.45     1.007      2.01
Trt=MTX+CSP 22        4     6.55     0.992      2.01

 Chisq= 2  on 1 degrees of freedom, p= 0.2
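The printed p-value is heavily rounded. As a rough check, it can be recovered from the test statistic in the (O-E)^2/V column, which is compared to a chi-squared distribution with 1 degree of freedom. The sketch below retypes the statistic from the output above (2.01, itself rounded), so the result is approximate.

```r
# Test statistic copied from the (O-E)^2/V column of the survdiff output
chisq_stat <- 2.01
df <- 1   # number of groups minus one

# Upper-tail chi-squared probability
p_value <- pchisq(chisq_stat, df = df, lower.tail = FALSE)

print(p_value)             # unrounded p-value implied by the statistic
print(signif(p_value, 1))  # rounded to one significant digit, as in the output
```

At the conventional 0.05 level this does not give evidence against \(H_0\), even though 9 events were observed in the MTX group compared with 4 in the MTX+CSP group.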
The following points of misunderstanding have come up frequently in the homework throughout the semester:
Read the following case studies and indicate which statistical method might be best for each situation:
In a study of 16 overweight young adults in India, participants were given, in turns, a dose of an extract made from unroasted coffee beans and a placebo, three times a day over 22 weeks. Their diet throughout the study was unchanged, and they were physically active. Between trials, the participants were given a two-week break for their bodies to reset. Though a few participants given the extract only lost 7 pounds, others lost as much as 26 pounds. On average, the subjects lost 17.5 pounds each, and reduced their body weight by 10.5 percent. Body fat also declined by 16 percent, even though the participants were eating an average of 2,400 calories and burning roughly 400.
Answer
Paired t-test
Researchers from Penn State found that increasing the amount of spices in your diet may lower the level of potentially harmful fat in your bloodstream. The experiment compared two groups of healthy, overweight men. One group ate meals seasoned with the special spice blend; the other ate the same meals prepared without the spices. Men who ate the spicy food saw a decrease of one-third in the level of triglycerides (a type of fat linked to heart disease) in their bloodstreams, and 20 percent lower insulin levels overall — even when the meals were high in fat and made with heavy oils.
Answer
2 sample t-test
Researchers at Colgate wished to test the effectiveness of a new toothpaste. They collected a sample of 143 individuals and assigned them to either use the current Colgate toothpaste or the new toothpaste for 2 weeks. Participants waited one week and then switched to using the other toothpaste for two weeks. Based on plaque build-up, they determined that 77 participants did better on the new toothpaste than the old. (Note: This study is fictional)
Answer
Binomial Exact test or paired t-test
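For the binomial exact test, the question is whether 77 of 143 participants doing better on the new toothpaste is consistent with an even split. A minimal sketch using base R's binom.test follows; the null proportion of 0.5 and the two-sided alternative are assumptions, since the case study does not state them.

```r
# 77 of 143 participants did better on the new toothpaste (counts from the case study)
# Assumed null hypothesis: each participant is equally likely to do better on either toothpaste
toothpaste_test <- binom.test(x = 77, n = 143, p = 0.5, alternative = "two.sided")

print(toothpaste_test$estimate)   # observed proportion, 77/143
print(toothpaste_test$conf.int)   # exact 95% confidence interval for the proportion
print(toothpaste_test$p.value)    # exact two-sided p-value
```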
Exposure to cosmic radiation during deep-space missions may damage an astronaut’s heart, a new NASA-funded study suggests. Researchers at Florida State University compared the deaths of 35 astronauts who never traveled into space with those of 42 astronauts who ventured beyond Earth’s protective magnetic field, including seven Apollo veterans who flew to the moon between 1968 and 1972. The study found that lunar astronauts were five times more vulnerable to heart disease—43 percent of them died from cardiovascular ailments compared with only 9 percent of the astronauts that didn’t journey to the moon. A follow-up study involving mice reveals that radiation can trigger long-term changes in the lining of blood vessels associated with atherosclerosis, or “hardening of the arteries.”
Answer
Chi-sq or Fisher’s exact test
An investigator collected the annual earnings of 1642 Iowans and 1563 Nebraskans to compare income level by state. The Iowa group had a mean of $65,000, a median of $59,000, and a standard deviation of $12,000. The Nebraska group had a mean of $64,000, a median of $61,000, and a standard deviation of $12,000. (Note: This study is fictional)
Answer
Log transformed 2-sample t-test or Mann-Whitney/Wilcoxon Rank Sum test
Researchers at University College London surveyed nearly 8,000 participants over the age of 52 to measure whether people read the instructions on medication bottles. Using a fake aspirin bottle complete with instructions as the testing instrument, researchers asked participants to answer four basic questions, including “What is the maximum number of days you may take this medicine?” and “List three situations for which you should consult a doctor.” All the answers could be found on the label. One third of the adults failed to correctly answer all four questions, and one in eight got two or more wrong. Researchers then monitored the volunteers’ health for five years. During that time, 621 of the participants died, and people who missed two or more questions were more than twice as likely to have died as those who got the answers correct.
Answer
Chi-squared or Fisher’s Exact
In a study published in Psychological Science, researchers had groups of participants ages 18 to 65 perform simple exercises, such as pressing a button when a letter appeared onscreen or tapping in time with their own breathing. The experts checked periodically to ask the volunteers whether their minds were on the task or they were thinking of something else. At the end, participants were tested on their ability to remember a series of letters while doing math problems; individuals who let their mind wander scored higher on the test.
Answer
2 sample t-test
According to the US Census Bureau, the national poverty rate is 11.5%. We wish to see if the poverty rate in Johnson County differs significantly from the national rate. We collect a random sample of 1000 individuals and record whether or not they fall into the “poverty” category based on their income.
Answer
Binomial Exact test