Lab 2

Published

January 27, 2026

Download the R code

Objectives:

  1. Address some general information about the labs and course data sets

  2. Meet our neighbors

  3. Practice controlling for confounding factors

General Information

  • Navigating Course Datasets: Navigate to the course website and follow along with your instructor.

  • Using Software: The labs teach the basics of analysis in R, but the objective is software literacy rather than R itself. Feel free to use other software if you would like. If you are concerned that your chosen software will complicate the final assessment, please reach out to your instructor.

Meet Your Neighbors

Turn to the person on your left or right and take 30 seconds to get to know them.

Now turn to the person on your other side and take 30 seconds to get to know them.

Mortality on the Titanic

Historical Note

The RMS Titanic was a British luxury steamship that embarked on its maiden voyage from Southampton to New York City in April 1912. The vessel was widely considered a marvel of modern engineering and was famously labeled as practically unsinkable due to its advanced system of watertight compartments. However, the ship struck an iceberg in the North Atlantic and sank in less than three hours, leading to a massive loss of life and a total overhaul of international maritime safety regulations.

Suppose we want to investigate survival rates for each class aboard the Titanic. Since we believe that sex is a potential confounding factor, we will want to control for it in our final summary.

For the sake of practice, we will first work through the manual calculations and then run through the process using R to check our work.

Manual Calculations Controlling for Sex

Reference Tables

Survival totals by class:

        Died Survived  Sum
  1st    122      203  325
  2nd    167      118  285
  3rd    528      178  706
  Crew   673      212  885

Survival totals by sex:

          Died Survived  Sum
  Female   126      344  470
  Male    1364      367 1731

Survivors and totals by class, for each sex:

Sex = Female

  Class  Survived Total
  1st         141   145
  2nd          93   106
  3rd          90   196
  Crew         20    23

Sex = Male

  Class  Survived Total
  1st          62   180
  2nd          25   179
  3rd          88   510
  Crew        192   862

A

Using the tables provided above, calculate the overall percentages of survival for each class.

B

For each class, calculate the percentage of females who survived and the percentage of males who survived.

C

Calculate the proportion (fraction) of female and male passengers on the ship.

D

Construct a weighted average of the percentage of passengers in each class who survived, controlling for the effect of sex (there will only be one number for each class).

Compare this answer to the proportions you calculated in part A. What insight do we gain from controlling for sex?
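As a check on the hand arithmetic, parts A through D can be sketched for a single class. The snippet below works through 1st class only, using counts typed in from the reference tables; the variable names are illustrative and are not part of the lab's R code.

```r
# Counts copied from the reference tables (1st class only)
surv_f <- 141; total_f <- 145     # 1st class, female: survivors and total
surv_m <- 62;  total_m <- 180     # 1st class, male: survivors and total
n_female <- 470; n_male <- 1731   # everyone on board, by sex

# Part A: crude (unadjusted) survival proportion for 1st class
crude_1st <- (surv_f + surv_m) / (total_f + total_m)

# Part B: sex-specific survival proportions within 1st class
rate_f <- surv_f / total_f
rate_m <- surv_m / total_m

# Part C: share of each sex on the whole ship (these are the weights)
w_f <- n_female / (n_female + n_male)
w_m <- n_male   / (n_female + n_male)

# Part D: weighted average of the sex-specific rates
adjusted_1st <- rate_f * w_f + rate_m * w_m

round(c(crude = crude_1st, adjusted = adjusted_1st), 3)
```

The same four steps apply to 2nd class, 3rd class, and crew; only the two survivor counts and two totals change, while the weights from part C stay the same for every class.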

Using R

# read in our dataset
# What information does each column contain?
titanic <- read.delim('https://raw.githubusercontent.com/IowaBiostat/data-sets/main/titanic/titanic.txt')

A

Calculate the overall percentages of survival for each class.

# create table of counts
tclass <- table(titanic$Class, titanic$Survived) |>
  addmargins() # appends a 'Sum' row and a 'Sum' column
tclass[, "Survived"] / tclass[, "Sum"] # divide count of 'Survived' by the row total
      1st       2nd       3rd      Crew       Sum 
0.6246154 0.4140351 0.2521246 0.2395480 0.3230350 

B

For each class, calculate the percentage of females who survived and the percentage of males who survived.

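# three-way table of counts: Sex x Class x Survived
classtable <- table(titanic$Sex, titanic$Class, titanic$Survived)
# proportions within each Sex x Class cell; keep the 'Survived' slice
classes <- prop.table(classtable, 1:2)[, , 2]
t(classes) # transpose so that classes are rows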
      
          Female      Male
  1st  0.9724138 0.3444444
  2nd  0.8773585 0.1396648
  3rd  0.4591837 0.1725490
  Crew 0.8695652 0.2227378
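The call `prop.table(classtable, 1:2)[,,2]` does a lot in one line, so it may help to see it on a small array that does not depend on the downloaded file. The sketch below rebuilds the Sex x Class x Survived counts from the reference tables (each "Died" count is the total minus the survivors); the object names `counts`, `cond`, and `surv_rates` are made up for this illustration.

```r
# Sex x Class x Survived counts, typed in from the reference tables.
# array() fills the first dimension fastest: Female, Male within each class.
counts <- array(
  c(4, 118, 13, 154, 106, 422, 3, 670,     # Died
    141, 62, 93, 25, 90, 88, 20, 192),     # Survived
  dim = c(2, 4, 2),
  dimnames = list(Sex      = c("Female", "Male"),
                  Class    = c("1st", "2nd", "3rd", "Crew"),
                  Survived = c("Died", "Survived"))
)

# margin = 1:2 divides every cell by the total of its own (Sex, Class) pair,
# so "Died" and "Survived" sum to 1 within each sex-by-class combination
cond <- prop.table(counts, margin = 1:2)

# Keep the "Survived" slice and transpose so that classes are rows
surv_rates <- t(cond[, , "Survived"])
round(surv_rates, 3)
```

Indexing the slice by name (`"Survived"`) rather than by position (`2`) gives the same result here and does not rely on the order of the factor levels.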

C

Calculate the proportion (fraction) of female and male passengers on the ship.

# proportion of each sex among everyone on board (these become the weights)
weights <- with(titanic, table(Sex)) |> 
  prop.table()
weights
Sex
   Female      Male 
0.2135393 0.7864607 

D

Construct a weighted average of the percentage of passengers in each class who survived, controlling for the effect of sex (there will only be one number for each class).

# first we create vectors that contain only survival proportions for females and males, respectively
fem_props <- t(classes)[1:4, 1]
mal_props <- t(classes)[1:4, 2]

# we can now weight those proportions by the percentage of passengers of each sex
(fem_props * weights[1]) + (mal_props * weights[2])
      1st       2nd       3rd      Crew 
0.4785406 0.2971914 0.2337568 0.3608609 
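To compare these adjusted values with the crude ones from part A, the two can be placed side by side. The following is a minimal sketch that uses the counts from the reference tables instead of the downloaded file; `standardize()` is a hypothetical helper written for this example, not a function from base R or from the lab.

```r
# Survivors and totals by class (rows) and sex (columns), from the reference tables
survived <- matrix(c(141, 93, 90, 20,   62, 25, 88, 192), ncol = 2,
                   dimnames = list(c("1st", "2nd", "3rd", "Crew"),
                                   c("Female", "Male")))
total <- matrix(c(145, 106, 196, 23,   180, 179, 510, 862), ncol = 2,
                dimnames = dimnames(survived))

# Hypothetical helper: weighted average of stratum-specific rates
standardize <- function(rates, weights) {
  stopifnot(ncol(rates) == length(weights), abs(sum(weights) - 1) < 1e-8)
  drop(rates %*% weights)   # one weighted average per row
}

rates       <- survived / total               # sex-specific survival per class
sex_weights <- colSums(total) / sum(total)    # share of each sex on the ship

crude    <- rowSums(survived) / rowSums(total)
adjusted <- standardize(rates, sex_weights)

round(cbind(crude, adjusted, change = adjusted - crude), 3)
```

The `change` column shows how far each class moves once every class is given the same mix of females and males; classes with an unusually high or low share of women move the most.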

Extra Practice

Now let’s say that we want to investigate the difference in survival by sex for the Titanic data set. Use direct standardization to calculate the percentage of passengers for each sex who survived, controlling for the effect of class.
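Direct standardization is the same weighted average used in the main exercise, with the roles of sex and class swapped: the class-specific survival proportions for each sex are weighted by the share of all people on board in each class. Written out, with notation introduced only for this sketch:

```latex
\hat{p}_{s} \;=\; \sum_{c} w_{c}\, p_{s,c},
\qquad
w_{c} = \frac{N_{c}}{N},
\qquad
c \in \{\text{1st},\ \text{2nd},\ \text{3rd},\ \text{Crew}\}
```

Here $p_{s,c}$ is the proportion of sex $s$ in class $c$ who survived (part B), $N_c$ is the number of people in class $c$, $N$ is the total number on board, so $w_c$ is the class share (part C), and $\hat{p}_s$ is the standardized survival proportion for sex $s$ (part D).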

A

Find the proportion of people who survived by sex.

B

Find the percentage of passengers who survived for each sex, broken down by class.

C

Find the proportion of all passengers who belong to each class.

D

Use direct standardization to calculate the percentage of passengers of each sex who survived, controlling for the effect of class (there will only be one number for each sex).

Solutions

Handwritten solutions can be found here

A

Proportion of people who survived by sex:

# Create a table of Sex × Survived counts
sextab <- table(titanic$Sex, titanic$Survived)

# Compute proportion survived for each sex
prop_survived_sex <- sextab[, "Survived"] / rowSums(sextab)

# Print the result
prop_survived_sex
   Female      Male 
0.7319149 0.2120162 

B

Percentage of passengers who survived for each sex, broken down by class:

# Create a 3-way table: Class × Sex × Survived
classtab <- table(titanic$Class, titanic$Sex, titanic$Survived)

# Compute survival proportion for each Class × Sex
prop_survived <- classtab[,, "Survived"] / 
                 (classtab[,, "Survived"] + classtab[,, "Died"])

# Print only the final table
prop_survived
      
          Female      Male
  1st  0.9724138 0.3444444
  2nd  0.8773585 0.1396648
  3rd  0.4591837 0.1725490
  Crew 0.8695652 0.2227378

C

Proportion of all passengers who belong to each class:

# Create a table of counts for each class (all sexes combined)
class_counts <- table(titanic$Class)

# Compute proportion of total passengers in each class
class_proportions <- class_counts / sum(class_counts)

# Print the result
class_proportions

      1st       2nd       3rd      Crew 
0.1476602 0.1294866 0.3207633 0.4020900 

D

Use direct standardization to calculate the percentage of passengers of each sex who survived, controlling for the effect of class (there will only be one number for each sex).

# survival proportions by sex and class (prop_survived was computed in part B)
fem_props <- prop_survived[, "Female"]  # female survival rates per class
mal_props <- prop_survived[, "Male"]    # male survival rates per class

# proportion of passengers in each class (class weights)
class_counts <- table(titanic$Class)
total_passengers <- sum(class_counts)
class_weights <- class_counts / total_passengers

# weighted average survival for each sex, controlling for class
weighted_survival_female <- sum(fem_props * class_weights)
weighted_survival_male   <- sum(mal_props * class_weights)

# combine into a vector
weighted_survival_by_sex <- c(F = weighted_survival_female,
                              M = weighted_survival_male)

# print
weighted_survival_by_sex
        F         M 
0.7541256 0.2138535 
