Lab 1

Published

January 20, 2021

Objectives

In today’s lab we will:

  1. Introduce the R programming language
  2. Install and set up R and RStudio
  3. Practice basic functions and commands using RStudio

It will be helpful for you to be familiar with the basics of programming. R can be used to do simple calculations, create plots and figures, and run statistical analyses. In lab, you will practice basic data import and analysis procedures as tools for scientific inquiry in your respective fields.

Important

There will be an assessment of your software skills on April 21, 2026.

Setting up H: Drive or OneDrive

It is important that we all have the same set-up for folders/files. You may log in to your OneDrive account using your UIowa credentials. Alternatively, you can use the H: Drive if you have one.

OneDrive

  1. Open your File Explorer. This can be done by clicking on the Windows icon in the bottom left of your screen, and then clicking on the icon of the page labeled “File Explorer” on the left of that menu (or search for File Explorer).

  2. Navigate to your OneDrive: Click on your C: Drive, which can be found on the lower part of the menu on the left under “This PC”. Open the Users folder and find the folder labeled with your HawkID. Open it.

  3. Click on the ‘OneDrive - University of Iowa’ icon and follow the steps to log in to your account. Once finished, your OneDrive account should appear as a folder in the toolbar on the left-hand side of the File Explorer. Open this folder.

  4. Create a new folder by right-clicking in the blank space on this page, selecting “New”, and then selecting “Folder”. Title this Folder “BIOS4120Labs” (without the quotations).

H: Drive

  1. Open your File Explorer. This can be done by clicking on the Windows icon in the bottom left of your screen, and then clicking on the icon of the page labeled “File Explorer” on the left of that menu (or search for File Explorer).

  2. Click on your H: Drive, which can be found on the lower part of the menu on the left under “This PC”. It should be labeled “(H:) HawkID”.

  3. Create a new folder by right-clicking in the blank space on this page, selecting “New”, and then selecting “Folder”. Title this Folder “BIOS4120Labs” (without the quotations).

Downloading and Installing R

We will be using a program called RStudio, which provides an interface for writing and running R code. On the lab computers, which we will be using in discussion sessions, you can just open RStudio. To get it on your personal computer, however, you need to complete a two-step process.

  1. To install R on your personal computer, go to http://cran.r-project.org
  2. Then, to install RStudio on your personal computer, go to https://posit.co/downloads/

Note

You have to install R first before installing RStudio.

The instructions are pretty clear on the website, but if you need help please feel free to ask a TA to assist you during office hours.

Interface: The Layout of RStudio

Look for RStudio in the start menu, and go ahead and open it up.

The first thing you’ll want to do is go to File -> New File -> R Script.

This will open a window on the top left of your screen in RStudio where you’ll be doing all of your work.

Save this R Script into your newly created folder by going to File -> Save As… -> YourName - University of Iowa -> BIOS4120Labs, and name it “Lab-01”.

You’ll now have four windows open in RStudio:

  1. Script (top left)
  2. Console (bottom left)
  3. Environment, which lists your variables (top right)
  4. Graphs/Help (bottom right)

Note

To run code, type it in the script, then highlight it and press Ctrl+Enter (Cmd+Return on a Mac) to send it to the console to run.

R Basics

R can act as a calculator

4 + 6 - (24/6)
[1] 6
(6 - 4) * 3
[1] 6
5 ^ 2
[1] 25
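
As a small additional sketch of the calculator behavior above, the lines below illustrate how parentheses change the order in which R evaluates an expression. The specific numbers are chosen only for illustration.

```r
# Multiplication and division are evaluated before addition and subtraction.
6 - 4 * 3      # 4 * 3 is evaluated first, giving -6

# Parentheses force the subtraction to happen first.
(6 - 4) * 3    # gives 6

# Exponents are evaluated before multiplication.
2 * 5 ^ 2      # 5 ^ 2 is evaluated first, giving 50
```

When in doubt, adding parentheses makes the intended order explicit and costs nothing.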

Adding Comments

In most programming languages, you can write comments within your code to explain what the code does and to leave notes for yourself. In R, a comment starts with a #, and everything to the right of the # on the same line is ignored when the code is run.

# Example of a comment. This line does NOT get run by the program. 

In the R Script, comments are a different color. Remember, the # starts a comment ONLY on the same line.
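
The short sketch below illustrates the same-line rule using made-up arithmetic: a comment can sit on its own line or after a piece of code, and a longer note needs a # on every line.

```r
# A comment on its own line: R skips this entire line.
3 + 4  # A comment after code: only the text to the right of the # is skipped.

# A longer note needs a # at the start of every line,
# because each # only covers the line it is on.
10 / 4
```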

Functions

exp(2) # This is the number e raised to the power within the parentheses
[1] 7.389056
sqrt(4) # Square root
[1] 2
abs(-5) # Absolute value
[1] 5

Help Documentation

To access the help documentation on a function you’re not sure about, type a question mark before the function. For example, try typing ?seq or using the search bar on the help tab.
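
As a minimal sketch of this workflow, the lines below open the help page for seq() in two equivalent ways and then try the function with arguments described on that page. The particular argument values are only an example.

```r
?seq          # opens the help page for seq()
help("seq")   # does the same thing, written as a function call

# After reading the help page, try the function out:
# count from 1 to 10 in steps of 2.
seq(from = 1, to = 10, by = 2)
```

The "Arguments" and "Examples" sections of a help page are usually the quickest places to look.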

Variables

Assign

x <- 5 # This assigns the value 5 to the variable x.
# Now we can reference x, and R substitutes in 5.
x
[1] 5
# Variables are useful for storing the result of a calculation:
y <- log(5) + 3/2 # "log" in R is natural log (ln)
y
[1] 3.109438
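
Because log() defaults to the natural log, it can help to see how other bases are requested. The sketch below uses the base argument of log() and the built-in log10() function; the input values are chosen only for illustration.

```r
log(exp(1))          # natural log of e, which is 1
log(100, base = 10)  # log base 10, requested through the base argument
log10(100)           # shortcut function for log base 10
```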

Override

You can “override” variables with new values after they’ve been assigned. You can also use the variable itself or other variables to assign a new value:

x <- 2 + 5
x
[1] 7
x <- x + 5
x
[1] 12
x <- y + 5
x
[1] 8.109438

Note

R is case-sensitive, which means capital letters are different from lower case letters. Therefore, X would be different from x.

X <- 3 # uppercase X
x <- 6 # lowercase x
X
[1] 3
x
[1] 6

Also note that the assignment operator (<-) is preferred to the equals sign. This is standard R programming practice.

Tip

In RStudio on Windows, you can press Alt and the minus key at the same time as a shortcut to insert the assignment operator (<-).
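
To make the note about the assignment operator concrete, here is a small sketch using two made-up variables, a and b. It also shows the double equals sign (==), which compares two values and does not assign anything.

```r
a <- 10   # preferred style: the assignment operator
b = 10    # the equals sign also assigns here, but is not the usual convention

a == b    # == asks "are these equal?" and returns TRUE or FALSE
```

Using <- for assignment everywhere keeps it visually distinct from the == comparison.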

Reading in Data

All of the datasets for this class will be on the class website, and can be read in using the URL:

lab1 <- read.delim('https://raw.githubusercontent.com/IowaBiostat/data-sets/main/tips/tips.txt')
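
read.delim() also accepts a path to a file on your computer, not only a URL. The sketch below does not use the class data: it writes a tiny, made-up tab-delimited file to a temporary location and reads it back, to illustrate that read.delim() expects tab-separated columns with the variable names in the first row. The names tmp and small are hypothetical.

```r
# Write a tiny tab-delimited file to a temporary location (made-up values).
tmp <- tempfile(fileext = ".txt")
writeLines(c("TotBill\tTip\tSex",
             "18.29\t3.76\tM",
             "16.99\t1.01\tF"), tmp)

# By default, read.delim() splits columns on tabs and treats
# the first row as the column names.
small <- read.delim(tmp)
small
```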

Some basic things you can do with datasets:
(This will be expanded upon throughout the semester.)

head(lab1) # Outputs only the first few lines of a dataset
  TotBill  Tip Sex Smoker Day  Time Size
1   18.29 3.76   M    Yes Sat Night    4
2   16.99 1.01   F     No Sun Night    2
3   10.34 1.66   M     No Sun Night    3
4   21.01 3.50   M     No Sun Night    3
5   23.68 3.31   M     No Sun Night    2
6   24.59 3.61   F     No Sun Night    4
summary(lab1) # Gives summary statistics for the dataset
    TotBill           Tip             Sex               Smoker         
 Min.   : 3.07   Min.   : 1.000   Length:244         Length:244        
 1st Qu.:13.35   1st Qu.: 2.000   Class :character   Class :character  
 Median :17.80   Median : 2.900   Mode  :character   Mode  :character  
 Mean   :19.79   Mean   : 2.998                                        
 3rd Qu.:24.13   3rd Qu.: 3.562                                        
 Max.   :50.81   Max.   :10.000                                        
     Day                Time                Size     
 Length:244         Length:244         Min.   :1.00  
 Class :character   Class :character   1st Qu.:2.00  
 Mode  :character   Mode  :character   Median :2.00  
                                       Mean   :2.57  
                                       3rd Qu.:3.00  
                                       Max.   :6.00  
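
A few other base R functions are handy for a first look at a dataset. The sketch below is self-contained: it rebuilds four columns of the six rows printed by head(lab1) above as a small stand-in data frame, under the hypothetical name lab1_small, and then inspects it. The same functions should work on the full lab1 dataset.

```r
# Stand-in data frame built from the six rows shown by head(lab1).
lab1_small <- data.frame(
  TotBill = c(18.29, 16.99, 10.34, 21.01, 23.68, 24.59),
  Tip     = c(3.76, 1.01, 1.66, 3.50, 3.31, 3.61),
  Sex     = c("M", "F", "M", "M", "M", "F"),
  Size    = c(4, 2, 3, 3, 2, 4)
)

dim(lab1_small)    # number of rows, then number of columns
names(lab1_small)  # the column (variable) names
str(lab1_small)    # the type and first few values of each column
```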

Let’s say that we only want to look at one variable (a column) within the dataset. (For this example, we will look only at the tip amounts.) We can reference a single variable by using a dollar sign, in the form dataset$variable.

#    dataset$variable
tips <- lab1$Tip
head(tips)
[1] 3.76 1.01 1.66 3.50 3.31 3.61

Now we are able to analyze a specific variable from the dataset, which will be very useful in future labs. We can use a variety of functions on this new variable. The function max() is shown as an example below.

max(tips) # Largest Tip
[1] 10
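
Other summary functions work the same way as max(). As a small self-contained sketch, the lines below apply a few of them to the six tip values printed by head(tips) above, stored under the hypothetical name tips_small. Results on the full tips vector will differ.

```r
# The first six tips, copied from the head(tips) output above.
tips_small <- c(3.76, 1.01, 1.66, 3.50, 3.31, 3.61)

min(tips_small)     # smallest value
range(tips_small)   # smallest and largest values together
length(tips_small)  # how many values the vector holds
```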

Practice questions (Not graded)

Problem 1

Use the $ operator to assign the TotBill variable from the “lab1” dataset (read in previously) to a new variable called total. Print out the first few values using the head() function.

Problem 2

Look up the help documentation for the “sort” function in R. Use it to organize the “total” vector from highest to lowest. Name this new vector “total_sorted”.

Use “decreasing = TRUE” in the sort function.

Problem 3

Find the average Total Bill.

Use the function mean()

Problem 4

Create a variable called “log_tip” to represent the natural log of the Tip variable from the “lab1” dataset. Then find the mean and standard deviation (sd) of the log_tip.

You will need to use the $ operator.

Solutions

Problem 1

Use the $ operator to assign the TotBill variable from the lab1 dataset to a new variable called total. Print out the first few values using the head() function.

total <- lab1$TotBill
head(total) # print out the first few values
[1] 18.29 16.99 10.34 21.01 23.68 24.59

Problem 2

Look up the help documentation for the “sort” function in R. Use it to organize the “total” vector from highest to lowest. Name this new vector “total_sorted”.

?sort # look up the help documentation for 'sort'
total_sorted <- sort(total, decreasing = TRUE)
# note the new vector appears in your environment under "values"

Problem 3

Find the average Total Bill.

mean(total)
[1] 19.78594

Problem 4

Create a variable called “log_tip” to represent the natural log of the Tip variable from the “lab1” dataset. Then find the mean and standard deviation (sd) of the log_tip.

log_tip <- log(lab1$Tip)
mean(log_tip)
[1] 1.002538
sd(log_tip)
[1] 0.4361609