Getting Started in RStudio Notebooks

This is the first draft of a post that was featured on storybench.org. Please check out that version if you run into issues here.

 

R is a powerful statistical programming language for manipulating, graphing, and modeling data. One of the major positive aspects of R is that it’s open-source (free!). But “free” does not mean “easy.” Learning to program in R is time-consuming (and occasionally frustrating), but fortunately, the number of helpful tools and packages is constantly increasing.

Enter RStudio

RStudio is an integrated development environment (IDE) that streamlines the R programming workflow into an easy-to-read layout. RStudio also makes it easy to install and use add-on tools (referred to as packages) for data manipulation, cleaning, restructuring, visualization, report writing, and publishing to the web.

Just like R, it’s free. RStudio recently released R Markdown Notebooks, a nice integration of code, plain text, and results that can be exported into PDF, .docx, or HTML format.


Getting started

Start out by installing R and RStudio (you’ll need the preview version found here)

*If you need help installing R or RStudio, feel free to use this Google doc installation guide.

The IDE has four panes (seen below):
RStudio_setup

As you can see from the image above, the upper left pane (where I’m writing this tutorial) is the editor. The pane to the right (where it says “Environment is empty”) will show the objects in the working environment, such as datasets. The lower left pane is called the console, which runs the R code. And the pane in the bottom right will display my results.

Opening a New R Notebook

To get started, click on “File” > “New File” > “R Notebook”. R Notebooks automatically start off with a title and some sample code. To see how the analysis is woven into the HTML, click on the small “play” button:

play button

Save the file (“File” > “Save”) and then click on “Preview” at the top of the pane.

preview button.png

I don’t want to spoil the suspense, so I won’t put a screenshot of what you’ll see. Just know that R Notebooks do a great job of combining markdown text, R code, and results in a clean, crisp, easy-to-share finished product.

R syntax – numbers & text

You can use RStudio as a simple calculator. Type 2 + 2 directly into the console and press enter. You should see this:

2 + 2
[1] 4

 

You’re probably hoping to use RStudio for something slightly more advanced than simple arithmetic. Fortunately, R can calculate and store multiple values in variables to reference later. This is done with the <- assignment operator:

x <- 2 + 2

The <- is similar to the = sign. In fact, the = sign does the same thing for simple assignments like this one, but the typical convention in R is the <-. To see the contents of x, enter it into the console and press enter.

x
[1] 4

You can also perform mathematical operations with variables. Store 4 + 4 in a variable called y and add it to the variable x:

y <- 4 + 4
y + x
[1] 12

R distinguishes between numbers and text, or “string” characters. Text can also be stored in variables using the <- symbol and quotation marks.

a <- "human"
b <- "error"

Text strings are stored differently than numerical data in R. The commands used to manipulate strings are also slightly different.

If you want to combine two strings, use the paste function:

paste(a,b)
[1] "human error"
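By default, paste puts a single space between the pieces. As a minimal sketch (the variable names spaced, dashed, and joined are made up for this illustration), the sep argument and the paste0 shorthand change what goes in between:

```r
# The same two strings used in the text above
a <- "human"
b <- "error"

# Default behaviour: pieces are separated by a single space
spaced <- paste(a, b)

# sep controls what is placed between the pieces
dashed <- paste(a, b, sep = "-")

# paste0() is shorthand for paste(..., sep = "")
joined <- paste0(a, b)

print(spaced)  # "human error"
print(dashed)  # "human-error"
print(joined)  # "humanerror"
```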

Objects & Data Structures in R

R is an object-oriented programming language, which means it recognizes objects according to their structure and type. The most common objects in R are atomic vectors and lists.

Atomic vectors 1.1 – numeric & integer vectors

Numeric vectors (also called double) hold “real” numbers with decimal places, while integer vectors hold whole numbers. To create numeric vectors, use the function c(), which stands for concatenate (a fancy term for combine).

Below is an example of a numeric vector of five decimal values less than 10, each starting with an odd digit:

odd_vect <- c(1.3, 3.3, 5.5, 7.7, 9.9)

This statement is saying, “combine these five numbers into a vector and call it odd_vect.”

If I want to create an integer (or whole number) vector, I need to follow each number with an L.

The assignment operator also works in the other direction–something I didn’t learn until recently. Use it to create another numerical vector named even_vect of even integers less than or equal to 10.

c(2L, 4L, 6L, 8L, 10L) -> even_vect
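To see what the L suffix actually changes, you can ask R for the storage type with typeof. This is only a small sketch; dbl_value and int_value are names invented for the example:

```r
dbl_value <- 2    # no suffix: stored as a double
int_value <- 2L   # L suffix: stored as an integer

typeof(dbl_value)  # "double"
typeof(int_value)  # "integer"

# The integer vector from the text above
even_vect <- c(2L, 4L, 6L, 8L, 10L)

is.integer(even_vect)  # TRUE
is.double(even_vect)   # FALSE
is.numeric(even_vect)  # TRUE, because is.numeric covers both integer and double
```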

The c() function works for combining separate numeric vectors, too. Combine these two vectors into a new vector called ten_vect and print the contents:

ten_vect <- c(odd_vect, even_vect)

ten_vect

[1]  1.3  3.3  5.5  7.7  9.9  2.0  4.0  6.0  8.0 10.0

The final numeric vector (ten_vect) has combined both the odd and even values into a single vector.

Atomic vectors 1.2 – logical vectors

Logical vectors contain two possible values, TRUE or FALSE. We can use functions that return logical values to interrogate vectors and discover their type.

For example, we can use is.double to figure out if the ten_vect vector we created ended up being numeric (double) or integer. (is.numeric returns TRUE for both types, so it cannot tell them apart.)

is.double(ten_vect)

[1] TRUE

Why did the combination of a numeric (double) vector and an integer vector end up being double? This is referred to as coercion. An atomic vector can only hold one type, so when a less flexible data type (integer) is combined with a more flexible data type (double), the less flexible elements are coerced into the more flexible type.
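A quick way to watch coercion happen is to mix types inside c() and inspect the result with typeof. The sketch below uses made-up variable names (mixed_num, mixed_lgl, mixed_chr); the ordering it demonstrates, from least to most flexible, is logical, integer, double, character:

```r
mixed_num <- c(1L, 2.5)    # integer + double    -> double
mixed_lgl <- c(TRUE, 2L)   # logical + integer   -> integer (TRUE becomes 1L)
mixed_chr <- c(1.5, "a")   # double + character  -> character (1.5 becomes "1.5")

typeof(mixed_num)  # "double"
typeof(mixed_lgl)  # "integer"
typeof(mixed_chr)  # "character"

mixed_lgl          # 1 2
mixed_chr          # "1.5" "a"
```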

Atomic vectors 1.3 – character vectors

In R, character vectors contain text strings. We can use character vectors to construct a sentence using a combination of the c() function and the <- operator.

We will start with a preposition

prep_vect <- c("In")

then include a noun

noun_vect <- c("the Brothers Karamazov,")

throw in a subject,

sub_vect <- c("Dmitri")

sprinkle in a verb,

verb_vect <- c("kills")

and finish with an object

obj_vect <- c("his father")

Sentence construction can be a great way to learn how vector objects are structured in R. Atomic vectors are always flat, so even if you nest calls to c(), the result is a single flat vector:

sent_vect <- c("In",c("the Brothers Karamazov,",c("Dmitri",c("kills",c("his father")))))

sent_vect

[1] "In"                      "the Brothers Karamazov," "Dmitri"                 
[4] "kills"                   "his father"

Or enter them directly:

c("In", "the Brothers Karamazov,", "Dmitri", "kills", "his father")

[1] "In"                      "the Brothers Karamazov," "Dmitri"                 
[4] "kills"                   "his father"

Both return the same result.
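If you would rather have R confirm this than compare the printed output by eye, identical does the check for you. A minimal sketch, where nested and direct are names chosen just for this example:

```r
# Nested calls to c() ...
nested <- c("In", c("the Brothers Karamazov,", c("Dmitri", c("kills", c("his father")))))

# ... and the same five strings entered directly
direct <- c("In", "the Brothers Karamazov,", "Dmitri", "kills", "his father")

identical(nested, direct)  # TRUE: the nesting leaves no trace
length(nested)             # 5
```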

Finally, we can combine each part of the sentence into a single string using paste:

sent_vect <- paste(prep_vect, noun_vect, sub_vect, verb_vect, obj_vect)

sent_vect

[1] "In the Brothers Karamazov, Dmitri kills his father"
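If the pieces already live in one character vector, passing that vector to paste on its own will not join them. As a hedged sketch (words, unchanged, and sentence are illustrative names), the collapse argument is what squeezes a vector down to a single string:

```r
words <- c("In", "the Brothers Karamazov,", "Dmitri", "kills", "his father")

# Without collapse, paste returns one element per input element
unchanged <- paste(words)
length(unchanged)  # 5

# With collapse, the elements are joined into a single string
sentence <- paste(words, collapse = " ")
length(sentence)   # 1
sentence           # "In the Brothers Karamazov, Dmitri kills his father"
```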

Lists

Unlike atomic vectors, which only contain elements of a single type, lists can contain elements of different types.

We will create a list that includes an integer vector (even_vect), a logical vector (TRUE, FALSE), a full sentence (sent_vect), and a numeric vector (odd_vect), and call it my_list:

my_list <- list(even_vect, c(TRUE, FALSE), c(sent_vect), c(odd_vect))

We will look at the structure of our list using str

str(my_list)

List of 4
 $ : int [1:5] 2 4 6 8 10
 $ : logi [1:2] TRUE FALSE
 $ : chr "In the Brothers Karamazov, Dmitri kills his father"
 $ : num [1:5] 1.3 3.3 5.5 7.7 9.9
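Once a list exists, you will usually want to pull elements back out of it. The sketch below is an illustration rather than part of the original walkthrough: it builds a smaller list with hypothetical element names (evens, flags, sentence) and shows the difference between single brackets, double brackets, and $:

```r
# Hypothetical names are added here for illustration;
# the list in the text above is unnamed.
my_named_list <- list(
  evens    = c(2L, 4L, 6L, 8L, 10L),
  flags    = c(TRUE, FALSE),
  sentence = "In the Brothers Karamazov, Dmitri kills his father"
)

first_as_list   <- my_named_list[1]     # single brackets: a list of length 1
first_as_vector <- my_named_list[[1]]   # double brackets: the integer vector itself
flags_by_name   <- my_named_list$flags  # $ extracts an element by name

class(first_as_list)    # "list"
class(first_as_vector)  # "integer"
sum(first_as_vector)    # 30
```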

Lists are recursive–they can contain other lists.

lists_on <- list(list(list(list())))

str(lists_on)

List of 1
 $ :List of 1
  ..$ :List of 1
  .. ..$ : list()

This feature separates lists from the atomic vectors described above.
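R can also tell you which of the two kinds of object you are holding. A small sketch, with num_vec and a_list as invented names:

```r
num_vec <- c(1.3, 3.3, 5.5)
a_list  <- list(1L, "a", list(TRUE))  # mixed types, including another list

is.atomic(num_vec)  # TRUE
is.list(num_vec)    # FALSE

is.atomic(a_list)   # FALSE
is.list(a_list)     # TRUE

# Both count as vectors in the broad sense
is.vector(num_vec)  # TRUE
is.vector(a_list)   # TRUE
```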

So there you have it! This how-to should give you some basics in R programming. You can save it as an HTML, PDF, or .docx file for future reference.

2 comments

  1. cyamin · August 17, 2016

    Thanks for easing me into R. Had long been curious about R notebooks, but as a SAS, STATA person I wasn’t sure where to start. So far it’s pretty intuitive.

    A couple questions:
    To confirm –
    A “vector” is a list of items of the same element?
    A “list” is a list of items that may be different elements?
    Is a “vector” the same as an “atomic vector?”

    Looking forward to more,

    Cyrus

    • newsandnumbers · August 17, 2016

      No worries! I hope the tutorial wasn’t too boring.

      A “vector” is a list of items of the same element? -> An atomic vector contains elements of the same type.
      A “list” is a list of items that may be different elements? -> Lists can contain elements of differing types.
      Is a “vector” the same as an “atomic vector”? -> Strictly speaking, both atomic vectors and lists fall under the umbrella of vectors.
