R Language Vectors vs. Arrays vs. Lists vs. Matrices vs. Data Frames

The R language has five basic data structures. In rough order from simplest to most complex: vectors, lists, matrices, arrays, data frames. Even though R has been around for decades, I see many questions on the Internet about the differences between these five structures. Much of the confusion is due to R’s bizarre naming, which differs from the naming used in most mainstream languages.


Briefly, and with some details left out:

A vector is what most other programming languages call an array: a collection of cells where all cells hold the same type (integers or characters or reals or whatever). The size is nominally fixed; if you assign past the end, R builds a longer copy behind the scenes.
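To make the "same type" rule concrete, here is a minimal sketch. The variable names are mine, chosen only for illustration. When values of different types are combined with c(), R converts them all to one common type rather than raising an error.

```r
# All elements of an atomic vector share one type; mixing types coerces.
nums = c(1.5, 2, 3)
mixed = c(1, "a", TRUE)    # every item is converted to character

cat(class(nums), "\n")     # numeric
cat(class(mixed), "\n")    # character
cat(mixed, "\n")           # 1 a TRUE
cat(length(nums), "\n")    # 3
```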

A list can hold items of different types and the list size can be increased on the fly. List contents can be accessed either by index (like mylist[[1]]) or by name (like mylist$age).
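A small sketch of those list behaviors, using made-up names and values (person and city are hypothetical). It also shows a detail that trips people up: single brackets return a sub-list, while double brackets return the item itself.

```r
# A list can mix types and grow after it is created.
person = list(name = "Smith", age = 22)
person$city = "Austin"          # add a third item by name (hypothetical value)

cat(length(person), "\n")       # 3
cat(class(person[1]), "\n")     # list: single brackets give a sub-list
cat(class(person[[1]]), "\n")   # character: double brackets give the item
cat(person[["age"]], "\n")      # 22
```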

A matrix is a two-dimensional vector (fixed size, all cell types the same).
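The "two-dimensional vector" idea can be sketched as follows, with illustrative variable names. The values of an ordinary vector are laid out column by column, and cells are then addressed as [row, column].

```r
# A matrix holds the values of a vector, filled column by column.
v = 1:6
m = matrix(v, nrow = 2, ncol = 3)

cat(dim(m), "\n")         # 2 3
cat(m[1, 2], "\n")        # 3: row 1, column 2
cat(m[2, 3], "\n")        # 6: row 2, column 3
cat(is.matrix(m), "\n")   # TRUE
```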

An array is a vector with one or more dimensions. So, an array with one dimension is (almost) the same as a vector. An array with two dimensions is (almost) the same as a matrix. An array with three or more dimensions is an n-dimensional array.
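One way to see where the "almost" comes from, sketched with illustrative names: a one-dimensional array carries a dim attribute that a plain vector lacks, so R does not report it as a vector, while a two-dimensional array does pass the matrix check.

```r
a1 = array(0.0, 3)           # array with one dimension
v = c(0.0, 0.0, 0.0)         # plain vector with the same values

cat(is.null(dim(v)), "\n")   # TRUE: a plain vector has no dim attribute
cat(dim(a1), "\n")           # 3
cat(is.vector(a1), "\n")     # FALSE: the dim attribute disqualifies it
cat(is.array(a1), "\n")      # TRUE

a2 = array(0.0, c(2, 3))     # array with two dimensions
cat(is.matrix(a2), "\n")     # TRUE
```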

A data frame is what most languages call a table. All values within a column have the same type, different columns can have different types, and the columns can have header names.
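A minimal sketch of the column-type rule, with hypothetical column names. I pass stringsAsFactors = FALSE so the text column stays character on older R versions too; that argument is my addition for the sake of a predictable result.

```r
# Each column has one type, but the columns can differ from each other.
df = data.frame(name = c("Alex", "Barb"), age = c(19, 29),
                stringsAsFactors = FALSE)

cat(class(df$name), "\n")       # character
cat(class(df$age), "\n")        # numeric
cat(nrow(df), ncol(df), "\n")   # 2 2
cat(names(df), "\n")            # name age
```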

Example vector code:

v = c(1:3)  # an integer vector with [1 2 3]
cat(v, "\n\n")

v = vector(mode="integer", 4)  # [0 0 0 0]
cat(v, "\n\n")

v = c("a", "b", "x")
cat(v, "\n\n")

Example list code:

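lst = list("a", 2.2)    # items of different types
lst[3] = as.integer(3)  # grow the list by assigning past the end

cat(lst[[2]], "\n\n")   # 2.2

lst = list(name="Smith", age=22)  # named items
cat(lst$name, ":", lst$age)       # Smith : 22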

Example matrix code:

m = matrix(0.0, nrow=2, ncol=3) # 2x3

Example array code:

arr = array(0.0, 3)  # [0.0 0.0 0.0]

arr = array(0.0, c(2,3))  # 2x3 matrix

arr = array(0.0, c(2,5,4)) # 2x5x4 n-array
# print(arr)  # 40 values displayed

Example data frame code:

people = c("Alex", "Barb", "Carl") # col 1
ages = c(19, 29, 39)  # col 2
df = data.frame(people, ages)  # create
names(df) = c("NAME", "AGE")  # headers
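
Once the data frame exists, columns and cells can be read by header name. The sketch below rebuilds the same data frame so that it runs on its own; the variable older and the age cutoff of 20 are mine, for illustration only.

```r
people = c("Alex", "Barb", "Carl")
ages = c(19, 29, 39)
df = data.frame(people, ages)
names(df) = c("NAME", "AGE")

cat(df$AGE, "\n")            # 19 29 39: one whole column
cat(df[2, "AGE"], "\n")      # 29: row 2 of the AGE column
older = df[df$AGE > 20, ]    # keep the rows where AGE exceeds 20
cat(nrow(older), "\n")       # 2
cat(mean(df$AGE), "\n")      # 29
```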