The R language has five basic data structures. In order from simplest to most complex: vectors, lists, matrices, arrays, and data frames. Even though R has been around for decades, I see many questions on the Internet about the differences between these five structures. Much of the confusion is due to R's bizarre naming, which differs from that of most mainstream languages.
Briefly, and with some details left out:
A vector is what most other programming languages call an array: a collection of cells with a fixed size where all cells hold the same type (integers or characters or reals or whatever).
A list can hold items of different types and the list size can be increased on the fly. List contents can be accessed either by index (like mylist[[1]]) or by name (like mylist$age).
A matrix is a two-dimensional vector (fixed size, all cell types the same).
An array is a vector with one or more dimensions. So, an array with one dimension is (almost) the same as a vector. An array with two dimensions is (almost) the same as a matrix. An array with three or more dimensions is an n-dimensional array.
A data frame is what most languages call a table. All values in a column have the same type, but different columns can have different types, and the columns can have header names.
Example vector code:
v = c(1:3)  # an integer vector [1 2 3]
cat(v, "\n\n")
v = vector(mode="integer", 4)  # [0 0 0 0]
cat(v, "\n\n")
v = c("a", "b", "x")  # a character vector
cat(v, "\n\n")
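The "all cells hold the same type" rule is easy to see in action. The following is a minimal sketch (the variable names mixed, nums, and flags are made up for illustration) showing that c() does not reject mixed inputs but silently converts them to one common type:

```r
# Mixing types in c() does not raise an error;
# R coerces every element to a single common type.
mixed = c(1L, 2.5, "a")    # integer, double, character
print(typeof(mixed))       # "character"

nums = c(1L, 2.5)          # integer and double
print(typeof(nums))        # "double"

flags = c(1L, TRUE)        # integer and logical
print(typeof(flags))       # "integer"
```

If you need to keep the original types side by side, that is what a list is for.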
Example list code:
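ls = list("a", 2.2)  # note: the name ls shadows the built-in ls() function
ls[[3]] = as.integer(3)  # grow the list on the fly
print(ls)
cat(ls[[2]], "\n\n")  # access by index
ls = list(name="Smith", age=22)
cat(ls$name, ":", ls$age)  # access by name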
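One detail worth a quick sketch is the difference between single and double brackets on a list. The example below assumes a hypothetical list named person with made-up fields; single brackets give back a smaller list, while double brackets (or $) give back the stored item itself:

```r
person = list(name="Smith", age=22)

sub = person["age"]        # single brackets: a list of length 1
val = person[["age"]]      # double brackets: the stored value
print(is.list(sub))        # TRUE
print(is.list(val))        # FALSE
print(val + 1)             # 23

person$city = "Reno"       # assigning a new name grows the list
print(length(person))      # 3
print(names(person))       # "name" "age" "city"
```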
Example matrix code:
m = matrix(0.0, nrow=2, ncol=3)  # 2x3
print(m)
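The description of a matrix as a two-dimensional vector can be taken quite literally. As a rough sketch (the variable name v is arbitrary), attaching a dim attribute to an ordinary vector turns it into a matrix, with the values filled column by column:

```r
v = 1:6                  # a plain integer vector
print(is.matrix(v))      # FALSE

dim(v) = c(2, 3)         # attach dimensions: 2 rows, 3 columns
print(is.matrix(v))      # TRUE
print(v[1, 2])           # 3  (row 1, column 2)
print(v[2, 3])           # 6  (row 2, column 3)
```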
Example array code:
arr = array(0.0, 3)  # [0.0 0.0 0.0]
print(arr)
arr = array(0.0, c(2,3))  # 2x3 matrix
print(arr)
arr = array(0.0, c(2,5,4))  # 2x5x4 n-array
# print(arr)  # 40 values displayed
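The "(almost) the same" qualifiers in the array description can be checked with R's built-in type predicates. This is a small sketch with made-up variable names (v, a1, a2, a3), not a complete account of how R treats these objects:

```r
v  = c(0, 0, 0)               # plain vector
a1 = array(0.0, 3)            # one-dimensional array

print(is.null(dim(v)))        # TRUE: a vector has no dim attribute
print(dim(a1))                # 3
print(is.vector(a1))          # FALSE: the dim attribute disqualifies it
print(all(v == a1))           # TRUE: the stored values are equal

a2 = array(0.0, c(2, 3))      # two-dimensional array
print(is.matrix(a2))          # TRUE

a3 = array(0.0, c(2, 5, 4))   # three-dimensional array
print(is.matrix(a3))          # FALSE
print(length(a3))             # 40
```

So a one-dimensional array holds the same values as a vector but carries an extra dim attribute, which is where the "almost" comes from.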
Example data frame code:
people = c("Alex", "Barb", "Carl")  # col 1
ages = c(19, 29, 39)  # col 2
df = data.frame(people, ages)  # create
names(df) = c("NAME", "AGE")  # headers
print(df)
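To see that each data frame column keeps its own type, here is a short sketch that rebuilds a similar table and inspects it. The name tbl is made up for illustration, and stringsAsFactors=FALSE is passed explicitly as an assumption so that the text column stays character regardless of R version:

```r
tbl = data.frame(NAME=c("Alex", "Barb", "Carl"),
                 AGE=c(19, 29, 39),
                 stringsAsFactors=FALSE)

print(class(tbl$NAME))     # "character"
print(class(tbl$AGE))      # "numeric"
print(nrow(tbl))           # 3
print(ncol(tbl))           # 2

print(tbl[2, "NAME"])      # "Barb"  (row 2, column NAME)
print(mean(tbl$AGE))       # 29      (a column is an ordinary vector)
print(is.list(tbl))        # TRUE: a data frame is a list of columns
```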