## B.8 Lists

Numeric, logical and character vectors are atomic objects – each component is of the same type. Let’s take a look at what happens when we create an atomic vector out of objects of different types:

c("nine", FALSE, 7, TRUE)
## [1] "nine"  "FALSE" "7"     "TRUE"
c(FALSE, 7, TRUE, 7)
## [1] 0 7 1 7

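In each case, R coerces all the elements to the most “general” type that is able to represent our data: logical values become numbers (FALSE is 0, TRUE is 1), and both logical values and numbers become character strings.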
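The coercion rule can be checked with typeof(). The snippet below is a small sketch; the variable names v1, v2 and v3 are ours and serve illustration only.

```r
# Mixing types in c() promotes every element to the most general type present:
# logical < integer < double < character.
v1 <- c(FALSE, 7, TRUE, 7)        # logical and double  -> double
v2 <- c("nine", FALSE, 7, TRUE)   # a string is present -> character
v3 <- c(TRUE, 2L)                 # logical and integer -> integer

typeof(v1)  # "double"
typeof(v2)  # "character"
typeof(v3)  # "integer"
```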

On the other hand, R lists are generalised vectors. They can consist of arbitrary R objects, possibly of mixed types, including other lists.
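As a quick illustration of this generality, here is a sketch of a list that mixes several types and nests another list; the name nested is hypothetical.

```r
# A list may hold objects of different types and lengths,
# including functions and other lists.
nested <- list(1:3, "text", TRUE, list(mean, c(2.5, 3.5)))

is.list(nested)   # TRUE
length(nested)    # 4: the inner list counts as a single element
```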

### B.8.1 Creating Lists

Most commonly, we create a generalised vector by calling the list() function.

(l <- list(1:5, letters, runif(3)))
## [[1]]
## [1] 1 2 3 4 5
##
## [[2]]
##  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o"
## [16] "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z"
##
## [[3]]
## [1] 0.9568333 0.4533342 0.6775706
mode(l)
## [1] "list"
class(l)
## [1] "list"
length(l)
## [1] 3

There’s a more compact way to print a list on the console:

str(l)
## List of 3
##  $ : int [1:5] 1 2 3 4 5
##  $ : chr [1:26] "a" "b" "c" "d" ...
##  $ : num [1:3] 0.957 0.453 0.678

We can also convert an atomic vector to a list by calling:

as.list(1:3)
## [[1]]
## [1] 1
##
## [[2]]
## [1] 2
##
## [[3]]
## [1] 3

### B.8.2 Named Lists

Lists, like other vectors, may be assigned a names attribute.

names(l) <- c("a", "b", "c")
l
## $a
## [1] 1 2 3 4 5
##
## $b
##  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o"
## [16] "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z"
##
## $c
## [1] 0.9568333 0.4533342 0.6775706
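
Names can also be supplied at creation time. The following sketch (with the hypothetical objects p and q) suggests that both routes lead to the same named list:

```r
# Names given directly in the call to list() ...
p <- list(id = 1:3, label = c("u", "v", "w"))
names(p)           # "id" "label"

# ... or assigned afterwards via the names attribute.
q <- list(1:3, c("u", "v", "w"))
names(q) <- c("id", "label")

identical(p, q)    # TRUE
```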

### B.8.3 Subsetting and Extracting From Lists

Applying the square bracket operator, [, to a list creates a sub-list, which is of type list as well.

l[-1]
## $b
##  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o"
## [16] "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z"
##
## $c
## [1] 0.9568333 0.4533342 0.6775706
l[c("a", "c")]
## $a
## [1] 1 2 3 4 5
##
## $c
## [1] 0.9568333 0.4533342 0.6775706
l[1]
## $a
## [1] 1 2 3 4 5

Note that in the third case we deal with a list of length one, not a numeric vector.

To extract (dig into) a particular (single) element, we use double square brackets:

l[[1]]
## [1] 1 2 3 4 5
l[["c"]]
## [1] 0.9568333 0.4533342 0.6775706

The latter can equivalently be written as:

l$c
## [1] 0.9568333 0.4533342 0.6775706
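
The difference between the two operators is easy to overlook, so here is a minimal sketch that contrasts them. The list m and its values are stand-ins chosen for reproducibility.

```r
m <- list(a = 1:5, b = letters, c = c(0.1, 0.2, 0.3))  # stand-in data

class(m[1])      # "list": single brackets keep the list wrapper
class(m[[1]])    # "integer": double brackets return the element itself
length(m[1])     # 1
length(m[[1]])   # 5

# Arithmetic works on the extracted vector, not on the sub-list:
sum(m[[1]])      # 15
```

Calling sum(m[1]) instead would raise an error, because m[1] is still a list.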

### B.8.4 Common Operations

Lists, because of their generality (they can store any kind of object), have few dedicated operations. In particular, it makes no sense to add or multiply two lists together, or to aggregate them.

However, if we wish to run some operation on each element, we can call lapply() (list-apply):

(k <- list(x=runif(5), y=runif(6), z=runif(3))) # a named list
## $x
## [1] 0.57263340 0.10292468 0.89982497 0.24608773 0.04205953
##
## $y
## [1] 0.3279207 0.9545036 0.8895393 0.6928034 0.6405068 0.9942698
##
## $z
## [1] 0.6557058 0.7085305 0.5440660
lapply(k, mean)
## $x
## [1] 0.3727061
##
## $y
## [1] 0.7499239
##
## $z
## [1] 0.6361008

The above computes the mean of each of the three numeric vectors stored inside list k.

unlist() tries (it might not always be possible) to unwind a list to a simpler, atomic form:

unlist(lapply(k, mean))
##         x         y         z
## 0.3727061 0.7499239 0.6361008
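
Base R also provides sapply(), which applies the function and then attempts this simplification by itself. The sketch below uses a hypothetical list scores with fixed values (instead of random numbers) so that the results are reproducible.

```r
scores <- list(x = c(1, 2, 3), y = c(10, 20), z = 5)  # fixed stand-in data

means_list <- lapply(scores, mean)   # a list of three numbers
means_vec  <- unlist(means_list)     # a named numeric vector
means_vec2 <- sapply(scores, mean)   # the same result in one step

means_vec                            # x = 2, y = 15, z = 5
identical(means_vec, means_vec2)     # TRUE
```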

Moreover, split(x, f) classifies the elements of a vector x into subgroups defined by a factor f (or an object coercible to one) of the same length, and returns the subgroups as a list.

x <- c(  1,   2,   3,   4,   5,   6,   7,   8,   9,  10)
f <- c("a", "b", "a", "a", "c", "b", "b", "a", "a", "b")
split(x, f)
## $a
## [1] 1 3 4 8 9
##
## $b
## [1]  2  6  7 10
##
## $c
## [1] 5

This is very useful when combined with lapply() and unlist(). For instance, here are the mean sepal lengths for each of the three flower species in the famous iris dataset.

unlist(lapply(split(iris$Sepal.Length, iris$Species), mean))
##     setosa versicolor  virginica
##      5.006      5.936      6.588
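
The same split-apply-combine pattern can be traced step by step on the small vectors x and f from above. This is only a sketch; the names groups and group_means are ours.

```r
x <- c(  1,   2,   3,   4,   5,   6,   7,   8,   9,  10)
f <- c("a", "b", "a", "a", "c", "b", "b", "a", "a", "b")

groups <- split(x, f)                        # step 1: split into a named list
group_means <- unlist(lapply(groups, mean))  # steps 2 and 3: apply, then combine

lengths(groups)   # group sizes:  a = 5, b = 4,    c = 1
group_means       # group means:  a = 5, b = 6.25, c = 5
```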

By the way, if we take a look at the documentation (?lapply), we will note that this function is defined as lapply(X, FUN, ...). Here, ... denotes the optional arguments that will be passed on to FUN.

In other words, lapply(X, FUN, ...) returns a list Y of length length(X) such that Y[[i]] <- FUN(X[[i]], ...) for each i. For example, mean() has an additional argument na.rm which, when set to TRUE, removes missing values from the input vector before the mean is computed. Compare the following:

t <- list(1:10, c(1, 2, NA, 4, 5))
unlist(lapply(t, mean))
## [1] 5.5  NA
unlist(lapply(t, mean, na.rm=TRUE))
## [1] 5.5 3.0
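
To make the role of ... more concrete, the definition above can be written as an explicit loop. Note that my_lapply is a hypothetical, simplified stand-in and not how R implements lapply() internally; for instance, it does not correctly handle a FUN that returns NULL.

```r
# my_lapply: an illustrative re-implementation of the definition above.
my_lapply <- function(X, FUN, ...) {
    Y <- vector("list", length(X))   # pre-allocate the result list
    for (i in seq_along(X)) {
        Y[[i]] <- FUN(X[[i]], ...)   # extra arguments are forwarded to FUN
    }
    names(Y) <- names(X)             # keep element names, if any
    Y
}

data <- list(1:10, c(1, 2, NA, 4, 5))
unlist(my_lapply(data, mean, na.rm=TRUE))   # 5.5 3.0
identical(my_lapply(data, mean, na.rm=TRUE),
          lapply(data, mean, na.rm=TRUE))   # TRUE
```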