C.1 Creating Matrices

C.1.1 matrix()

One way to create a matrix is to call the matrix() function.

##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
## [1] "matrix"
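
The code behind the output above is not shown, so here is a call that is consistent with it. This is a reconstruction: the name A is borrowed from the mathematical notation used in this section, and the exact form of the input vector is assumed. Note that R 4.0 and later report both "matrix" and "array" as the class.

```r
# Reconstruction; the variable name A and the input vector are assumptions.
A <- matrix(c(1, 2, 3, 4, 5, 6), byrow=TRUE, nrow=2)  # fill row by row
A          # print the 2-by-3 matrix
class(A)   # "matrix" in R < 4.0; R >= 4.0 additionally reports "array"
```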

Given a numeric vector of length 6, we’ve asked R to convert it to a numeric matrix with 2 rows (the nrow argument). The number of columns has been deduced automatically from the length of the vector (6/2 = 3); alternatively, we could have passed ncol=3 explicitly.

In mathematical notation, we have defined \(\mathbf{A}\in\mathbb{R}^{2\times 3}\):

\[ \mathbf{A}= \left[ \begin{array}{ccc} a_{1,1} & a_{1,2} & a_{1,3} \\ a_{2,1} & a_{2,2} & a_{2,3} \\ \end{array} \right] = \left[ \begin{array}{ccc} 1 & 2 & 3 \\ 4 & 5 & 6 \\ \end{array} \right] \]

We can fetch the size of the matrix by calling:

## [1] 2 3
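
The output above matches a call to dim(). A minimal sketch, assuming the matrix is stored in a variable named A (the name is our choice):

```r
A <- matrix(c(1, 2, 3, 4, 5, 6), byrow=TRUE, nrow=2)
dim(A)    # number of rows, then number of columns
nrow(A)   # number of rows only
ncol(A)   # number of columns only
```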

We can also “promote” a “flat” vector to a column vector, i.e., a matrix with one column, by calling:

##      [,1]
## [1,]    1
## [2,]    2
## [3,]    3
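
One call that yields a column vector like the one above is as.matrix(). In this sketch, the names x and X are purely illustrative:

```r
x <- c(1, 2, 3)     # a "flat" vector: it has no dim attribute
X <- as.matrix(x)   # a matrix with 3 rows and 1 column
X
dim(X)
```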

C.1.2 Stacking Vectors

Another way to create a matrix is to stack several vectors of equal length, either as rows, with rbind(), or as columns, with cbind():

##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
## [3,]    7    8    9
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9
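
The two matrices above are consistent with the following calls to rbind() and cbind(). The input vectors are inferred from the printed values, and the names R and C are ours:

```r
R <- rbind(1:3, 4:6, 7:9)   # each vector becomes a row
C <- cbind(1:3, 4:6, 7:9)   # each vector becomes a column
R
C
```

Stacking the same vectors row-wise and column-wise gives matrices that are transposes of each other.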

The rbind() and cbind() functions can also add new rows/columns to existing matrices:

##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
## [3,]   -1   -2   -3
##      [,1] [,2] [,3] [,4]
## [1,]    1    2    3   -1
## [2,]    4    5    6   -2
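
A sketch that reproduces the two outputs above, assuming the original 2-by-3 matrix is named A:

```r
A <- matrix(c(1, 2, 3, 4, 5, 6), byrow=TRUE, nrow=2)
A_more_rows <- rbind(A, c(-1, -2, -3))   # append a row of length ncol(A)
A_more_cols <- cbind(A, c(-1, -2))       # append a column of length nrow(A)
A_more_rows
A_more_cols
```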

C.1.3 Beyond Numeric Matrices

Note that logical matrices are possible as well. For instance, since comparison operators such as < and == work elementwise on matrices too, we can obtain:

##       [,1]  [,2] [,3]
## [1,] FALSE FALSE TRUE
## [2,]  TRUE  TRUE TRUE
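
The comparison that produced the output above is not shown. Given the matrix from the first subsection, A >= 3 is one expression that yields exactly this pattern (an assumption on our part; other thresholds, such as A > 2, would too):

```r
A <- matrix(c(1, 2, 3, 4, 5, 6), byrow=TRUE, nrow=2)
L <- A >= 3   # elementwise comparison; the result keeps the shape of A
L
```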

Moreover, although much more rarely used, we can define character matrices:

##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] "A"  "C"  "E"  "G"  "I"  "K" 
## [2,] "B"  "D"  "F"  "H"  "J"  "L"
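
A call consistent with the output above, assuming the built-in LETTERS constant was used as the input (the name M is ours):

```r
M <- matrix(LETTERS[1:12], ncol=6)   # 12 letters, filled column by column
M
```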

C.1.4 Naming Rows and Columns

Just as vectors can be equipped with a names attribute:

## a b c 
## 1 2 3

matrices can be assigned row and column labels through the dimnames attribute, a list of two character vectors (row labels first, then column labels):

##   x y z
## a 1 2 3
## b 4 5 6
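
Both outputs in this subsection can be reproduced along the following lines. The variable names x and A are assumptions; the labels are taken from the printed output:

```r
x <- c(a=1, b=2, c=3)   # a named vector
x
names(x)

A <- matrix(c(1, 2, 3, 4, 5, 6), byrow=TRUE, nrow=2)
dimnames(A) <- list(
    c("a", "b"),        # row labels
    c("x", "y", "z")    # column labels
)
A
A["b", "y"]             # labels can also be used for indexing
```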

C.1.5 Other Methods

The read.table() function (and its special case, read.csv()) can be used to read a matrix from a text file. We will cover it in the next chapter, because technically it returns a data frame object (which we can convert to a matrix with a call to as.matrix()).
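
As a brief preview, here is a minimal sketch of that conversion. To keep it self-contained, the data are passed inline through the text argument instead of being read from a file; the column names and values are made up:

```r
# Hypothetical two-column CSV content, given inline rather than as a file.
df <- read.csv(text="x,y\n1,2\n3,4")
class(df)             # a data frame
M <- as.matrix(df)    # convert to a matrix; column names are preserved
M
```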

outer() applies a given (vectorised) function to each pair of elements from two vectors, forming a two-dimensional “grid”. More precisely, outer(x, y, f, ...) returns a matrix \(\mathbf{Z}\) with length(x) rows and length(y) columns such that \(z_{i,j}=f(x_i, y_j, ...)\), where ... are optional further arguments to f.

##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1    2    3    4    5
## [2,]   10   20   30   40   50
## [3,]  100  200  300  400  500
##      [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8] 
## [1,] "A-1" "A-2" "A-3" "A-4" "A-5" "A-6" "A-7" "A-8"
## [2,] "B-1" "B-2" "B-3" "B-4" "B-5" "B-6" "B-7" "B-8"
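
The two grids above are consistent with the calls below. The inputs are inferred from the printed values, and the names Z and S are ours:

```r
Z <- outer(c(1, 10, 100), 1:5, "*")        # z[i, j] = x[i] * y[j]
Z
S <- outer(c("A", "B"), 1:8, paste, sep="-")   # sep is passed on to paste()
S
```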

simplify2array() can be thought of as an extension of the unlist() function. Given a list of vectors, each of length one, it will return an “unlisted” vector. However, if a list of equisized vectors of greater lengths is given, these will be converted to a matrix, with each vector becoming a column. If the lengths differ, the list is returned unchanged.

## [1]  1 11 21
##      [,1] [,2] [,3]
## [1,]    1   11   21
## [2,]    2   12   22
## [3,]    3   13   23
## [[1]]
## [1] 1
## 
## [[2]]
## [1] 11 12
## 
## [[3]]
## [1] 21 22 23
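
The three outputs above correspond to the three cases just described. A sketch with inputs inferred from the printed values (the names v, M and L are ours):

```r
v <- simplify2array(list(1, 11, 21))              # all of length 1: a vector
v
M <- simplify2array(list(1:3, 11:13, 21:23))      # equal lengths > 1: a matrix
M
L <- simplify2array(list(1, 11:12, 21:23))        # unequal lengths: still a list
L
```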

sapply(...) is a convenient application of the above: it is essentially equivalent to simplify2array(lapply(...)).

##     setosa versicolor  virginica 
##      5.006      5.936      6.588
##         setosa versicolor virginica
## Min.     4.300      4.900     4.900
## 1st Qu.  4.800      5.600     6.225
## Median   5.000      5.900     6.500
## Mean     5.006      5.936     6.588
## 3rd Qu.  5.200      6.300     6.900
## Max.     5.800      7.000     7.900
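
The figures above match the sepal lengths in the built-in iris data set, grouped by species. The following sketch assumes that this is indeed how they were obtained; the names by_species, means and stats are ours:

```r
# Split the sepal lengths into a list of three vectors, one per species.
by_species <- split(iris$Sepal.Length, iris$Species)

means <- sapply(by_species, mean)      # one number per group: a named vector
means
stats <- sapply(by_species, summary)   # six numbers per group: a matrix
stats
```

Each group becomes one column of the result, which is why the species names appear as column labels.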

Of course, custom functions can also be applied:

##      setosa versicolor virginica
## min   4.300      4.900     4.900
## mean  5.006      5.936     6.588
## max   5.800      7.000     7.900
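
A custom function returning a named vector gives row labels like those above. This is a sketch under the same assumption as before, namely that the data are iris sepal lengths grouped by species:

```r
by_species <- split(iris$Sepal.Length, iris$Species)
res <- sapply(by_species, function(x) {
    c(min=min(x), mean=mean(x), max=max(x))   # names become row labels
})
res
```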

Lastly, table(x, y) creates a contingency matrix that counts how many times each unique pair of corresponding elements from two vectors of equal lengths occurs.

## 
##   0   1 
## 549 342
## 
## female   male 
##    314    577
##    
##     female male
##   0     81  468
##   1    233  109
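
The counts above come from a data set that is not reproduced here. To illustrate the same three calls, the sketch below uses two short made-up vectors; their names and values are hypothetical and do not reproduce the figures shown:

```r
# Toy data, invented for illustration only.
survived <- c(0, 1, 1, 0, 1, 0)
sex <- c("male", "female", "female", "male", "male", "female")

table(survived)              # counts of each unique value in one vector
table(sex)
tab <- table(survived, sex)  # counts of each (survived, sex) pair
tab
```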

C.1.6 Internal Representation (*)

Note that by setting byrow=TRUE in a call to the matrix() function above, we read the elements of the input vector row by row (row-major order). The default is column-major order, which might be a little unintuitive for some of us.

It turns out that column-major order is exactly the order in which the matrix is stored internally. Under the hood, it is an ordinary numeric vector:

## [1] "numeric"
## [1] 6
## [1] 1 4 2 5 3 6
## [1] 1 2 3 4 5 6
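
The calls behind the four outputs above are not shown; the following ones are consistent with them. This is a guess at the original code, in particular the use of mode() for the first line and of t() for the last one:

```r
A <- matrix(c(1, 2, 3, 4, 5, 6), byrow=TRUE, nrow=2)
mode(A)            # the storage mode of the underlying vector
length(A)          # total number of elements
as.numeric(A)      # drop the dim attribute: elements in column-major order
as.numeric(t(A))   # transposing first gives the row-wise order
```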

Also note that we can create a different view on the same underlying data vector:

##      [,1] [,2]
## [1,]    1    5
## [2,]    4    3
## [3,]    2    6
##      [,1] [,2]
## [1,]    1    4
## [2,]    2    5
## [3,]    3    6
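
Changing the dim attribute is one way to obtain such views. In the sketch below, the first result reshapes the data of A, and the second reshapes the plain vector 1, 2, ..., 6. Whether the original code did exactly this is an assumption; the names B and x are ours:

```r
A <- matrix(c(1, 2, 3, 4, 5, 6), byrow=TRUE, nrow=2)
B <- A
dim(B) <- c(3, 2)   # same six values, now read as 3 rows and 2 columns
B

x <- c(1, 2, 3, 4, 5, 6)
dim(x) <- c(3, 2)   # a flat vector becomes a matrix once it has a dim
x
```

In both cases the order of the elements in memory does not change; only the way they are arranged into rows and columns does.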