## C.2 Algebraic Operations

### C.2.1 Matrix Transpose

The transpose of a matrix $$\mathbf{A}$$ is denoted with $$\mathbf{A}^T$$. In R, it is computed with t():

t(A)
##      [,1] [,2] [,3]
## [1,]    1    4    2
## [2,]    5    3    6

Hence, $$\mathbf{B}=\mathbf{A}^T$$ is a matrix such that $$b_{i,j}=a_{j,i}$$.

In other words, in the transposed matrix, rows become columns and columns become rows. For example:

$\mathbf{A}= \left[ \begin{array}{ccc} a_{1,1} & a_{1,2} & a_{1,3} \\ a_{2,1} & a_{2,2} & a_{2,3} \\ \end{array} \right] \qquad \mathbf{A}^T= \left[ \begin{array}{cc} a_{1,1} & a_{2,1} \\ a_{1,2} & a_{2,2} \\ a_{1,3} & a_{2,3} \\ \end{array} \right]$
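As a quick check of the rule $$b_{i,j}=a_{j,i}$$, the sketch below uses a small matrix X that we introduce only for this illustration (it is not one of the matrices defined in the text):

```r
# X is a hypothetical 2x3 matrix used only for this illustration
X <- matrix(1:6, nrow=2, byrow=TRUE)
B <- t(X)
dim(X)
## [1] 2 3
dim(B)
## [1] 3 2
B[3, 1] == X[1, 3]  # b_{3,1} is the same as a_{1,3}
## [1] TRUE
identical(t(B), X)  # transposing twice restores the original matrix
## [1] TRUE
```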

### C.2.2 Matrix-Scalar Operations

Operations such as $$s\mathbf{A}$$ (multiplication of a matrix by a scalar), $$-\mathbf{A}$$, $$s+\mathbf{A}$$, etc. are applied to each element of the input matrix:

(A <- matrix(c(1, 2, 3, 4, 5, 6), byrow=TRUE, nrow=2))
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
(-1)*A
##      [,1] [,2] [,3]
## [1,]   -1   -2   -3
## [2,]   -4   -5   -6

In R, the same rule holds when we compute other operations (despite the fact that, mathematically, e.g., $$\mathbf{A}^2$$ or $$\mathbf{A}\ge 0$$ might have a different meaning):

A^2 # this is not A-matrix-multiply-A, see below
##      [,1] [,2] [,3]
## [1,]    1    4    9
## [2,]   16   25   36
A>=3
##       [,1]  [,2] [,3]
## [1,] FALSE FALSE TRUE
## [2,]  TRUE  TRUE TRUE
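The remaining operations mentioned above, $$s+\mathbf{A}$$ and $$s\mathbf{A}$$ for a scalar other than $$-1$$, behave in the same way. A minimal illustration, with the scalars 10 and 2 chosen arbitrarily:

```r
A <- matrix(c(1, 2, 3, 4, 5, 6), byrow=TRUE, nrow=2)
10+A # the scalar is added to every element
##      [,1] [,2] [,3]
## [1,]   11   12   13
## [2,]   14   15   16
2*A # every element is doubled
##      [,1] [,2] [,3]
## [1,]    2    4    6
## [2,]    8   10   12
```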

### C.2.3 Matrix-Matrix Operations

If $$\mathbf{A},\mathbf{B}\in\mathbb{R}^{n\times p}$$ are two matrices of identical sizes, then $$\mathbf{A}+\mathbf{B}$$ and $$\mathbf{A}-\mathbf{B}$$ are understood elementwise, i.e., they result in $$\mathbf{C}\in\mathbb{R}^{n\times p}$$ such that $$c_{i,j}=a_{i,j}\pm b_{i,j}$$.

A-A
##      [,1] [,2] [,3]
## [1,]    0    0    0
## [2,]    0    0    0

In R (but not when we use mathematical notation), all other arithmetic, logical and comparison operators are also applied in an elementwise fashion.

A*A
##      [,1] [,2] [,3]
## [1,]    1    4    9
## [2,]   16   25   36
(A>2) & (A<=5)
##       [,1]  [,2]  [,3]
## [1,] FALSE FALSE  TRUE
## [2,]  TRUE  TRUE FALSE

### C.2.4 Matrix Multiplication (*)

Mathematically, $$\mathbf{A}\mathbf{B}$$ denotes matrix multiplication. It is a very different operation from elementwise multiplication. In R, it is performed with the %*% operator:

(A <- rbind(c(1, 2), c(3, 4)))
##      [,1] [,2]
## [1,]    1    2
## [2,]    3    4
(I <- rbind(c(1, 0), c(0, 1)))
##      [,1] [,2]
## [1,]    1    0
## [2,]    0    1
A %*% I # matrix multiplication
##      [,1] [,2]
## [1,]    1    2
## [2,]    3    4

This is not the same as the elementwise A*I.
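To see the difference, we can compute the elementwise product of the same two matrices. The block below redefines A and I exactly as above so that it can be run on its own:

```r
A <- rbind(c(1, 2), c(3, 4))
I <- rbind(c(1, 0), c(0, 1))
A*I # elementwise: the off-diagonal entries are multiplied by 0
##      [,1] [,2]
## [1,]    1    0
## [2,]    0    4
all(A %*% I == A) # whereas matrix multiplication by I leaves A unchanged
## [1] TRUE
```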

Matrix multiplication can only be performed on two matrices of compatible sizes: the number of columns in the left operand must match the number of rows in the right operand.

Given $$\mathbf{A}\in\mathbb{R}^{n\times p}$$ and $$\mathbf{B}\in\mathbb{R}^{p\times m}$$, their product is a matrix $$\mathbf{C}=\mathbf{A}\mathbf{B}\in\mathbb{R}^{n\times m}$$ such that $$c_{i,j}$$ is the dot product of the $$i$$-th row in $$\mathbf{A}$$ and the $$j$$-th column in $$\mathbf{B}$$: $c_{i,j} = \mathbf{a}_{i,\cdot} \cdot \mathbf{b}_{\cdot,j} = \sum_{k=1}^p a_{i,k} b_{k, j}$ for $$i=1,\dots,n$$ and $$j=1,\dots,m$$.
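The definition can be transcribed into R almost literally. The function below, matmul_naive(), is a hypothetical helper written only for illustration (in practice, %*% should be used), and the two input matrices are arbitrary:

```r
# matmul_naive is a hypothetical helper, not a built-in R function
matmul_naive <- function(A, B) {
    stopifnot(ncol(A) == nrow(B)) # sizes must be compatible
    n <- nrow(A)
    m <- ncol(B)
    C <- matrix(0, nrow=n, ncol=m)
    for (i in seq_len(n))
        for (j in seq_len(m))
            C[i, j] <- sum(A[i, ]*B[, j]) # dot product of row i and column j
    C
}

A <- rbind(c(1, 2, 3), c(4, 5, 6))     # 2x3
B <- rbind(c(1, 0), c(0, 1), c(2, -1)) # 3x2
matmul_naive(A, B)
##      [,1] [,2]
## [1,]    7   -1
## [2,]   16   -1
all.equal(matmul_naive(A, B), A %*% B)
## [1] TRUE
```

Calling matmul_naive(A, A) fails the size check, because a $$2\times 3$$ matrix cannot be multiplied by another $$2\times 3$$ matrix.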

As an exercise, we recommend multiplying a few simple matrices of sizes $$2\times 2$$, $$2\times 3$$, $$3\times 2$$ etc. using pen and paper and checking the results in R.

Also remember that, mathematically, squaring a matrix is done in terms of matrix multiplication, i.e., $$\mathbf{A}^2 = \mathbf{A}\mathbf{A}$$. It can only be performed on square matrices, i.e., ones with the same number of rows and columns. This is again different from R’s elementwise A^2.
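The two notions of squaring can be compared side by side. The block redefines the $$2\times 2$$ matrix A from above so that it is self-contained:

```r
A <- rbind(c(1, 2), c(3, 4))
A %*% A # the mathematical A^2
##      [,1] [,2]
## [1,]    7   10
## [2,]   15   22
A^2 # R's elementwise square
##      [,1] [,2]
## [1,]    1    4
## [2,]    9   16
```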

Note that $$\mathbf{A}^T \mathbf{A}$$ gives the matrix that consists of the dot products of all the pairs of columns in $$\mathbf{A}$$.

crossprod(A) # same as t(A) %*% A
##      [,1] [,2]
## [1,]   10   14
## [2,]   14   20
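The entries of this result can be confirmed by computing the dot products of the columns directly (again with A redefined so that the block runs on its own):

```r
A <- rbind(c(1, 2), c(3, 4))
sum(A[, 1]*A[, 1]) # column 1 with itself: element [1, 1]
## [1] 10
sum(A[, 1]*A[, 2]) # column 1 with column 2: elements [1, 2] and [2, 1]
## [1] 14
sum(A[, 2]*A[, 2]) # column 2 with itself: element [2, 2]
## [1] 20
```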

In one of the chapters on Regression, we note that the Pearson linear correlation coefficient can be beautifully expressed this way.
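We do not reproduce the derivation from the Regression chapter here. As a rough sketch of one standard formulation, assuming that each column is first standardised with the sample standard deviation (denominator $$n-1$$), the matrix of pairwise correlations can be obtained with crossprod(). The data matrix X below is randomly generated and purely hypothetical:

```r
set.seed(123)
X <- cbind(rnorm(10), rnorm(10), rnorm(10)) # hypothetical data: 10 rows, 3 columns
n <- nrow(X)
Z <- scale(X) # subtract column means, then divide by column sd()
R <- crossprod(Z)/(n-1) # dot products of all pairs of standardised columns
all.equal(R, cor(X), check.attributes=FALSE)
## [1] TRUE
```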

### C.2.5 Matrix-Vector Operations

Mathematically, there is no generally agreed upon convention defining arithmetic operations between matrices and vectors.

(*) The only exception is matrix-vector multiplication in the case where an argument is a column or a row vector, i.e., in fact, a matrix. Hence, given $$\mathbf{A}\in\mathbb{R}^{n\times p}$$ we may write $$\mathbf{A}\mathbf{x}$$ only if $$\mathbf{x}\in\mathbb{R}^{p\times 1}$$ is a column vector. Similarly, $$\mathbf{y}\mathbf{A}$$ only makes sense whenever $$\mathbf{y}\in\mathbb{R}^{1\times n}$$ is a row vector.
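In R, column and row vectors can be created with cbind() and rbind(), respectively, and then passed to %*%. A small sketch with arbitrarily chosen values:

```r
A <- rbind(c(1, 2, 3), c(4, 5, 6)) # 2x3
x <- cbind(c(1, 0, -1)) # 3x1 column vector
A %*% x # 2x1 result
##      [,1]
## [1,]   -2
## [2,]   -2
y <- rbind(c(1, 1)) # 1x2 row vector
y %*% A # 1x3 result
##      [,1] [,2] [,3]
## [1,]    5    7    9
```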

Note that we consistently discriminate between different bold math fonts and letter cases: $$\mathbf{X}$$ is a matrix, $$\mathbf{x}$$ is a row or column vector (still a matrix, but a sequence-like one) and $$\boldsymbol{x}$$ is an ordinary vector (one-dimensional sequence).

However, in R, we might sometimes wish to vectorise an arithmetic operation between a matrix and a vector in a row- or column-wise fashion. For example, if $$\mathbf{A}\in\mathbb{R}^{n\times p}$$ is a matrix and $$\mathbf{m}\in\mathbb{R}^{1\times p}$$ is a row vector, we might want to subtract $$m_i$$ from each element in the $$i$$-th column. Here, the apply() function comes in handy:

• apply(A, 1, f) applies a given function $$f$$ on each row of $$\mathbf{A}$$.
• apply(A, 2, f) applies a given function $$f$$ on each column of $$\mathbf{A}$$.

Usually, $$f$$ either returns a single value (when we wish to aggregate all the elements in a row/column) or returns as many values as it was given (when we wish to transform a row/column).
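Both use cases can be sketched on a small matrix. The functions sum(), max() and rev() are chosen arbitrarily as examples of aggregating and transforming functions:

```r
A <- cbind(c(1, 2), c(2, 4), c(5, 8))
apply(A, 1, sum) # aggregate: one value per row
## [1]  8 14
apply(A, 2, max) # aggregate: one value per column
## [1] 2 4 8
apply(A, 2, rev) # transform: each column is reversed
##      [,1] [,2] [,3]
## [1,]    2    4    8
## [2,]    1    2    5
dim(apply(A, 1, rev)) # transformed rows are returned as columns
## [1] 3 2
```

When a transforming function is applied to rows, the result therefore has to be transposed to recover the original layout.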

Example: to create a centred version of a given matrix, we need to subtract from each element the arithmetic mean of its column.

(A <- cbind(c(1, 2), c(2, 4), c(5, 8)))
##      [,1] [,2] [,3]
## [1,]    1    2    5
## [2,]    2    4    8
(m <- apply(A, 2, mean)) # same as colMeans(A)
## [1] 1.5 3.0 6.5
t(apply(A, 1, function(r) r-m)) # note the transpose here
##      [,1] [,2] [,3]
## [1,] -0.5   -1 -1.5
## [2,]  0.5    1  1.5

The above is equivalent to:

apply(A, 2, function(x) x-mean(x)) # x is the current column
##      [,1] [,2] [,3]
## [1,] -0.5   -1 -1.5
## [2,]  0.5    1  1.5
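Base R offers a few other ways to obtain the same result. The sketch below uses sweep(), whose default operation is subtraction, and a double transpose that relies on R's recycling rule. Which variant reads best is a matter of taste.

```r
A <- cbind(c(1, 2), c(2, 4), c(5, 8))
m <- colMeans(A)
sweep(A, 2, m) # subtract m[j] from column j; FUN defaults to "-"
##      [,1] [,2] [,3]
## [1,] -0.5   -1 -1.5
## [2,]  0.5    1  1.5
all.equal(t(t(A)-m), sweep(A, 2, m)) # recycling m along the rows of t(A)
## [1] TRUE
```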