2.1 Introduction

2.1.1 Formalism

Let $$\mathbf{X}\in\mathbb{R}^{n\times p}$$ be an input matrix that consists of $$n$$ points in a $$p$$-dimensional space.

In other words, we have a database of $$n$$ objects, each described by $$p$$ numerical features.

$\mathbf{X}= \left[ \begin{array}{cccc} x_{1,1} & x_{1,2} & \cdots & x_{1,p} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n,1} & x_{n,2} & \cdots & x_{n,p} \\ \end{array} \right]$

Recall that in supervised learning, apart from $$\mathbf{X}$$, we are also given the corresponding $$\mathbf{y}$$; with each input point $$\mathbf{x}_{i,\cdot}$$ we associate the desired output $$y_i$$.

In this chapter we are still interested in regression tasks; hence, we assume that each $$y_i$$ is a real number, i.e., $$y_i\in\mathbb{R}$$.

Our dataset is thus $$[\mathbf{X}\ \mathbf{y}]$$, where each object is represented as a row vector $$[\mathbf{x}_{i,\cdot}\ y_i]$$, $$i=1,\dots,n$$:

$[\mathbf{X}\ \mathbf{y}]= \left[ \begin{array}{ccccc} x_{1,1} & x_{1,2} & \cdots & x_{1,p} & y_1\\ x_{2,1} & x_{2,2} & \cdots & x_{2,p} & y_2\\ \vdots & \vdots & \ddots & \vdots & \vdots\\ x_{n,1} & x_{n,2} & \cdots & x_{n,p} & y_n\\ \end{array} \right].$
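As a minimal sketch of this layout in R, a dataset with $$n=4$$ objects and $$p=2$$ features could be stored as a numeric matrix, with the outputs kept in a separate vector. The numbers below are made up purely for illustration, and the variable names (`X`, `y`, `Xy`) are our own choice:

```r
# Toy data, invented for illustration only: n=4 objects, p=2 features
X <- matrix(c(
    1.0, 2.0,
    3.0, 0.5,
    2.5, 1.5,
    4.0, 3.0
), ncol=2, byrow=TRUE)
y <- c(10.0, 12.5, 11.0, 15.0) # desired outputs, one per row of X

n <- nrow(X)      # number of objects
p <- ncol(X)      # number of features
Xy <- cbind(X, y) # the combined dataset [X y], an n-by-(p+1) matrix
Xy[2, ]           # the 2nd object: its p features followed by y_2
```

Each row of `Xy` corresponds to one row vector $$[\mathbf{x}_{i,\cdot}\ y_i]$$.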

2.1.2 Simple Linear Regression - Recap

In a simple regression task, we assumed that $$p=1$$, i.e., that there is only one independent variable, denoted $$x_i=x_{i,1}$$.

We restricted ourselves to linear models of the form $$Y=f(X)=aX+b$$ and sought the parameters $$a, b$$ that minimise the sum of squared residuals (SSR), i.e., we solved:

$\min_{a,b\in\mathbb{R}} \sum_{i=1}^n \left( ax_i+b-y_i \right)^2.$
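To make the objective concrete, the sketch below evaluates the SSR for a few candidate lines and then hands it to a general-purpose numerical optimiser. The data are invented for illustration, the helper `ssr` is our own, and the use of `optim` with the BFGS method is just one possible way of searching for the minimum:

```r
# Toy data, invented for illustration only
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Sum of squared residuals of the model y = a*x + b (hypothetical helper)
ssr <- function(a, b) sum((a*x + b - y)^2)

ssr(2, 0) # SSR of the candidate line y = 2x
ssr(1, 1) # SSR of the candidate line y = x + 1, a visibly worse fit

# Generic numerical minimisation, starting from a=0, b=0
res <- optim(c(0, 0), function(ab) ssr(ab[1], ab[2]), method="BFGS")
res$par   # approximate minimiser (a, b)
res$value # SSR attained at that point
```

In this particular case a generic optimiser is unnecessary, because the problem can be solved exactly.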

The solution is:

$\left\{ \begin{array}{rl} a^* = & \dfrac{ n \displaystyle\sum_{i=1}^n x_i y_i - \displaystyle\sum_{i=1}^n y_i \displaystyle\sum_{i=1}^n x_i }{ n \displaystyle\sum_{i=1}^n x_i x_i - \displaystyle\sum_{i=1}^n x_i\displaystyle\sum_{i=1}^n x_i }\\ b^* = & \dfrac{1}{n}\displaystyle\sum_{i=1}^n y_i - a^* \dfrac{1}{n} \displaystyle\sum_{i=1}^n x_i \\ \end{array} \right.$
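The formulas above translate directly into R. The following sketch, again on made-up toy data and with variable names of our own choosing, computes $$a^*$$ and $$b^*$$ by hand and cross-checks them against the built-in `lm` function:

```r
# Toy data, invented for illustration only
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)
n <- length(x)

# Slope and intercept from the closed-form solution
a <- (n*sum(x*y) - sum(y)*sum(x)) / (n*sum(x*x) - sum(x)*sum(x))
b <- mean(y) - a*mean(x)
c(a, b)

# Cross-check against R's built-in least squares fit
f <- lm(y~x)
coef(f) # (Intercept) corresponds to b, the x coefficient to a
all.equal(unname(coef(f)), c(b, a))
```

Note that `coef` reports the intercept first, hence the order `c(b, a)` in the comparison.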

Fitting in R:

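library("ISLR") # provides the Credit dataset
# Consider only the customers with a positive balance
X <- as.numeric(Credit$Balance[Credit$Balance>0]) # independent variable
Y <- as.numeric(Credit$Rating[Credit$Balance>0])  # dependent variable
f <- lm(Y~X) # Y~X is a formula, read: Y is a function of X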
print(f)
##
## Call:
## lm(formula = Y ~ X)
##
## Coefficients:
## (Intercept)            X
##    226.4711       0.2661

plot(X, Y, col="#000000aa", las=1) # scatter plot, semi-transparent points
abline(f, col=2, lwd=3) # add the fitted regression line