2.1 Introduction

2.1.1 Formalism


Let \(\mathbf{X}\in\mathbb{R}^{n\times p}\) be an input matrix that consists of \(n\) points in a \(p\)-dimensional space.

In other words, we have a database on \(n\) objects, each of which being described by means of \(p\) numerical features.

\[ \mathbf{X}= \left[ \begin{array}{cccc} x_{1,1} & x_{1,2} & \cdots & x_{1,p} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n,1} & x_{n,2} & \cdots & x_{n,p} \\ \end{array} \right] \]


Recall that in supervised learning, apart from \(\mathbf{X}\), we are also given the corresponding \(\mathbf{y}\); with each input point \(\mathbf{x}_{i,\cdot}\) we associate the desired output \(y_i\).

In this chapter we are still interested in regression tasks; hence, we assume that each \(y_i\) it is a real number, i.e., \(y_i\in\mathbb{R}\).

Hence, our dataset is \([\mathbf{X}\ \mathbf{y}]\) – where each object is represented as a row vector \([\mathbf{x}_{i,\cdot}\ y_i]\), \(i=1,\dots,n\):

\[ [\mathbf{X}\ \mathbf{y}]= \left[ \begin{array}{ccccc} x_{1,1} & x_{1,2} & \cdots & x_{1,p} & y_1\\ x_{2,1} & x_{2,2} & \cdots & x_{2,p} & y_2\\ \vdots & \vdots & \ddots & \vdots & \vdots\\ x_{n,1} & x_{n,2} & \cdots & x_{n,p} & y_n\\ \end{array} \right]. \]

2.1.2 Simple Linear Regression - Recap


In a simple regression task, we have assumed that \(p=1\) – there is only one independent variable, denoted \(x_i=x_{i,1}\).

We restricted ourselves to linear models of the form \(Y=f(X)=aX+b\) that minimised the sum of squared residuals (SSR), i.e.,

\[ \min_{a,b\in\mathbb{R}} \sum_{i=1}^n \left( ax_i+b-y_i \right)^2. \]

The solution is:

\[ \left\{ \begin{array}{rl} a^* = & \dfrac{ n \displaystyle\sum_{i=1}^n x_i y_i - \displaystyle\sum_{i=1}^n y_i \displaystyle\sum_{i=1}^n x_i }{ n \displaystyle\sum_{i=1}^n x_i x_i - \displaystyle\sum_{i=1}^n x_i\displaystyle\sum_{i=1}^n x_i }\\ b^* = & \dfrac{1}{n}\displaystyle\sum_{i=1}^n y_i - a^* \dfrac{1}{n} \displaystyle\sum_{i=1}^n x_i \\ \end{array} \right. \]


Fitting in R:

## 
## Call:
## lm(formula = Y ~ X)
## 
## Coefficients:
## (Intercept)            X  
##    226.4711       0.2661

plot of chunk unnamed-chunk-3