## 3.4 Outro

### 3.4.1 Remarks

Note that K-NN is suitable for any kind of multiclass classification: the majority vote amongst the neighbours' labels works for any number of classes.

Some algorithms we are going to discuss in the next part are restricted to binary (0/1) outputs. They will have to be extended somehow to allow for more than two classes.
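One common extension (not covered in the text itself, so treat this as a forward-looking sketch) is the *one-vs-rest* strategy: fit one binary classifier per class and predict the class whose classifier returns the highest score. Below we use logistic regression (`stats::glm`, part of base R) as the binary base learner on the built-in `iris` data; the variable names are ours.

```r
# One-vs-rest: lift a binary classifier to a multiclass problem (illustrative sketch)
X <- iris[, 1:4]       # four numeric features
y <- iris$Species      # three classes
classes <- levels(y)

# fit one "this class vs. all the rest" logistic model per class
# (glm may warn about perfectly separated classes; the fits are still usable here)
models <- lapply(classes, function(cl)
    suppressWarnings(glm((y == cl) ~ ., data = X, family = binomial())))

# a matrix of per-class scores (fitted probabilities), one column per class
scores <- sapply(models, function(m) predict(m, X, type = "response"))

# predict the class with the largest score for each observation
y_pred <- classes[max.col(scores)]
mean(y_pred == y)  # training accuracy
```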

In the next part we will try to answer the question of how to choose the best $$K$$, and hence how to evaluate and pick the best model.

We will also discuss some other noteworthy classifiers:

* Decision trees
* Logistic regression

### 3.4.2 Side Note: K-NN Regression

The K-Nearest Neighbour scheme is intuitively pleasing.

No wonder it has inspired a similar approach for solving a regression task.

In order to make a prediction for a new point $$\mathbf{x}'$$:

1. find the K-nearest neighbours of $$\mathbf{x}'$$ amongst the points in the train set, denoted $$\mathbf{x}_{i_1,\cdot}, \dots, \mathbf{x}_{i_K,\cdot}$$,
2. fetch the corresponding reference outputs $$y_{i_1}, \dots, y_{i_K}$$,
3. return their arithmetic mean as a result, $$\hat{y}=\frac{1}{K} \sum_{j=1}^K y_{i_j}$$.
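The three steps above can be sketched directly in base R (the helper name `knn_reg` and the toy data are ours, for illustration only; a ready-made implementation from the FNN package is used further on):

```r
# A minimal from-scratch K-NN regression, following steps 1-3
knn_reg <- function(X, y, x_new, K = 5) {
    # 1. Euclidean distances from x_new to every training point
    d <- sqrt(rowSums(sweep(X, 2, x_new)^2))
    # 2. indices of the K nearest neighbours
    idx <- order(d)[1:K]
    # 3. arithmetic mean of their reference outputs
    mean(y[idx])
}

# toy one-dimensional example
set.seed(123)
X <- matrix(runif(100), ncol = 1)
y <- sin(2 * pi * X[, 1]) + rnorm(100, sd = 0.1)
knn_reg(X, y, x_new = 0.5, K = 5)
```

Note that with $$K$$ equal to the number of training points, the prediction degenerates to the overall mean of the reference outputs, regardless of $$\mathbf{x}'$$.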

Recall our modelling of the Credit Rating ($$Y$$) as a function of the average Credit Card Balance ($$X$$) based on the ISLR::Credit data set.

```r
library("ISLR") # Credit dataset
library("FNN")  # knn.reg function
Xc <- as.matrix(as.numeric(Credit$Balance[Credit$Balance > 0]))
Yc <- as.matrix(as.numeric(Credit$Rating[Credit$Balance > 0]))
x <- as.matrix(seq(min(Xc), max(Xc), length.out = 101))
y1  <- knn.reg(Xc, x, Yc, k = 1)$pred
y5  <- knn.reg(Xc, x, Yc, k = 5)$pred
y25 <- knn.reg(Xc, x, Yc, k = 25)$pred
```

```r
plot(Xc, Yc, las = 1, col = "#666666c0",
     xlab = "Balance", ylab = "Rating")
lines(x, y1,  col = 2, lwd = 3)
lines(x, y5,  col = 3, lwd = 3)
lines(x, y25, col = 4, lwd = 3)
```