## 9.2 Collaborative Filtering

### 9.2.1 Example

In user-based collaborative filtering, we seek users with similar preference profiles/rating patters.

“User A has similar behavioural patterns as user B, so we should suggest A what B likes.”

In item-based collaborative filtering, we seek items with similar (dis)likeability structure.

“Users who (dis)liked X also (dis)liked Y”.

. Apple Banana Sushi Spinach Orange
Anne 1 5 5 1
Beth 1 1 5 5 1
John 5 5 1
Kate 1 1 5 5 1
Mark 5 5 1 1 5
Sara ? 5 ? 5

Will Sara enjoy her spinach? Will Sara enjoy her apple?

R <- matrix(
c(
1, 5, 5, 0, 1,
1, 1, 5, 5, 1,
5, 5, 0, 1, 0,
1, 1, 5, 5, 1,
5, 5, 1, 1, 5,
0, 5, 0, 0, 5
), byrow=TRUE, nrow=6, ncol=5,
dimnames=list(
c("Anne", "Beth", "John", "Kate", "Mark", "Sara"),
c("Apple", "Banana", "Sushi", "Spinach", "Orange")
)
)

R
##      Apple Banana Sushi Spinach Orange
## Anne     1      5     5       0      1
## Beth     1      1     5       5      1
## John     5      5     0       1      0
## Kate     1      1     5       5      1
## Mark     5      5     1       1      5
## Sara     0      5     0       0      5

### 9.2.2 Similarity Measures

Assuming $$\mathbf{a},\mathbf{b}\in\mathbb{N}^k$$ (in our setting, $$k\in\{n,m\}$$), let $$S$$ be the following similarity measure between two vectors of identical lengths (representing ratings):

$S(\mathbf{a},\mathbf{b}) = \frac{ \sum_{i=1}^k a_i b_i }{ \sqrt{ \sum_{i=1}^k a_i^2 } \sqrt{ \sum_{i=1}^k b_i^2 } } \ge 0$

cosim <- function(a, b) sum(a*b)/sqrt(sum(a^2)*sum(b^2))

We call it the cosine similarity.

(*) Another frequently considered similarity measure is a version of the Pearson correlation coefficient that ignores all $$0$$-valued observations, see also the use argument to the cor() function.

### 9.2.3 User-Based Collaborative Filtering

User-based approaches involve comparing each user against every other user (pairwise comparisons of the rows in $$R$$). This yields a similarity degree between the $$i$$-th and the $$j$$-th user:

$s^U_{i,j} = S(\mathbf{r}_{i,\cdot},\mathbf{r}_{j,\cdot}).$

SU <- matrix(0, nrow=nrow(R), ncol=nrow(R),
dimnames=dimnames(R)[c(1,1)]) # and empty m*m matrix
for (i in 1:nrow(R)) {
for (j in 1:nrow(R)) {
SU[i,j] <- cosim(R[i,], R[j,])
}
}

round(SU, 2)
##      Anne Beth John Kate Mark Sara
## Anne 1.00 0.61 0.58 0.61 0.63 0.59
## Beth 0.61 1.00 0.29 1.00 0.39 0.19
## John 0.58 0.29 1.00 0.29 0.81 0.50
## Kate 0.61 1.00 0.29 1.00 0.39 0.19
## Mark 0.63 0.39 0.81 0.39 1.00 0.81
## Sara 0.59 0.19 0.50 0.19 0.81 1.00

In order to obtain the previously unobserved rating $$\hat{r}_{u,i}$$ using the user-based approach, we typically look for the $$K$$ most similar users and aggregate their corresponding scores (for some $$K\ge 1$$).

More formally, let $$\{U_{v_1},\dots,U_{v_K}\}\in\mathcal{U}\setminus\{U_u\}$$ be the set of users maximising $$s^U_{u, v_1}, \dots, s^U_{u, v_K}$$ and having $$r_{v_1, i},\dots,r_{v_K, i}>0$$. Then $\hat{r}_{u,i} = \frac{1}{K} \sum_{\ell=1}^K r_{v_\ell, i}.$

The arithmetic mean can be replaced with, e.g., the more or a weighted arithmetic mean where weights are proportional to $$s^U_{u, v_\ell}$$

This is similar to the $$K$$-nearest neighbour heuristic.

K <- 2
(sim <- order(SU["Sara",],decreasing=TRUE))
##  6 5 1 3 2 4
# sim gives the indices of people in decreasing order
# of similarity to Sara:
dimnames(R)[][sim] # the corresponding names
##  "Sara" "Mark" "Anne" "John" "Beth" "Kate"
# Remove those who haven't tried Spinach yet (including Sara):
sim <- sim[ R[sim, "Spinach"]>0 ]
dimnames(R)[][sim]
##  "Mark" "John" "Beth" "Kate"
# aggregate the Spinach ratings of the K most similar people:
mean(R[sim[1:K], "Spinach"])
##  1

### 9.2.4 Item-Based Collaborative Filtering

Item-based schemes rely on pairwise comparisons between the items (columns in $$R$$). Hence, a similarity degree between the $$i$$-th and the $$j$$-th item is given by:

$s^I_{i,j} = S(\mathbf{r}_{\cdot,i},\mathbf{r}_{\cdot,j}).$

SI <- matrix(0, nrow=ncol(R), ncol=ncol(R),
dimnames=dimnames(R)[c(2,2)]) # an empty n*n matrix
for (i in 1:ncol(R)) {
for (j in 1:ncol(R)) {
SI[i,j] <- cosim(R[,i], R[,j])
}
}

round(SI, 2)
##         Apple Banana Sushi Spinach Orange
## Apple    1.00   0.78  0.32    0.38   0.53
## Banana   0.78   1.00  0.45    0.27   0.78
## Sushi    0.32   0.45  1.00    0.81   0.32
## Spinach  0.38   0.27  0.81    1.00   0.29
## Orange   0.53   0.78  0.32    0.29   1.00

In order to obtain the previously unobserved rating $$\hat{r}_{u,i}$$ using the item-based approach, we typically look for the $$K$$ most similar items and aggregate their corresponding scores (for some $$K\ge 1$$)

More formally, let $$\{I_{j_1},\dots,I_{j_K}\}\in\mathcal{I}\setminus\{I_i\}$$ be the set of items maximising $$s^I_{i, j_1}, \dots, s^I_{i, j_K}$$ and having $$r_{u, j_1},\dots,r_{u, j_K}>0$$. Then $\hat{r}_{u,i} = \frac{1}{K} \sum_{\ell=1}^K r_{u, j_\ell}.$

The arithmetic mean can be replaced with, e.g., a weighted arithmetic mean where weights are proportional to $$s^I_{i, j_\ell}$$ or the mode.

K <- 2
(sim <- order(SI["Apple",],decreasing=TRUE))
##  1 2 5 4 3
# sim gives the indices of items in decreasing order
# of similarity to Apple:
dimnames(R)[][sim] # the corresponding item types
##  "Apple"   "Banana"  "Orange"  "Spinach" "Sushi"
# Remove these which Sara haven't tried yet (e.g.,Apples):
sim <- sim[ R["Sara", sim]>0 ]
dimnames(R)[][sim]
##  "Banana" "Orange"
# aggregate Sara's ratings of the K most similar items:
mean(R["Sara", sim[1:K]])
##  5