Primary mathematical techniques and algorithms used in recommendation systems
Matrix Factorization
Matrix factorization is a collaborative filtering technique that decomposes the user-item interaction matrix into lower-dimensional user and item latent factor matrices.
The goal is to find latent factors that capture the underlying preferences of users and the characteristics of items.
Example using PyTorch:
import torch

# User-item interaction matrix (0 means unobserved; float dtype for MSE loss)
interactions = torch.tensor([[4, 0, 0, 5, 1],
                             [0, 0, 0, 4, 0],
                             [0, 0, 0, 0, 2],
                             [2, 3, 0, 0, 0],
                             [0, 0, 5, 4, 0]], dtype=torch.float)

# Number of latent factors
num_factors = 3

# Initialize user and item latent factor matrices
user_factors = torch.randn(interactions.shape[0], num_factors, requires_grad=True)
item_factors = torch.randn(interactions.shape[1], num_factors, requires_grad=True)

# Mask selecting only the observed ratings
mask = interactions > 0

# Loss function and optimizer
loss_fn = torch.nn.MSELoss()
optimizer = torch.optim.SGD([user_factors, item_factors], lr=0.01)

# Training loop: predictions must be recomputed every epoch so the loss
# reflects the updated factors
for epoch in range(100):
    optimizer.zero_grad()
    predicted_ratings = torch.matmul(user_factors, item_factors.t())
    loss = loss_fn(predicted_ratings[mask], interactions[mask])
    loss.backward()
    optimizer.step()
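Once training finishes, the learned factors can score items a user has never rated. A minimal follow-up sketch, reusing the variables from the example above (user index 0 is just an illustration):

with torch.no_grad():
    all_predictions = user_factors @ item_factors.t()
# Predicted scores for the items user 0 has not rated
unseen = interactions[0] == 0
print(all_predictions[0][unseen])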
Singular Value Decomposition (SVD)
SVD is another matrix factorization technique used in collaborative filtering.
It decomposes the user-item interaction matrix into three matrices: U (user singular vectors), Σ (singular values), and V^T (item singular vectors).
Example using PyTorch:
import torch

# User-item interaction matrix
interactions = torch.tensor([[4, 0, 0, 5, 1],
                             [0, 0, 0, 4, 0],
                             [0, 0, 0, 0, 2],
                             [2, 3, 0, 0, 0],
                             [0, 0, 5, 4, 0]], dtype=torch.float)

# Perform SVD (torch.linalg.svd returns the item singular vectors
# already transposed, i.e. V^T)
U, S, Vh = torch.linalg.svd(interactions)

# Truncate singular values and vectors
num_factors = 3
U_truncated = U[:, :num_factors]
S_truncated = S[:num_factors]
Vh_truncated = Vh[:num_factors, :]

# Reconstruct a low-rank approximation of the interaction matrix
reconstructed_interactions = U_truncated @ torch.diag(S_truncated) @ Vh_truncated
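The low-rank reconstruction can be read as predicted affinities, so recommendations come from ranking a user's unrated items. A minimal sketch reusing the variables above (user index 0 and k=2 are illustrative choices):

user_id = 0
scores = reconstructed_interactions[user_id].clone()
scores[interactions[user_id] > 0] = float('-inf')  # mask items already rated
top_items = torch.topk(scores, k=2).indices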
Deep Learning
Deep learning techniques, most commonly neural networks over learned embeddings, have gained popularity in recommendation systems. They can capture complex non-linear relationships between users and items and learn meaningful representations from raw data.
Example using PyTorch:
import torch
import torch.nn as nn

# User and item embeddings
num_users = 1000
num_items = 500
embedding_dim = 50
user_embeddings = nn.Embedding(num_users, embedding_dim)
item_embeddings = nn.Embedding(num_items, embedding_dim)

# Neural network model
class RecommenderNet(nn.Module):
    def __init__(self, user_embeddings, item_embeddings):
        super().__init__()
        self.user_embeddings = user_embeddings
        self.item_embeddings = item_embeddings
        self.fc1 = nn.Linear(embedding_dim * 2, 128)
        self.fc2 = nn.Linear(128, 1)
        self.activation = nn.ReLU()

    def forward(self, user_ids, item_ids):
        user_embedding = self.user_embeddings(user_ids)
        item_embedding = self.item_embeddings(item_ids)
        concatenated = torch.cat((user_embedding, item_embedding), dim=1)
        x = self.activation(self.fc1(concatenated))
        x = self.fc2(x)
        return x.squeeze()

model = RecommenderNet(user_embeddings, item_embeddings)

# Training data (placeholders)
user_ids = torch.tensor([...])  # User IDs
item_ids = torch.tensor([...])  # Item IDs
ratings = torch.tensor([...])   # Corresponding ratings

# Training loop
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
for epoch in range(10):
    # Forward pass
    predicted_ratings = model(user_ids, item_ids)
    loss = criterion(predicted_ratings, ratings)
    # Backward pass and optimization
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
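At inference time the trained model scores user-item pairs directly. A hedged sketch, where user ID 42 and the candidate item IDs are made-up values for illustration:

model.eval()
with torch.no_grad():
    candidate_items = torch.tensor([0, 1, 2, 3, 4])
    user = torch.full_like(candidate_items, 42)  # hypothetical user ID
    scores = model(user, candidate_items)  # one score per candidate item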
Factorization Machines (FM)
Factorization Machines are a generalisation of matrix factorization that can handle both sparse and dense feature vectors.
They model pairwise interactions between features using factorised parameters.
Example using PyTorch:
import torch
import torch.nn as nn

# Factorization Machine model. Inputs are batches of active-feature indices,
# i.e. x has shape (batch_size, num_active_features).
class FactorizationMachine(nn.Module):
    def __init__(self, num_features, embedding_dim):
        super().__init__()
        self.embedding = nn.Embedding(num_features, embedding_dim)
        # The linear term is also an embedding lookup so it accepts the same
        # sparse feature indices as the interaction term
        self.linear = nn.Embedding(num_features, 1)
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        # Pairwise interactions via the square-of-sum minus sum-of-squares identity
        square_of_sum = torch.sum(self.embedding(x), dim=1) ** 2
        sum_of_square = torch.sum(self.embedding(x) ** 2, dim=1)
        interaction_term = 0.5 * torch.sum(square_of_sum - sum_of_square, dim=1, keepdim=True)
        linear_term = torch.sum(self.linear(x), dim=1) + self.bias
        output = interaction_term + linear_term
        return output.squeeze()

num_features = 1000
embedding_dim = 50
model = FactorizationMachine(num_features, embedding_dim)

# Training data (placeholders)
x = torch.tensor([...])  # Input feature indices
y = torch.tensor([...])  # Target ratings

# Training loop
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
for epoch in range(10):
    # Forward pass
    predicted_ratings = model(x)
    loss = criterion(predicted_ratings, y)
    # Backward pass and optimization
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
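One common way to feed user-item pairs into this model is to give users and items disjoint index ranges inside num_features. The sketch below assumes users occupy indices 0-699 and items are offset by 700; these offsets are illustrative assumptions, not part of the example above:

# Assumed index layout: user IDs first, item IDs offset past them
num_users_fm = 700
user_ids = torch.tensor([3, 10, 42])
item_ids = torch.tensor([0, 5, 7])
x = torch.stack([user_ids, item_ids + num_users_fm], dim=1)  # shape (3, 2)
predicted = model(x)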
These are just a few examples of the mathematical techniques and algorithms used in recommendation systems.
Other notable approaches include:
Bayesian Personalized Ranking (BPR)
Alternating Least Squares (ALS), sketched briefly after this list
Gradient Boosting Machines (GBM)
Recurrent Neural Networks (RNNs) for sequential recommendations
Graph-based methods like GraphSAGE and LightGCN
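Of these, ALS lends itself to a compact illustration: it alternates closed-form ridge-regression solves for the user and item factors instead of using gradient descent. A minimal sketch on the small interactions matrix from the matrix factorization example; the regularisation weight reg and the iteration count are assumed hyperparameters:

import torch

interactions = torch.tensor([[4, 0, 0, 5, 1],
                             [0, 0, 0, 4, 0],
                             [0, 0, 0, 0, 2],
                             [2, 3, 0, 0, 0],
                             [0, 0, 5, 4, 0]], dtype=torch.float)
num_factors, reg = 3, 0.1
num_users, num_items = interactions.shape
user_factors = torch.randn(num_users, num_factors)
item_factors = torch.randn(num_items, num_factors)
mask = interactions > 0

for iteration in range(20):
    # Fix item factors; solve a regularised least-squares problem per user
    for u in range(num_users):
        observed = mask[u]
        if observed.any():
            A = item_factors[observed]
            b = interactions[u, observed]
            gram = A.t() @ A + reg * torch.eye(num_factors)
            user_factors[u] = torch.linalg.solve(gram, A.t() @ b)
    # Fix user factors; solve the symmetric problem per item
    for i in range(num_items):
        observed = mask[:, i]
        if observed.any():
            A = user_factors[observed]
            b = interactions[observed, i]
            gram = A.t() @ A + reg * torch.eye(num_factors)
            item_factors[i] = torch.linalg.solve(gram, A.t() @ b)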
The choice of technique depends on the specific requirements of the recommendation system, such as the type of data available, the scalability needs, and the desired level of personalisation.
Hybrid approaches that combine multiple techniques are also common to leverage the strengths of different methods.