
Performance of recommender algorithms on top-N recommendation tasks

2022-06-22 13:37:00

Brief introduction

This article covers a paper by Koren et al. presented at the RecSys 2010 conference. Its core ideas are:

  1. It shows that error metrics and accuracy metrics are not strongly correlated: an algorithm whose objective is to minimize RMSE is not necessarily good on accuracy metrics such as precision and recall;
  2. It recommends removing the most popular items when building test sets, to keep the accuracy metrics from being biased toward non-personalized methods.

Commercial systems often feature a "you may also like" block. This is naturally framed as a top-N recommendation task: find a few items that interest the user. Error-based metrics such as RMSE are ill-suited to evaluating top-N recommendation; accuracy metrics such as precision and recall, by contrast, measure top-N performance directly.
The paper benchmarks a large number of state-of-the-art recommendation algorithms. It finds that algorithms optimized to minimize RMSE perform worse than expected on top-N tasks: improvements in RMSE often do not translate into improvements in accuracy. Even a simple non-personalized algorithm can outperform some common methods, nearly matching the accuracy of sophisticated ones. A second finding is that a handful of very popular items have a disproportionate effect on measured top-N accuracy. Therefore, when judging a recommender's performance on a top-N task, the test set should be chosen carefully to avoid biasing the accuracy metrics toward non-personalized solutions. Finally, the paper proposes two new, RMSE-independent collaborative filtering algorithms that significantly outperform other recommenders on top-N tasks and have further practical advantages.

1 Introduction

A non-personalized, popularity-based algorithm can only make trivial recommendations. On one hand, such recommendations bore and disappoint users and fail to hold their interest; on the other hand, content providers invest in recommender systems to lift sales of lesser-known items, so trivial recommendations hold no business value either. For this reason, the paper runs an additional series of experiments in which the most popular items are removed before evaluating each algorithm. As expected, accuracy drops for every algorithm, since recommending successfully is harder once the popular items are gone. But the resulting ranking of algorithms matches expectations better: the non-personalized methods fall in the ranking. Therefore, when judging a recommender's top-N performance, the test set should be chosen carefully to keep the accuracy metrics from being badly misleading.

2 Test methodology

The testing methodology used in the paper differs from the then-common practice.
First, a model is trained on the training set M. Then, for each item i rated by user u in the test set T, the evaluation proceeds as follows:

  1. Randomly select 1000 additional items not rated by user u (user u's ratings for these 1000 items appear in neither the training set nor the test set). Assume the user is not interested in most of these 1000 items — a reasonable assumption, since the user never rated them.
  2. Predict scores for these 1000 items plus the test item i.
  3. Sort the 1001 predicted scores from highest to lowest and let p be the rank of item i among the 1001 items; the ideal case is clearly p = 1.
  4. Form a top-N recommendation list from the N highest-scoring items. If p ≤ N, the recommendation counts as successful: one hit.
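The four evaluation steps above can be sketched in Python; the function names and toy scores below are illustrative assumptions, not code from the paper:

```python
import numpy as np

def rank_of_test_item(scores_unrated, score_test_item):
    """1-based rank of the held-out test item among itself plus the
    sampled unrated items; rank 1 is the ideal case (p = 1)."""
    return 1 + int(np.sum(np.asarray(scores_unrated) > score_test_item))

def hit_rate(ranks, N):
    """Fraction of test cases whose held-out item ranks within the
    top-N list, i.e. p <= N counts as a hit."""
    return float(np.mean(np.asarray(ranks) <= N))

# Toy run: three test cases, each ranked against 1000 sampled items.
rng = np.random.default_rng(0)
ranks = [rank_of_test_item(rng.uniform(1.0, 5.0, size=1000), s)
         for s in (4.8, 3.1, 4.2)]
print(ranks, hit_rate(ranks, N=10))
```

Because each test item competes against the same number of sampled items, hit rates are comparable across algorithms.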

Note also that recall as defined in the paper differs from the usual definition:
$$recall(N)=\frac{\#hits}{\vert T\vert}$$
$$precision(N)=\frac{\#hits}{N \cdot \vert T \vert}=\frac{recall(N)}{N}$$
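A minimal sketch of these two definitions, with hypothetical helper names:

```python
def recall_at_n(n_hits, n_test_cases):
    """recall(N) = #hits / |T| under the paper's definition."""
    return n_hits / n_test_cases

def precision_at_n(n_hits, n_test_cases, N):
    """precision(N) = #hits / (N * |T|) = recall(N) / N."""
    return n_hits / (N * n_test_cases)

print(recall_at_n(25, 100))         # 0.25
print(precision_at_n(25, 100, 10))  # 0.025
```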

2.1 Popular items vs. long-tail items

Most ratings concentrate on a small number of very popular items. In the Netflix data set, for example, 33% of the ratings cover only 1.7% of the items; likewise, in the MovieLens data set, 33% of the ratings concentrate on 5.5% of the items.
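Assuming the rating log is available as a flat list of item IDs (one entry per rating), the concentration figures above could be measured roughly as follows; the function name and toy log are assumptions for illustration:

```python
from collections import Counter

def ratings_share_of_top_items(item_ids, item_fraction):
    """Fraction of all ratings that fall on the top `item_fraction`
    most-rated items (e.g. 0.017 for the top 1.7% of items)."""
    counts = Counter(item_ids)
    n_top = max(1, int(round(item_fraction * len(counts))))
    top = sorted(counts.values(), reverse=True)[:n_top]
    return sum(top) / len(item_ids)

# Toy log: item 'a' dominates the rating counts.
log = ['a'] * 8 + ['b'] + ['c']
print(ratings_share_of_top_items(log, 1 / 3))  # top 1 of 3 items -> 0.8
```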

3 Collaborative filtering algorithms

Two families of collaborative filtering methods are considered: neighborhood-based methods and latent factor methods.

3.1 Non-personalized methods

The non-personalized baselines used in the paper are:

  1. MovieAvg: recommend the N items with the highest average rating
  2. TopPop: recommend the N most popular items, i.e. those with the most ratings
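The two baselines can be sketched as follows, assuming `ratings` maps each item to the list of ratings it received (the names and toy data are illustrative):

```python
def movie_avg_topn(ratings, N):
    """MovieAvg: rank items by their mean rating."""
    means = {item: sum(rs) / len(rs) for item, rs in ratings.items()}
    return sorted(means, key=means.get, reverse=True)[:N]

def top_pop_topn(ratings, N):
    """TopPop: rank items by popularity, i.e. rating count."""
    return sorted(ratings, key=lambda i: len(ratings[i]), reverse=True)[:N]

ratings = {
    'x': [5, 5],        # high average, few ratings
    'y': [3, 3, 3, 3],  # low average, many ratings
    'z': [4],
}
print(movie_avg_topn(ratings, 2))  # ['x', 'z']
print(top_pop_topn(ratings, 2))    # ['y', 'x']
```

Note that both methods produce the same list for every user, which is exactly why they serve as non-personalized reference points.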

3.2 Neighborhood models

The neighborhood model in the paper is CorNgbr (Correlation Neighborhood), which works with item-item similarities. To cope with sparsity, a shrinkage factor $\lambda_1$ (set to 100) is applied, giving the shrunk similarity $d_{ij}=\frac{n_{ij}}{n_{ij}+\lambda_1}\,s_{ij}$, where $n_{ij}$ is the number of users who rated both item i and item j and $s_{ij}$ is their Pearson correlation. For an implementation, see the KNNBasic algorithm in the recommendation library surprise.
$$\hat{r}_{ui} = b_{ui} + \frac{\sum_{j \in N^k_u(i)} d_{ij} \cdot (r_{uj} - b_{uj})}{\sum_{j \in N^k_u(i)} d_{ij}}$$
The variant without the denominator is called NNCosNgbr; since no normalization is applied, it gives higher scores to items with many similar neighbors.
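Assuming the baselines $b_{uj}$ and the shrunk similarities $d_{ij}$ have already been computed, the two neighborhood predictors might look like this sketch (function names are hypothetical):

```python
def corngbr_predict(baseline_ui, neighbors):
    """CorNgbr-style prediction for user u on item i.
    `neighbors` is a list of (d_ij, r_uj, b_uj) tuples over the k
    items rated by u that are most similar to i."""
    num = sum(d * (r - b) for d, r, b in neighbors)
    den = sum(d for d, _, _ in neighbors)
    return baseline_ui + (num / den if den else 0.0)

def nncosngbr_predict(baseline_ui, neighbors):
    """Non-normalized variant: dropping the denominator favors items
    that have many strong neighbors."""
    return baseline_ui + sum(d * (r - b) for d, r, b in neighbors)

nbrs = [(0.5, 4.0, 3.0), (0.25, 5.0, 3.5)]
print(corngbr_predict(3.2, nbrs))    # 3.2 + (0.5*1.0 + 0.25*1.5) / 0.75
print(nncosngbr_predict(3.2, nbrs))  # 3.2 + 0.875
```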

3.3 Latent factor models

Three models are involved:

  1. AsySVD comes from another paper by the same author; it will be covered in a later post, so stay tuned
  2. SVD++, i.e. the SVDpp algorithm in the recommendation library surprise
  3. PureSVD (a new algorithm proposed in this paper): $\hat r_{ui}=r_u \cdot Q \cdot q_i^T$. Notably, $Q$ is not learned iteratively but obtained from the singular value decomposition $\hat R=U\cdot \Sigma\cdot Q^T$
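A minimal PureSVD sketch with NumPy, treating unrated entries as zeros; the small matrix is a toy assumption, not data from the paper:

```python
import numpy as np

def pure_svd_scores(R, f):
    """PureSVD: take a rank-f truncated SVD of the user x item matrix R
    (zeros for unrated entries) and score with r_u . Q . q_i^T."""
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    Q = Vt[:f].T        # item factor matrix, shape (items, f)
    return R @ Q @ Q.T  # predicted scores for every (user, item) pair

R = np.array([[5., 4., 0., 1.],
              [4., 5., 0., 1.],
              [1., 0., 5., 4.]])
S = pure_svd_scores(R, f=2)
print(S.shape)  # (3, 4)
```

Since no iterative learning is involved, training cost is just one SVD; the factor count f (50 or 150 in the paper's experiments) is the only tuning knob.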

4 Results

Since recall and precision as defined here are proportional, the reported recall results double as precision results.
On the MovieLens data set, recall from highest to lowest is: PureSVD50, NNCosNgbr, SVD++, PureSVD150, TopPop, AsySVD, CorNgbr, and MovieAvg. Thus the most accurate algorithms, PureSVD and NNCosNgbr, are not among those trained by minimizing RMSE.
After removing the popular items from the MovieLens data set, the precision ranking becomes PureSVD150, PureSVD50, SVD++, NNCosNgbr, AsySVD, CorNgbr, MovieAvg, and TopPop, which matches expectations better.
Results on the Netflix data set are similar, with one notable exception: after the popular items are removed, CorNgbr performs remarkably well.

5 Discussion of PureSVD

PureSVD performs excellently and is simple: training amounts to a single SVD decomposition, with no iterative learning. The author also points out a drawback of the usual RMSE-based evaluation: it only considers ratings the user actually provided, which favors RMSE-optimized algorithms. The top-N methodology is closer to practice, since it also considers items that were never rated.

6 Conclusion

Beyond the common RMSE-based evaluation, attention should also be paid to top-N accuracy metrics.

Copyright notice: this article was written by [A hundred years of literature have been written on the left sid]; please include a link to the original when reposting. Thanks.
https://yzsam.com/2022/173/202206221137547661.html