
Performance of recommender algorithms on top-N recommendation tasks

2022-06-22 13:37:00

Brief introduction

This article covers a paper by Koren et al. presented at the RecSys 2010 conference. Its core ideas are:

  1. It shows that error metrics and accuracy metrics are not strongly correlated: an algorithm whose objective is to minimize RMSE is not necessarily good on accuracy metrics such as precision and recall;
  2. It recommends removing the most popular items when building test sets, to keep the accuracy metrics from being biased toward non-personalized methods.

Commercial systems often feature a "you may also like" block. This is naturally framed as a top-N recommendation task: find a few items that interest the user. Error-based metrics such as RMSE are ill-suited to evaluating top-N recommendation; accuracy metrics such as precision and recall, by contrast, measure top-N performance directly.
The paper benchmarks a large number of state-of-the-art recommendation algorithms. It finds that algorithms optimized to minimize RMSE perform worse than expected on top-N tasks: improvements in RMSE often do not translate into improvements in accuracy. Even a simple non-personalized algorithm can outperform some common methods, nearly matching the accuracy of sophisticated ones. A second finding is that a handful of very popular items have a disproportionate effect on measured top-N accuracy. Therefore, when judging a recommender's performance on a top-N task, the test set should be chosen carefully to avoid biasing the accuracy metrics toward non-personalized solutions. Finally, the paper proposes two new, RMSE-independent collaborative filtering algorithms that significantly outperform other recommenders on top-N tasks and have further practical advantages.

1 Introduction

A non-personalized, popularity-based algorithm can only make trivial recommendations. On one hand, such recommendations bore and disappoint users and fail to hold their interest; on the other hand, content providers invest in recommender systems to lift sales of lesser-known items, so trivial recommendations hold no business value either. For this reason, the paper runs an additional series of experiments in which the most popular items are removed before evaluating each algorithm. As expected, accuracy drops for every algorithm, since recommending successfully is harder once the popular items are gone. But the resulting ranking of algorithms matches expectations better: the non-personalized methods fall in the ranking. Therefore, when judging a recommender's top-N performance, the test set should be chosen carefully to keep the accuracy metrics from being badly misleading.

2 Test methodology

The testing methodology used in the paper differs from the then-common practice.
First, a model is trained on the training set M. Then, for each item i rated by user u in the test set T, the evaluation proceeds as follows:

  1. Randomly select 1000 additional items not rated by user u (user u's ratings for these 1000 items appear in neither the training set nor the test set). Assume the user is not interested in most of these 1000 items — a reasonable assumption, since the user never rated them.
  2. Predict scores for these 1000 items plus the test item i.
  3. Sort the 1001 predicted scores from highest to lowest and let p be the rank of item i among the 1001 items; the ideal case is clearly p = 1.
  4. Form a top-N recommendation list from the N highest-scoring items. If p ≤ N, the recommendation counts as successful: one hit.
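The four evaluation steps above can be sketched in Python; the function names and toy scores below are illustrative assumptions, not code from the paper:

```python
import numpy as np

def rank_of_test_item(scores_unrated, score_test_item):
    """1-based rank of the held-out test item among itself plus the
    sampled unrated items; rank 1 is the ideal case (p = 1)."""
    return 1 + int(np.sum(np.asarray(scores_unrated) > score_test_item))

def hit_rate(ranks, N):
    """Fraction of test cases whose held-out item ranks within the
    top-N list, i.e. p <= N counts as a hit."""
    return float(np.mean(np.asarray(ranks) <= N))

# Toy run: three test cases, each ranked against 1000 sampled items.
rng = np.random.default_rng(0)
ranks = [rank_of_test_item(rng.uniform(1.0, 5.0, size=1000), s)
         for s in (4.8, 3.1, 4.2)]
print(ranks, hit_rate(ranks, N=10))
```

Because each test item competes against the same number of sampled items, hit rates are comparable across algorithms.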

Note also that recall as defined in the paper differs from the usual definition:
$$recall(N)=\frac{\#hits}{\vert T\vert}$$
$$precision(N)=\frac{\#hits}{N \cdot \vert T \vert}=\frac{recall(N)}{N}$$
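A minimal sketch of these two definitions, with hypothetical helper names:

```python
def recall_at_n(n_hits, n_test_cases):
    """recall(N) = #hits / |T| under the paper's definition."""
    return n_hits / n_test_cases

def precision_at_n(n_hits, n_test_cases, N):
    """precision(N) = #hits / (N * |T|) = recall(N) / N."""
    return n_hits / (N * n_test_cases)

print(recall_at_n(25, 100))         # 0.25
print(precision_at_n(25, 100, 10))  # 0.025
```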

2.1 Popular items vs. long-tail items

Most ratings concentrate on a small number of very popular items. In the Netflix data set, for example, 33% of the ratings cover only 1.7% of the items; likewise, in the MovieLens data set, 33% of the ratings concentrate on 5.5% of the items.
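Assuming the rating log is available as a flat list of item IDs (one entry per rating), the concentration figures above could be measured roughly as follows; the function name and toy log are assumptions for illustration:

```python
from collections import Counter

def ratings_share_of_top_items(item_ids, item_fraction):
    """Fraction of all ratings that fall on the top `item_fraction`
    most-rated items (e.g. 0.017 for the top 1.7% of items)."""
    counts = Counter(item_ids)
    n_top = max(1, int(round(item_fraction * len(counts))))
    top = sorted(counts.values(), reverse=True)[:n_top]
    return sum(top) / len(item_ids)

# Toy log: item 'a' dominates the rating counts.
log = ['a'] * 8 + ['b'] + ['c']
print(ratings_share_of_top_items(log, 1 / 3))  # top 1 of 3 items -> 0.8
```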

3 Collaborative filtering algorithms

Two families of collaborative filtering methods are considered: neighborhood-based methods and latent factor methods.

3.1 Non-personalized methods

The non-personalized baselines used in the paper are:

  1. MovieAvg: recommend the N items with the highest average rating
  2. TopPop: recommend the N most popular items, i.e. those with the most ratings
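The two baselines can be sketched as follows, assuming `ratings` maps each item to the list of ratings it received (the names and toy data are illustrative):

```python
def movie_avg_topn(ratings, N):
    """MovieAvg: rank items by their mean rating."""
    means = {item: sum(rs) / len(rs) for item, rs in ratings.items()}
    return sorted(means, key=means.get, reverse=True)[:N]

def top_pop_topn(ratings, N):
    """TopPop: rank items by popularity, i.e. rating count."""
    return sorted(ratings, key=lambda i: len(ratings[i]), reverse=True)[:N]

ratings = {
    'x': [5, 5],        # high average, few ratings
    'y': [3, 3, 3, 3],  # low average, many ratings
    'z': [4],
}
print(movie_avg_topn(ratings, 2))  # ['x', 'z']
print(top_pop_topn(ratings, 2))    # ['y', 'x']
```

Note that both methods produce the same list for every user, which is exactly why they serve as non-personalized reference points.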

3.2 Neighborhood models

The neighborhood model in the paper is CorNgbr (Correlation Neighborhood), which works with item-item similarities. To cope with sparsity, a shrinkage factor $\lambda_1$ (set to 100) is applied, giving the shrunk similarity $d_{ij}=\frac{n_{ij}}{n_{ij}+\lambda_1}\,s_{ij}$, where $n_{ij}$ is the number of users who rated both item i and item j and $s_{ij}$ is their Pearson correlation. For an implementation, see the KNNBasic algorithm in the recommendation library surprise.
$$\hat{r}_{ui} = b_{ui} + \frac{\sum_{j \in N^k_u(i)} d_{ij} \cdot (r_{uj} - b_{uj})}{\sum_{j \in N^k_u(i)} d_{ij}}$$
The variant without the denominator is called NNCosNgbr; since no normalization is applied, it gives higher scores to items with many similar neighbors.
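Assuming the baselines $b_{uj}$ and the shrunk similarities $d_{ij}$ have already been computed, the two neighborhood predictors might look like this sketch (function names are hypothetical):

```python
def corngbr_predict(baseline_ui, neighbors):
    """CorNgbr-style prediction for user u on item i.
    `neighbors` is a list of (d_ij, r_uj, b_uj) tuples over the k
    items rated by u that are most similar to i."""
    num = sum(d * (r - b) for d, r, b in neighbors)
    den = sum(d for d, _, _ in neighbors)
    return baseline_ui + (num / den if den else 0.0)

def nncosngbr_predict(baseline_ui, neighbors):
    """Non-normalized variant: dropping the denominator favors items
    that have many strong neighbors."""
    return baseline_ui + sum(d * (r - b) for d, r, b in neighbors)

nbrs = [(0.5, 4.0, 3.0), (0.25, 5.0, 3.5)]
print(corngbr_predict(3.2, nbrs))    # 3.2 + (0.5*1.0 + 0.25*1.5) / 0.75
print(nncosngbr_predict(3.2, nbrs))  # 3.2 + 0.875
```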

3.3 Latent factor models

Three models are involved:

  1. AsySVD comes from another paper by the same author; it will be covered in a later post, so stay tuned
  2. SVD++, i.e. the SVDpp algorithm in the recommendation library surprise
  3. PureSVD (a new algorithm proposed in this paper): $\hat r_{ui}=r_u \cdot Q \cdot q_i^T$. Notably, $Q$ is not learned iteratively but obtained from the singular value decomposition $\hat R=U\cdot \Sigma\cdot Q^T$
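A minimal PureSVD sketch with NumPy, treating unrated entries as zeros; the small matrix is a toy assumption, not data from the paper:

```python
import numpy as np

def pure_svd_scores(R, f):
    """PureSVD: take a rank-f truncated SVD of the user x item matrix R
    (zeros for unrated entries) and score with r_u . Q . q_i^T."""
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    Q = Vt[:f].T        # item factor matrix, shape (items, f)
    return R @ Q @ Q.T  # predicted scores for every (user, item) pair

R = np.array([[5., 4., 0., 1.],
              [4., 5., 0., 1.],
              [1., 0., 5., 4.]])
S = pure_svd_scores(R, f=2)
print(S.shape)  # (3, 4)
```

Since no iterative learning is involved, training cost is just one SVD; the factor count f (50 or 150 in the paper's experiments) is the only tuning knob.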

4 Results

Since recall and precision as defined here are proportional, the reported recall results double as precision results.
On the MovieLens data set, recall from highest to lowest is: PureSVD50, NNCosNgbr, SVD++, PureSVD150, TopPop, AsySVD, CorNgbr, and MovieAvg. Thus the most accurate algorithms, PureSVD and NNCosNgbr, are not among those trained by minimizing RMSE.
After removing the popular items from the MovieLens data set, the precision ranking becomes PureSVD150, PureSVD50, SVD++, NNCosNgbr, AsySVD, CorNgbr, MovieAvg, and TopPop, which matches expectations better.
Results on the Netflix data set are similar, with one notable exception: after the popular items are removed, CorNgbr performs remarkably well.

5 Discussion of PureSVD

PureSVD performs excellently and is simple: training amounts to a single SVD decomposition, with no iterative learning. The author also points out a drawback of the usual RMSE-based evaluation: it only considers ratings the user actually provided, which favors RMSE-optimized algorithms. The top-N methodology is closer to practice, since it also considers items that were never rated.

6 Conclusion

Beyond the common RMSE-based evaluation, attention should also be paid to top-N accuracy metrics.

Copyright notice: this article was written by [A hundred years of literature have been written on the left sid]; please include a link to the original when reposting. Thanks.
https://yzsam.com/2022/173/202206221137547661.html