Performance of recommender algorithms on top-N recommendation tasks
2022-06-22 13:37:00 【A hundred years of literature have been written on the left sid】
Paper interpretation: Performance of Recommender Algorithms on Top-N Recommendation Tasks
Brief introduction
This post discusses a RecSys 2010 conference paper co-authored by Koren. Its core ideas are:
- It shows that there is no strong correlation between error metrics and accuracy metrics: algorithms whose objective is to minimize RMSE do not necessarily perform well on accuracy metrics such as precision and recall.
- It suggests removing the most popular items from the test set; otherwise the accuracy metrics are biased toward non-personalized methods.
In commercial systems, "guess what you like" recommendations are common. These are typically treated as top-N recommendation tasks: find a small number of items that interest the user. Error-based metrics such as RMSE are not well suited to evaluating top-N recommendation; in contrast, accuracy metrics such as precision and recall directly measure top-N recommendation quality.
The paper evaluates a large number of state-of-the-art recommendation algorithms. It finds that algorithms optimized to minimize RMSE perform worse than expected on top-N recommendation tasks: improvements in RMSE often do not translate into improvements in accuracy. Even a simple non-personalized algorithm can outperform some common methods, nearly matching the precision of sophisticated algorithms. Another finding is that a small number of very popular items dominate the accuracy of top-N recommendation. Therefore, when judging a recommender system on a top-N task, the test set should be chosen carefully to avoid biasing accuracy metrics toward non-personalized solutions. Finally, the paper proposes two new collaborative filtering algorithms, independent of RMSE, that significantly outperform other algorithms on top-N recommendation and have additional practical advantages.
1 Introduction
A non-personalized, popularity-based algorithm can only produce trivial recommendations. On one hand, such recommendations bore and disappoint users; on the other hand, content providers invest in recommender systems to boost sales of lesser-known items, so trivial recommendations are of little commercial interest. For this reason, the paper runs a further series of experiments that evaluate the algorithms after removing the most popular items. As expected, accuracy drops for all algorithms, since recommending successfully is harder without popular items. However, the resulting ranking of the algorithms matches expectations better: the non-personalized methods fall in the ranking. Therefore, when judging a recommender system on a top-N task, the test set should be chosen carefully to avoid serious bias in the accuracy metrics.
2 Testing methodology
The testing methodology used in the paper differs from the commonly used approach.
First, train the model on the training set M. Then, for each item i rated by user u in the test set T, proceed as follows:
- Randomly select 1000 additional items that user u has not rated (ratings by u on these 1000 items appear in neither the training set nor the test set). Assume the user is uninterested in most of these 1000 items (a reasonable assumption, given that the user did not rate them).
- Predict scores for these 1000 items and for item i.
- Rank the 1001 predicted scores from high to low, and let p denote the rank of item i among the 1001 items; ideally p is 1.
- Take the top N items to form a top-N recommendation list. If $p \leq N$, the recommendation counts as a hit.
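The protocol above can be sketched in a few lines (a minimal illustration, not the paper's code; `predict` is a hypothetical scoring function):

```python
import random

def hit_for_test_pair(predict, user, test_item, unrated_items, N=10, sample_size=1000):
    """One step of the paper's protocol: rank the held-out test item
    among `sample_size` randomly sampled items the user never rated."""
    sample = random.sample(unrated_items, min(sample_size, len(unrated_items)))
    candidates = sample + [test_item]
    # Rank the 1001 candidates by predicted score, highest first.
    ranked = sorted(candidates, key=lambda i: predict(user, i), reverse=True)
    p = ranked.index(test_item) + 1  # rank of the test item (1 is best)
    return p <= N                    # hit if it lands in the top-N
```

Summing the hits over all test ratings gives the `#hits` count used by the metrics below.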
Also note that recall as defined in the paper differs from the usual definition:

$$recall(N)=\frac{\#hits}{\vert T\vert}$$

$$precision(N)=\frac{\#hits}{N \cdot \vert T \vert}=\frac{recall(N)}{N}$$
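With hits accumulated over the whole test set, both metrics are one division each (a small helper matching the formulas above):

```python
def topn_metrics(num_hits, test_set_size, N):
    """recall(N) and precision(N) as defined in the paper."""
    recall = num_hits / test_set_size            # #hits / |T|
    precision = num_hits / (N * test_set_size)   # equals recall / N
    return recall, precision
```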
2.1 Popular items vs long-tail items
Most ratings concentrate on a small number of the most popular items. In the Netflix dataset, for example, 33% of the ratings cover only 1.7% of the items; similarly, in the MovieLens dataset, 33% of the ratings concentrate on 5.5% of the items.
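This skew is easy to verify on any rating dataset (a sketch under the assumption that `ratings_per_item` holds the rating count of each item):

```python
import numpy as np

def popular_share(ratings_per_item, rating_fraction=0.33):
    """Fraction of items (counting from the most popular down) needed
    to account for `rating_fraction` of all ratings."""
    counts = np.sort(np.asarray(ratings_per_item, dtype=float))[::-1]
    cum = np.cumsum(counts) / counts.sum()
    k = int(np.searchsorted(cum, rating_fraction)) + 1  # top-k items cover the fraction
    return k / len(counts)
```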
3 Collaborative filtering algorithms
Two families of collaborative filtering methods are considered: neighborhood-based methods and latent factor methods.
3.1 Non-personalized methods
The non-personalized baselines used in the paper are:
- MovieAvg: recommend the N items with the highest average rating
- TopPop: recommend the N most popular items, i.e. those with the most ratings
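Both baselines reduce to a single pass over the ratings (a minimal sketch on hypothetical (user, item, rating) triples):

```python
from collections import defaultdict

def non_personalized_topn(ratings, N):
    """MovieAvg (highest mean rating) and TopPop (most ratings) lists."""
    totals, counts = defaultdict(float), defaultdict(int)
    for _, item, r in ratings:
        totals[item] += r
        counts[item] += 1
    movie_avg = sorted(totals, key=lambda i: totals[i] / counts[i], reverse=True)[:N]
    top_pop = sorted(counts, key=counts.get, reverse=True)[:N]
    return movie_avg, top_pop
```

Note that both lists are identical for every user, which is exactly why their strong accuracy scores in the experiments are suspicious.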
3.2 Neighborhood models
The neighborhood model in the paper is CorNgbr (Correlation Neighborhood).
It relies on item-item similarity. To handle sparsity, a shrinkage factor $\lambda_1$ (set to 100) is introduced, attenuating the similarity by the coefficient $\frac{n_{ij}}{n_{ij}+\lambda_1}$ to yield the shrunk similarity $d_{ij}$, where $n_{ij}$ is the number of users who rated both item i and item j. For an implementation of this family of algorithms, see the KNNBasic algorithm in the surprise recommendation library tutorial.

$$\hat{r}_{ui} = b_{ui} + \frac{\sum\limits_{j \in N^k_u(i)} d_{ij} \cdot (r_{uj} - b_{uj})}{\sum\limits_{j \in N^k_u(i)} d_{ij}}$$

The variant without the denominator is called NNCosNgbr; since the normalization term is dropped, this model gives higher scores to items with more neighbors.
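The prediction rule above can be sketched directly (a toy illustration with hypothetical inputs, not the full CorNgbr pipeline):

```python
def ngbr_predict(b_ui, neighbors, d, r_u, b_u, normalize=True):
    """Baseline plus a d-weighted average of the user's baseline-adjusted
    ratings over the k nearest neighbors; normalize=False mimics the
    unnormalized NNCosNgbr-style variant."""
    num = sum(d[j] * (r_u[j] - b_u[j]) for j in neighbors)
    den = sum(d[j] for j in neighbors) if normalize else 1.0
    return b_ui + num / den
```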
3.3 Latent factor models
Three models are involved:
- AsySVD comes from another paper by the same author; an interpretation will follow in a later post, so stay tuned.
- SVD++ corresponds to the SVDpp algorithm in the surprise recommendation library tutorial.
- PureSVD (a new algorithm proposed in this paper): $\hat r_{ui}=r_u \cdot Q \cdot q_i^T$. Notably, Q is not learned iteratively but obtained from the singular value decomposition $\hat R=U\cdot \Sigma \cdot Q^T$.
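PureSVD fits in a few lines with a truncated SVD (a sketch under the assumption that `R` is a dense user-item matrix with missing ratings filled by zeros, and `f` is the number of latent factors):

```python
import numpy as np

def pure_svd_scores(R, f):
    """PureSVD: rank-f SVD of R, then score every (user, item) pair
    as r_u . Q . q_i^T, where Q holds the top-f right singular vectors."""
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    Q = Vt[:f].T                 # item factor matrix, shape (n_items, f)
    return R @ Q @ Q.T           # row u, column i gives the predicted score
```

Ranking row u of the returned matrix (after masking items the user already rated) yields the top-N list; no iterative training is needed.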
4 Results
Since recall and precision as defined here are proportional, only recall results are reported; the precision results follow directly.
On the MovieLens dataset, recall from high to low ranks as: PureSVD50, NNCosNgbr, SVD++, PureSVD150, TopPop, AsySVD, CorNgbr, MovieAvg. The best algorithms in accuracy are therefore PureSVD and NNCosNgbr, neither of which is trained by minimizing RMSE.
After removing the popular items from the MovieLens dataset, the ranking becomes PureSVD150, PureSVD50, SVD++, NNCosNgbr, AsySVD, CorNgbr, MovieAvg, TopPop, which is more in line with expectations.
The results on the Netflix dataset are similar, but it is worth noting that after removing the popular items, CorNgbr performs remarkably well.
5 Discussion of PureSVD
PureSVD performs excellently and is simple: training requires only an SVD, with no iterative learning. The author also points out a drawback of the commonly used RMSE evaluation: it only considers the ratings users actually provided, which favors RMSE-optimized algorithms. The top-N evaluation is more practical, since it also considers items that have not been rated.
6 Conclusion
Besides the common RMSE evaluation, attention should also be paid to top-N accuracy metrics.