
[Recommender System Learning] The Technology Stack of Recommender Systems

2022-06-26 17:06:00 【CC‘s World】

A recommender system is a very large framework made up of many modules. Building a complete one involves not only recommendation algorithm engineers but also backend engineers, data mining and analysis engineers, NLP/CV engineers, and support from front-end, client, product, and operations teams. For an algorithm engineer, the technology stack to master falls mainly into two areas, algorithms and engineering, so this article covers both and walks through the current mainstream technology stack for recommendation algorithms.

1. Algorithms

Starting from the architecture of a recommender system, one common approach is to split it into recall, coarse ranking, fine ranking, re-ranking, mixed ranking, and other modules. This decomposition follows the path a piece of data takes from being produced to being served online. Because different stages usually call for different algorithms, we use this view to survey the mainstream algorithm stack of recommender systems.

1.1 Profile layer

The first layer is the material (content) library of the recommender system. Here, algorithms are mainly used to build user profiles and item profiles. This step is the foundation of the whole architecture: whenever new users or items arrive, or on a regular schedule such as a weekly refresh of the whole library, the system recomputes this information, tags users, computes statistics, performs content understanding on items, and so on. User profiles are easy to understand: for example, apps usually collect a user's age and hobbies through the registration screen. Item profiles take many forms: Taobao mainly recommends products, while TikTok mainly recommends short videos. Because the material comes in many forms and varies widely in content and quality, content profiles differ from one platform to another, and the current mainstream approach involves multimodal content understanding.

Recommender systems generally add multimodal content understanding. Take short video as an example. Suppose a user shoots a short video and uploads it to the platform. From the recommendation side, we first have the video's author, its length, the tags the author chose for it, and its timestamp. But this is not enough for recommendation. First, the tags chosen by the author may not accurately reflect the work, possibly because the semantic space of our model is inconsistent with that of the author or the real world. Second, we need features along more dimensions. For example, some users like to watch videos of young women dancing, so we want to determine whether such a person appears in a video, which requires CV-based content extraction from the cover image or from the entire video. Likewise, the title of a work generally reflects its topic, so beyond the "#" hashtags common on many platforms, we also want to extract information from the title with NLP. There are more dimensions to consider: a cover image alone supports a multidimensional multimedia feature system that includes face recognition, face embeddings, tags, first- and second-level categories, video embedding representations, watermarks, OCR, clarity, vulgar or pornographic content, sensitive information, and other dimensions.

The main tasks involved are object detection and semantic segmentation in CV, and sentiment analysis, summarization, and natural language understanding in NLP. This work is usually handled by a dedicated group within the algorithm team, so recommendation algorithm engineers do not need to own it. That group outputs multimodal semantic tags, mainly in the form of embeddings of various sizes, and we only need to feed these pretrained embeddings into our recommendation model.

1.1.1 Text understanding

Text is probably the most widely used modality. It includes an item's title, body text, OCR output, comments, and other data. Information can be produced at different granularities; text classification, for example, assigns a coarse-grained category to the whole item.

Typical algorithms here include RNN, TextCNN, FastText, and BERT.

1.1.2 Keyword tags

Compared with text classification, keywords are finer-grained information, often in multi-hot form: for each item, the most appropriate keywords or tags are selected from our tag library.

Typical algorithms here include TF-IDF, BERT, and LSTM-CRF.
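As a rough illustration of keyword tagging, the sketch below scores the terms of each tokenized item with a basic TF-IDF formula and encodes the selected keywords as a multi-hot vector over a tag library. The toy documents, the tag library, and the function names `tfidf_keywords` and `multi_hot` are assumptions made for this example, not part of any particular system.

```python
import math
from collections import Counter

def tfidf_keywords(docs, top_k=2):
    """Return the top_k highest TF-IDF terms for each tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    results = []
    for doc in docs:
        tf = Counter(doc)
        scores = {
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Highest score first; ties are broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:top_k])
    return results

def multi_hot(tags, tag_library):
    """Encode a list of tags as a multi-hot vector over a fixed tag library."""
    tag_set = set(tags)
    return [1 if tag in tag_set else 0 for tag in tag_library]

# Hypothetical tokenized item titles.
docs = [
    ["dance", "music", "dance", "video"],
    ["football", "goal", "video"],
    ["music", "concert", "video"],
]
tag_library = ["concert", "dance", "football", "goal", "music", "video"]

keywords = tfidf_keywords(docs, top_k=2)
vectors = [multi_hot(k, tag_library) for k in keywords]
```

The word "video" appears in every document, so its inverse document frequency is zero and it is never chosen as a keyword. A production system would more likely use a trained tagger such as the BERT or LSTM-CRF models listed above.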

1.1.3 Content understanding

In many scenarios the recommended items are videos or images rather than plain text. Beyond the accompanying text, most of the information in a video or image item comes from the visual content itself, so we need to extract richer information from multiple modalities. This mainly includes classification information, OCR information from the cover image, video tag information, and so on.

Typical algorithms here include TSN, RetinaFace, and PSENet.

1.1.4 Knowledge graph

A knowledge graph is a system for carrying knowledge: it links internal and external keyword information with the relationships between terms. Content profiles integrate this existing relationship information to build a relational knowledge system that the business can use. In addition, the entity-relationship data accumulated from user behavior, together with the tag information users need, is used to build a business-specific interest graph. Core models such as representation learning on homogeneous and heterogeneous networks then output knowledge representations. The resulting graph is used for text recognition, semantic understanding for recommendation, interest expansion and reasoning, and similar scenarios; applying interest reasoning directly to cold start has been shown to bring good gains.

Algorithms in this area include KGAT and RippleNet.

1.2 Recall / Coarse ranking

The recall stage can be understood as using users' historical behavior data to roughly select, from a massive pool of content, a small candidate set to recommend to each user. Many techniques used in coarse ranking overlap with recall, so the two are discussed together. Coarse ranking is optional: its job is to roughly rank the recall results and, while maintaining a certain level of accuracy, further reduce the number of items passed on to the next stage.

Recall is usually multi-channel. Model-based recall offers many algorithms and generally accounts for the majority of channels in the recall layer. There are also exploratory recall, strategy- or operations-driven recall, social recall, and so on. Below we focus on model-based recall.

1.2.1 Classic model recall

As technology has developed, modeling recall on top of embeddings has become the trend. In this paradigm, some algorithm produces separate embeddings for users and items, and an online KNN search between the user embedding and the item embeddings returns the nearest neighbors in real time as the recall result, so that matching items are found quickly. Note that if recall is model-based, its optimization objective should be consistent with that of the ranking stage; otherwise the recalled items may be filtered out during ranking.

Typical algorithms in this area include FM, two-tower DSSM, and Multi-View DNN.
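The embedding recall paradigm described above can be sketched as a brute-force top-k search by inner product. This is a minimal sketch that assumes user and item embeddings have already been trained; the item IDs, vectors, and the function name `recall_top_k` are made up for illustration. At production scale the exhaustive scan would normally be replaced by an approximate nearest-neighbor index.

```python
import heapq

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def recall_top_k(user_vec, item_vecs, k):
    """Return the k item IDs whose embeddings score highest against the user."""
    scored = ((dot(user_vec, vec), item_id) for item_id, vec in item_vecs.items())
    # nlargest returns (score, item_id) pairs in descending score order.
    return [item_id for _, item_id in heapq.nlargest(k, scored)]

# Hypothetical pretrained embeddings.
item_vecs = {
    "item_a": [0.9, 0.1],
    "item_b": [0.1, 0.9],
    "item_c": [0.7, 0.3],
    "item_d": [-0.5, 0.2],
}
user_vec = [1.0, 0.0]

candidates = recall_top_k(user_vec, item_vecs, k=2)
```

Here `candidates` holds the two items closest to the user vector, which then flow into coarse or fine ranking.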

1.2.2 Sequence model recall

Recommender systems mainly address personalized recommendation based on users' implicit behavior, such as reading. Among sequence models, there are neural-network-based Word2Vec models, later RNN-based language models, and more recently BERT, which is now the most widely used; all of these methods can be applied to learning for recall.

When using an app or website, users generally take actions on items: clicking on items of interest, bookmarking or interacting with them, or buying products. A user acts on an item usually because it matches their interests, and different types of behavior may indicate different degrees of interest; a purchase, for example, signals interest more strongly than a click. In the recall stage, a user embedding can be produced from the behavior sequence with a supervised model, for example with Next Item Prediction as the training objective. It can also be produced in an unsupervised way: as long as items have embeddings, the contents of the behavior sequence can be aggregated without supervision, for example by Sum Pooling.

Typical algorithms in this area include CBOW, Skip-Gram, GRU, and BERT.
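The unsupervised Sum Pooling option mentioned above can be sketched as follows. The optional per-behavior weights are an assumption added here to reflect the point that a purchase may signal stronger interest than a click; the weight values, item embeddings, and the function name `sum_pooling` are illustrative only.

```python
def sum_pooling(behaviors, item_embeddings, behavior_weights=None):
    """Aggregate a behavior sequence into one user vector by (weighted) summation."""
    dim = len(next(iter(item_embeddings.values())))
    user_vec = [0.0] * dim
    for item_id, behavior in behaviors:
        # Without weights, every behavior contributes equally.
        weight = 1.0 if behavior_weights is None else behavior_weights[behavior]
        for i, value in enumerate(item_embeddings[item_id]):
            user_vec[i] += weight * value
    return user_vec

# Hypothetical item embeddings and one user's behavior sequence.
item_embeddings = {"i1": [1.0, 0.0], "i2": [0.0, 1.0], "i3": [1.0, 1.0]}
behaviors = [("i1", "click"), ("i2", "purchase"), ("i3", "click")]

plain = sum_pooling(behaviors, item_embeddings)
weighted = sum_pooling(behaviors, item_embeddings, {"click": 1.0, "purchase": 3.0})
```

With the weights above, the purchased item pulls the user vector toward its own direction three times as strongly as a clicked item.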

1.2.3 Splitting user sequences

The above describes how to produce a user interest embedding from the sequence of items in a user's behavior. In reality, however, users often have multiple interests; one user may be interested in entertainment, sports, and collecting all at once. These interests also show up in the composition of the behavior sequence: for example, mostly entertainment items, some sports, and a few collectibles. So can we split the items in the behavior sequence by interest type, instead of squeezing all of a user's interests into a single embedding? Multi-interest splitting aims to describe user interests at this finer granularity.

In essence, splitting a user behavior sequence across multiple embeddings is a process similar to clustering: different items are clustered into different interest categories. The methods commonly used today to split user interest embeddings are mainly capsule networks and Memory Networks, but in theory many clustering-like methods should work, so you can substitute a clustering method of your own.

Typical algorithms in this area include Multi-Interest Network with Dynamic Routing for Recommendation at Tmall (MIND).
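To make the clustering view concrete, the sketch below splits the item embeddings of one behavior sequence into several interest vectors with plain k-means. This is a deliberately swapped-in technique, not the dynamic routing used by capsule-based models; the embeddings, the deterministic initialization, and the function name `kmeans` are assumptions for this example.

```python
def kmeans(points, k, iterations=10):
    """Cluster points into k groups; returns (centroids, assignments)."""
    # Deterministic initialization: the first k points serve as centroids.
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for idx, point in enumerate(points):
            distances = [
                sum((a - b) ** 2 for a, b in zip(point, centroid))
                for centroid in centroids
            ]
            assignments[idx] = distances.index(min(distances))
        # Update step: move each centroid to the mean of its members.
        for c_idx in range(k):
            members = [p for p, a in zip(points, assignments) if a == c_idx]
            if members:
                centroids[c_idx] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignments

# Hypothetical embeddings of the items in one user's behavior sequence.
behavior_embeddings = [
    [1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9], [1.0, 0.2],
]

interest_vectors, assignments = kmeans(behavior_embeddings, k=2)
```

Each centroid in `interest_vectors` can then be used as a separate query vector at recall time, so that one user retrieves candidates for each interest.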

1.2.4 Knowledge graph

Knowledge graphs have a unique advantage: they make recommendation results explainable. When an item is recommended to a user, the key relation paths connecting to that item in the knowledge graph can provide a reasonable explanation. This works well because, in the end, a knowledge graph is a knowledge system that people encode so that they themselves can understand it, which makes the relationships in it easy to follow. Explainability in knowledge graphs is usually tied to path-based methods. Many experiments have shown that path-based methods are the worst-performing class of methods from a ranking point of view, but they are good for explainability, so knowledge graphs are often used to build an explainable recall channel.

Algorithms in this area include KGAT and RippleNet.

1.2.5 Graph model

The behaviors, needs, attributes, and social information that relate users and items in a recommender system naturally form a graph, so the whole system can be represented as a complex heterogeneous graph. Graph neural network recommendation builds on this idea: it encodes the structural and semantic information of the heterogeneous network into node embeddings and uses these vectors for personalized recommendation. The knowledge graph is actually a special case of this graph approach. However, a knowledge graph encodes static knowledge rather than users' direct behavior data, which places it further from the concrete application; this may be the main reason the two perform differently in recommendation.

Typical algorithms in this area include GraphSAGE and PinSage.
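The idea of encoding graph structure into node embeddings can be illustrated with one round of mean-neighbor aggregation, loosely in the spirit of GraphSAGE's mean aggregator. This sketch omits the trainable weight matrices, nonlinearity, and neighbor sampling that a real model would have; the tiny user-item graph and the function name `mean_aggregate` are assumptions for illustration.

```python
def mean_aggregate(features, neighbors):
    """Concatenate each node's own features with the mean of its neighbors' features."""
    updated = {}
    for node, feat in features.items():
        neigh = neighbors.get(node, [])
        if neigh:
            mean = [
                sum(features[n][i] for n in neigh) / len(neigh)
                for i in range(len(feat))
            ]
        else:
            # Isolated nodes get a zero vector for the neighbor part.
            mean = [0.0] * len(feat)
        updated[node] = feat + mean
    return updated

# Hypothetical bipartite graph: one user who interacted with two items.
features = {"user_1": [1.0, 0.0], "item_a": [0.0, 2.0], "item_b": [2.0, 0.0]}
neighbors = {
    "user_1": ["item_a", "item_b"],
    "item_a": ["user_1"],
    "item_b": ["user_1"],
}

embeddings = mean_aggregate(features, neighbors)
```

Stacking several such rounds lets information from multi-hop neighbors reach each node's embedding.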

1.3 Fine ranking

1.3.1 Feature crossover model

In the early development of deep learning recommendation algorithms, many papers focused on improving a model's ability to combine and cross features. This includes implicit feature crossing, as in Deep Crossing, as well as explorations of explicit feature crossing. The underlying hope is to free the model from hand-crafted, prior-driven feature engineering and train it end to end.

In early recommender systems, features were mostly crossed by hand, relying on business understanding and experience, which is time-consuming and laborious. Much research therefore went into automating feature crossing with models, from FM to GBDT+LR. Then came deep models. Although deep models are backed by the universal approximation theorem, explicit feature crossing is still essential for realizing their potential in practice.

Classic work in this area includes DCN, DeepFM, and xDeepFM.
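As a concrete instance of explicit feature crossing, the sketch below computes the second-order interaction term of a factorization machine (FM), once with the well-known linear-time reformulation and once with the naive pairwise loop for comparison. The feature vector `x` and latent factors `v` are toy values chosen for this example.

```python
def fm_second_order(x, v):
    """Sum over i<j of <v_i, v_j> * x_i * x_j, computed in O(n * k) time."""
    n, k = len(x), len(v[0])
    total = 0.0
    for f in range(k):
        sum_vx = sum(v[i][f] * x[i] for i in range(n))
        sum_v2x2 = sum((v[i][f] * x[i]) ** 2 for i in range(n))
        # 0.5 * ((sum of terms)^2 - sum of squared terms) equals the pairwise sum.
        total += 0.5 * (sum_vx ** 2 - sum_v2x2)
    return total

def fm_second_order_naive(x, v):
    """Same quantity via the explicit O(n^2 * k) double loop."""
    total = 0.0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            total += sum(a * b for a, b in zip(v[i], v[j])) * x[i] * x[j]
    return total

# Toy input: three features (two active) with 2-dimensional latent factors.
x = [1.0, 0.0, 1.0]
v = [[1.0, 2.0], [0.5, 0.5], [3.0, 1.0]]

fast = fm_second_order(x, v)
slow = fm_second_order_naive(x, v)
```

Both functions return the same value; the reformulated version is what makes FM practical on sparse, high-dimensional features.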

1.3.2 Sequence model

In a recommender system, the historical behavior sequence is a very important feature. The main goal of sequence modeling is to obtain the user's interest vector at the current moment. Capturing the breadth of a user's interests is a major difficulty, and research on modeling historical behavior sequences has progressed from pooling and RNNs to attention and capsules, and then to Transformers. Sequence modeling has many sub-areas: some work studies users' lifelong behavior sequences according to sequence length, some focuses on current interests, and some studies the extractors used to pull features from sequences, such as attention or capsule networks.

Typical work in this area includes DIN, DSIN, DIEN, and SIM.
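The attention step in sequence modeling can be sketched as candidate-aware pooling: historical items that resemble the candidate item receive larger weights when forming the user interest vector. This sketch uses a softmax over dot products as a simplification; models such as DIN learn the weighting with a small network instead, so treat the scoring function here as an assumption. All vectors are toy values.

```python
import math

def attention_pooling(candidate, history):
    """Weight each historical item by its similarity to the candidate, then sum."""
    scores = [sum(c * h for c, h in zip(candidate, item)) for item in history]
    # Subtract the max score before exponentiating for numerical stability.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    interest = [
        sum(w * item[i] for w, item in zip(weights, history))
        for i in range(len(candidate))
    ]
    return interest, weights

# Hypothetical embeddings: two history items match the candidate's direction.
history = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
candidate = [2.0, 0.0]

interest, weights = attention_pooling(candidate, history)
```

Because the interest vector depends on the candidate, the same user is represented differently for each item being scored.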

1.3.3 Multimodal information fusion

As mentioned above, algorithm teams often use content profile information extracted with both CV and NLP. This is reasonable: when browsing TikTok or Taobao, we care not only about an item's price and brand but also about whether the person on the cover is attractive and whether the title is eye-catching. In addition, in cold-start scenarios we do not have enough information to work with, and multimodal information can go a long way toward solving the data sparsity problem.

The traditional approach to multimodal fusion is to bring the information from different modalities into the model through embeddings. In recommendation, the mainstream practice is still not end to end: other models extract the multimodal information, and the recommendation model only needs to incorporate it. Other work uses attention mechanisms and similar methods to learn the relationships between modalities and strengthen the multimodal representation.

Typical work includes Image Matters: Visually modeling user behaviors using Advanced Model Server, and UMPR.

1.3.4 Multi task learning

In many scenarios, the goal of our model optimization is CTR, There are some scenarios that only consider CTR It's not enough. , Click through model 、 The duration model and the completion rate model are the models that most information flow product recommendation algorithm teams will try to build . It is easy to deduce the title of the party by optimizing the click through rate model alone , The optimized duration model alone may generate long videos or articles , If the completion rate model is optimized separately, it may be easy to push out short video and short text , So multi-objective came into being . Information flow recommendation , We not only want users to click into our item, I also hope to have a good completion rate , That is to say, we hope that users can finish reading our recommended products . Or the e-commerce scenario wants users not only to click in , I hope he bought or joined the shopping cart . These probabilities are actually the goals of the model , A combination of multiple goals , Including reading 、 give the thumbs-up 、 Collection 、 Sharing and so on , Sum it up into a model for learning , This is the multi-objective learning of recommendation system .

Typical algorithms in this respect are ：ESSM、MMoE、DUPN etc. .
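One way to tie the click and purchase objectives together is the decomposition used by the entire-space multi-task model (ESMM): the probability of click-and-convert is the product of the click probability and the post-click conversion probability, and both the click and click-and-convert labels supervise the model. In the sketch below, `p_ctr` and `p_cvr` are plain numbers standing in for the outputs of two network towers, which is an assumption made to keep the example self-contained.

```python
import math

def binary_cross_entropy(p, y):
    """Log loss for one prediction p in (0, 1) against label y in {0, 1}."""
    eps = 1e-12  # Guards against log(0).
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))

def esmm_loss(p_ctr, p_cvr, clicked, converted):
    """Sum of the CTR loss and the CTCVR loss for a single impression."""
    # pCTCVR = pCTR * pCVR, defined over all impressions.
    p_ctcvr = p_ctr * p_cvr
    ctr_loss = binary_cross_entropy(p_ctr, clicked)
    ctcvr_loss = binary_cross_entropy(p_ctcvr, clicked * converted)
    return ctr_loss + ctcvr_loss

# An impression that was clicked and converted, scored by two hypothetical models.
good_model_loss = esmm_loss(p_ctr=0.9, p_cvr=0.8, clicked=1, converted=1)
bad_model_loss = esmm_loss(p_ctr=0.1, p_cvr=0.2, clicked=1, converted=1)
```

The conversion tower is never trained on a click-only subset directly, which is how this formulation sidesteps sample selection bias on conversion labels.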

1.3.5 Reinforcement learning

Reinforcement learning has some notable advantages over supervised deep learning. First, it can flexibly define the business objective to optimize and take both long-term and short-term returns into account. For a metric such as user retention, it is hard to design an optimization function under a supervised deep model, whereas reinforcement learning can model long-term returns. Second, it reflects dynamic changes in user interests: in news recommendation, for example, interests shift quickly, and reinforcement learning can more easily generate recommendations dynamically from user behavior. Finally, there is EE, the exploration and exploitation mechanism, which trades off current against long-term returns; reinforcement learning can balance the two more effectively.

Typical algorithms in this area include DQN and Reinforcement Learning for Slate-based Recommender Systems: A Tractable Decomposition and Practical Methodology.
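The exploration and exploitation trade-off can be illustrated with an epsilon-greedy bandit, which is far simpler than DQN and has no notion of state; it is shown only to make the EE idea concrete. The click-through rates in `true_ctr`, the seeds, and the class name `EpsilonGreedy` are assumptions for this simulation.

```python
import random

class EpsilonGreedy:
    """Pick a random arm with probability epsilon, otherwise the best-known arm."""

    def __init__(self, n_arms, epsilon, seed=0):
        self.epsilon = epsilon
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms  # Running mean reward per arm.
        self.rng = random.Random(seed)

    def select(self):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))  # Explore.
        return max(range(len(self.values)), key=self.values.__getitem__)  # Exploit.

    def update(self, arm, reward):
        self.counts[arm] += 1
        # Incremental update of the mean reward for this arm.
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

# Simulated environment: each arm is a content category with a hidden CTR.
true_ctr = [0.05, 0.20, 0.10]
bandit = EpsilonGreedy(n_arms=3, epsilon=0.1)
env = random.Random(1)
for _ in range(5000):
    arm = bandit.select()
    reward = 1.0 if env.random() < true_ctr[arm] else 0.0
    bandit.update(arm, reward)
```

After the loop, `bandit.values` holds the estimated click-through rate of each arm and `bandit.counts` shows how traffic was split between exploring and exploiting.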

1.3.6 Cross domain recommendation

A company usually has many business lines. Tencent, for example, has Tencent Video, WeChat Top Stories, WeChat Channels, and Tencent Music. If the data from these scenarios can be combined for recommendation, it helps a great deal with cold start and also brings in more data for more accurate recommendations.

Cross-domain recommender systems are more complex than ordinary ones. In a traditional recommender system, we only need to build and analyze a recommendation model within the current domain. In cross-domain recommendation, we care more about which information to transfer between domains and how to transfer it; this is the key problem in cross-domain recommendation.

Typical models in this area include DTCDR, MV-DNN, and EMCDR.

1.4 Re-ranking

There are three common optimization objectives: pointwise, pairwise, and listwise. The re-ranking stage takes the Top-N item sequence produced by fine ranking and reorders it into a Top-K sequence, which is the final output of the ranking system and is shown directly to the user. Re-ranking is needed because items affect one another: fine ranking scores items pointwise, which easily leads to homogeneous results with a lot of redundant information. The challenge in re-ranking is handling the huge state space. In the fine ranking layer we generally use AUC as the metric, whereas re-ranking pays more attention to metrics such as NDCG.
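Since NDCG is the metric named for this stage, here is a minimal sketch of how it can be computed for one ranked list. It uses the common exponential-gain form, `(2^rel - 1) / log2(position + 1)`; other gain definitions exist, so treat the exact formula as one conventional choice. The relevance grades are toy values.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain of a list of graded relevances in ranked order."""
    return sum(
        (2 ** rel - 1) / math.log2(pos + 2)  # pos is 0-based, so rank = pos + 1.
        for pos, rel in enumerate(relevances)
    )

def ndcg(relevances, k=None):
    """DCG of the given order divided by the DCG of the ideal order."""
    ranked = relevances[:k] if k else relevances
    ideal = sorted(relevances, reverse=True)
    ideal = ideal[:k] if k else ideal
    ideal_dcg = dcg(ideal)
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0

perfect = ndcg([3, 2, 1])   # Already in ideal order.
reversed_order = ndcg([1, 2, 3])  # Most relevant item placed last.
```

Unlike AUC, NDCG rewards placing the most relevant items at the very top of the list, which is why it suits list-level evaluation.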

In practice, re-ranking also involves strategy and operations rules, such as forced deduplication, spacing rules, and traffic boosting, although the overall trend is for algorithmic ranking to play an increasingly dominant role. Re-ranking mostly uses listwise objectives, which optimize the model with respect to the order of items in the list; however, because the state space is large, listwise training is generally slow. Typical approaches are based on RNNs, Transformers, and various forms of reinforcement learning. This is not the core of recommendation and depends heavily on the actual business scenario, so we do not expand on it here.

Classic algorithms here include MMR, DPP, and RNN-based models.
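The homogenization problem described in this section can be addressed with a greedy, diversity-aware re-ranker in the style of maximal marginal relevance (MMR). In this sketch, each step picks the item that best balances its fine-ranking score against its similarity to items already selected. The scores, the category-based similarity function, and the trade-off value `lam` are all assumptions for illustration.

```python
def mmr_rerank(scores, similarity, k, lam=0.7):
    """Greedily select k items, trading relevance against redundancy."""
    selected = []
    remaining = set(scores)
    while remaining and len(selected) < k:
        def mmr(item):
            # Redundancy is the highest similarity to anything already chosen.
            redundancy = max((similarity(item, s) for s in selected), default=0.0)
            return lam * scores[item] - (1 - lam) * redundancy
        # Sorting first makes tie-breaking deterministic.
        best = max(sorted(remaining), key=mmr)
        selected.append(best)
        remaining.remove(best)
    return selected

def same_category(a, b):
    """Toy similarity: 1.0 if two item IDs share a category prefix, else 0.0."""
    return 1.0 if a.split("_")[0] == b.split("_")[0] else 0.0

# Hypothetical fine-ranking scores; two of the top items are near-duplicates.
scores = {"dance_1": 0.9, "dance_2": 0.85, "sports_1": 0.8}

reranked = mmr_rerank(scores, same_category, k=2, lam=0.7)
by_score_only = sorted(scores, key=scores.get, reverse=True)[:2]
```

Pure score ordering would return both dance videos, while the re-ranked list swaps the second one for the sports item.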

2. Engineering

Putting a recommender system into production depends on engineering. Research papers are full of ideas, but many overlook whether they can actually be deployed in industry, and once we enter industry it is difficult, or at least rare, to do pure research. So there are also many engineering skills to master. The engineering technologies mainly used in recommendation are listed below:

Programming languages: Python, Java (Scala), C++, SQL, shell;
Machine learning: TensorFlow/PyTorch, GraphLab/GraphChi, LightGBM/XGBoost, scikit-learn;
Data analysis: Pandas, NumPy, Seaborn, Spark;
Data storage: MySQL, Redis, MongoDB, Hive, Kafka, Elasticsearch, HBase;
Similarity search: Annoy, Faiss, KGraph;
Stream processing: Spark Streaming, Flink;
Distributed computing: Hadoop, Spark.

Of all the technologies above, three categories matter most in my view: programming languages, machine learning frameworks, and data analysis tools. First, languages: Python is a must, while whether you use C++ or Java depends on the team you join; if you are short on time, you can pick them up gradually after joining. Next, machine learning frameworks: you must master at least one of TensorFlow and PyTorch. There is no need to agonize early on over which one to learn, because the cost of switching is low and most concepts carry over by analogy, and interviewers will not hold it against you if you know only one of them. Finally, data analysis tools: Pandas is a powerful tool for processing data at single-machine scale, but in industry you also need to be able to use Hadoop and Spark. You do not need to study them in great depth; being able to use them is enough.