[Paper Reading] GETNext: Trajectory Flow Map Enhanced Transformer for Next POI Recommendation
2022-07-23 18:56:00 【EmoryHuang】
Preface
Next POI recommendation predicts a user's near-future movements based on his or her current status and historical information, bringing great value to both users and service providers.
This post covers a SIGIR 2022 paper: GETNext: Trajectory Flow Map Enhanced Transformer for Next POI Recommendation.
Problem description
Given a set of $M$ users $U=\{u_1, u_2, \cdots, u_M\}$ and a set of $N$ POIs $P=\{p_1, p_2, \cdots, p_N\}$, where each $p=\langle latitude, longitude, category, frequency \rangle$ records the latitude, longitude, category, and visit frequency of a POI.
(Check-in) A check-in can be expressed as $q=\langle u,p,t \rangle \in U\times P\times T$, meaning that user $u$ visits POI $p$ at time $t$.
For a user $u$, his or her trajectory is $S_u=(q_1,q_2,\cdots,q_m)$. The task is to predict the user's next visited location, i.e., next POI (Point-of-Interest) recommendation.
Overview
This paper proposes a new model, the Graph Enhanced Transformer model (GETNext). The overall framework is still a Transformer. In addition, a global trajectory flow map is built and a GCN is used to compute POI embeddings, which are then fused with the user embedding, category embedding, and time embedding (Time2Vec) to form the final input.
Main contributions:
- It proposes a global trajectory flow map to represent POI visit-sequence information and uses a graph convolutional network (GCN) to learn POI embeddings.
- It proposes a time-aware category embedding to better represent temporal information.
- It proposes a new Transformer-based framework that integrates global transition patterns, user general tastes, the user's short-term trajectory, and spatio-temporal context for POI recommendation.
GETNext
The model architecture is shown in the figure below:
Trajectory Flow Map
The output of the trajectory flow map is used in two parts:
- a graph convolutional network (GCN) is used to compute POI embeddings;
- an attention module is used to generate a transition attention map (transition probability matrix).
The paper also visualizes the Trajectory Flow Map, where several dense areas can be clearly observed.
POI Embedding
The paper points out that different users may share similar trajectory segments, and the same user may repeat a trajectory many times. That is, collective information from other users can be used to model consecutive trajectories. For example, the two users in the figure below visited the same restaurant and cinema, in the same order.
These trajectory flows provide key information about users' general movement patterns, helping to address the problems of short trajectories and inactive users.
(Trajectory Flow Map) Given a set of historical trajectories $\mathcal{S}=\{S_u^i\}_{i\in \mathbb{N},u\in U}$, the Trajectory Flow Map $\mathcal{G} = (V,E,l,w)$ is a directed weighted graph, where:
- the node set is $V=P$, where $P$ is the POI set;
- each $p=\langle latitude,longitude,category,frequency \rangle\in P$ records the latitude, longitude, category, and visit frequency of a POI;
- if $p_1, p_2$ are visited consecutively, an edge $(p_1,p_2)$ is added;
- the weight of edge $(p_1,p_2)$ is the frequency of that transition.
To summarize the Trajectory Flow Map: it is a directed weighted graph whose nodes are POIs, connected according to users' visit trajectories; the weight of an edge is the number of times that trajectory segment appears, and each node (POI) records its latitude, longitude, category, and visit frequency.
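Purely as a reading aid, here is a minimal sketch of how such a trajectory flow map could be built from check-in sequences using networkx; the data layout (`trajectories` as lists of POI ids, `poi_info` as a dict) is my assumption, not the authors' code.

```python
# Minimal sketch (not the authors' code): building a trajectory flow map
# from check-in sequences; field names and data layout are assumptions.
import networkx as nx

def build_trajectory_flow_map(trajectories, poi_info):
    """trajectories: list of POI-id sequences, one per user trajectory.
    poi_info: dict mapping poi_id -> (latitude, longitude, category)."""
    G = nx.DiGraph()
    visit_count = {}                      # overall check-in frequency per POI
    for traj in trajectories:
        for poi in traj:
            visit_count[poi] = visit_count.get(poi, 0) + 1
        # consecutive visits (p1, p2) add (or reinforce) a directed edge
        for p1, p2 in zip(traj, traj[1:]):
            if G.has_edge(p1, p2):
                G[p1][p2]["weight"] += 1
            else:
                G.add_edge(p1, p2, weight=1)
    # attach the node attributes mentioned above: lat, lon, category, frequency
    for poi, (lat, lon, cat) in poi_info.items():
        G.add_node(poi, latitude=lat, longitude=lon,
                   category=cat, frequency=visit_count.get(poi, 0))
    return G
```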
Next, a graph convolutional network (GCN) is used to compute the POI embeddings. The details of GCN are not covered here; if you are unfamiliar with GCN, you can refer to another article of mine: A simple introduction to graph neural networks (GNN).
Compute the Laplacian matrix and the hidden-layer update equation:
$$\widetilde{\mathbf{L}}=(\mathbf{D}+\mathbf{I}_N)^{-1}(\mathbf{A}+\mathbf{I}_N)$$
$$\mathbf{H}^{(l)}=\sigma \left( \widetilde{\mathbf{L}}\mathbf{H}^{(l-1)}\mathbf{W}^{(l)}+b^{(l)} \right)$$
where $\mathbf{D}$ and $\mathbf{A}$ denote the degree matrix and the adjacency matrix, respectively.
Personally, I think there is an issue here: in principle this should be the symmetric normalization, i.e.:
$$\widetilde{\mathbf{L}}=(\mathbf{D}+\mathbf{I}_N)^{-1/2}(\mathbf{A}+\mathbf{I}_N)(\mathbf{D}+\mathbf{I}_N)^{-1/2}$$
In each iteration, a GCN layer updates a node's embedding by aggregating the embeddings of its neighbors and of the node itself.
After $l^{*}$ layers, the output of the module can be expressed as:
$$\mathbf{e}_p=\widetilde{\mathbf{L}}\mathbf{H}^{(l^{*})}\mathbf{W}^{(l^{*}+1)}+b^{(l^{*}+1)} \in \mathbb{R}^{N\times \Omega}$$
After the GCN we obtain vector representations of the POIs. Notably, even when the current trajectory is short, the POI embeddings still provide rich information for the prediction model.
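A minimal sketch of one GCN layer with the random-walk normalization $\widetilde{\mathbf{L}}=(\mathbf{D}+\mathbf{I}_N)^{-1}(\mathbf{A}+\mathbf{I}_N)$ used in the equations above; the dense adjacency, feature layout, and ReLU nonlinearity are my assumptions.

```python
# Minimal sketch (assumptions: dense adjacency matrix A, ReLU as sigma,
# X is the N x 4 node feature matrix: lat, lon, category id, frequency).
import torch
import torch.nn as nn

def normalized_adjacency(A):
    # \tilde{L} = (D + I)^{-1} (A + I)
    A_hat = A + torch.eye(A.size(0))
    deg = A_hat.sum(dim=1)
    return torch.diag(1.0 / deg) @ A_hat

class GCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, L_tilde, H):
        # H^{(l)} = sigma(\tilde{L} H^{(l-1)} W^{(l)} + b^{(l)})
        return torch.relu(self.linear(L_tilde @ H))

# usage sketch: L_tilde = normalized_adjacency(A); e_p = GCNLayer(4, 128)(L_tilde, X)
```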
Transition Attention Map
The POI embeddings learned from the graph $\mathcal{G}$ only capture general behavior patterns. To further amplify the influence of collective signals, the paper introduces a transition attention map $\mathbf{\Phi}$ that explicitly models the probability of transitioning from one POI to another. Specifically:
$$\mathbf{\Phi}_1=(\mathbf{X} \times \mathbf{W}_1) \times \mathbf{a}_1 \in \mathbb{R}^{N\times 1}$$
$$\mathbf{\Phi}_2=(\mathbf{X} \times \mathbf{W}_2) \times \mathbf{a}_2 \in \mathbb{R}^{N\times 1}$$
$$\mathbf{\Phi}=(\mathbf{\Phi}_1 \times \mathbf{1}^T + \mathbf{1} \times \mathbf{\Phi}_2^T) \odot (\widetilde{\mathbf{L}}+J_N) \in \mathbb{R}^{N\times N}$$
where $\mathbf{X}$ contains the node attributes of the graph (latitude, longitude, category, and visit frequency), and $\mathbf{W}_1,\mathbf{W}_2,\mathbf{a}_1,\mathbf{a}_2$ are trainable parameters.
I do not fully understand this formula.
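As a reading aid, here is a minimal sketch of one way the formula can be read: two attribute-based score vectors are broadcast into an $N\times N$ matrix and weighted element-wise by $\widetilde{\mathbf{L}}+J_N$. The shapes ($\mathbf{X}$ as $N\times 4$, $\mathbf{W}$ as $4\times d$, $\mathbf{a}$ as $d\times 1$) are my assumptions, not the paper's code.

```python
# Minimal sketch of the transition attention map (assumed shapes:
# X is N x 4 node attributes, W1/W2 are 4 x d, a1/a2 are d x 1).
import torch

def transition_attention_map(X, L_tilde, W1, W2, a1, a2):
    phi1 = (X @ W1) @ a1                 # N x 1, "outgoing" score per POI
    phi2 = (X @ W2) @ a2                 # N x 1, "incoming" score per POI
    ones = torch.ones_like(phi1)
    # broadcast sum: entry (i, j) = phi1[i] + phi2[j]
    Phi = phi1 @ ones.T + ones @ phi2.T  # N x N
    # element-wise product with \tilde{L} + J_N (J_N: all-ones matrix),
    # which gives extra weight to transitions observed in the flow map
    return Phi * (L_tilde + torch.ones_like(L_tilde))
```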
Contextual Embedding Module
Besides the POI embedding, the paper also introduces spatio-temporal features and user preferences.
POI-User Embeddings Fusion
In the paper, the user embedding and the POI embedding are concatenated to represent a check-in activity.
$$\mathbf{e}_u=f_{embed}(u)\in \mathbb{R}^{\Omega}$$
$$\mathbf{e}_{p,u}=\sigma(\mathbf{w}_{p,u}[\mathbf{e}_p;\mathbf{e}_u]+b_{p,u})\in \mathbb{R}^{\Omega \times 2}$$
where $\mathbf{e}_u$ and $\mathbf{e}_p$ are the user embedding and the POI embedding, respectively.
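A minimal sketch of this fusion step (concatenation followed by one fully connected layer with a nonlinearity); the sigmoid activation and dimensions are my assumptions. The same module shape also fits the time-category fusion in the next subsection.

```python
# Minimal sketch: fuse two embeddings by concatenation + one linear layer,
# following e_{p,u} = sigma(w [e_p ; e_u] + b).
import torch
import torch.nn as nn

class EmbeddingFusion(nn.Module):
    def __init__(self, dim_a, dim_b, out_dim):
        super().__init__()
        self.fc = nn.Linear(dim_a + dim_b, out_dim)

    def forward(self, e_a, e_b):
        return torch.sigmoid(self.fc(torch.cat([e_a, e_b], dim=-1)))

# usage sketch: the fused POI-user vector e_{p,u} and the fused
# time-category vector e_{c,t} are later concatenated to form e_q.
```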
Time-Category Embeddings Fusion
For the time embedding, the paper uses the Time2Vec method; if you are unfamiliar with Time2Vec, you can refer to another article of mine. In particular, a day is divided into 48 time slots of 30 minutes each, and the embedding has length $k+1$. Specifically:
$$\mathbf{e}_t[i]=\begin{cases} \omega_i t+\varphi_i, &\text{if } i=0 \\ \sin(\omega_i t+\varphi_i), &\text{if } 1\leq i\leq k \end{cases}$$
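A minimal sketch of the Time2Vec embedding above: one linear component plus $k$ periodic components; the parameter initialization is my assumption.

```python
# Minimal sketch of Time2Vec: e_t[0] is linear in t, e_t[1..k] are periodic.
import torch
import torch.nn as nn

class Time2Vec(nn.Module):
    def __init__(self, k):
        super().__init__()
        self.omega = nn.Parameter(torch.randn(k + 1))
        self.phi = nn.Parameter(torch.randn(k + 1))

    def forward(self, t):
        # t: scalar tensor, e.g. the normalized time of day
        v = self.omega * t + self.phi                 # shape (k + 1,)
        return torch.cat([v[:1], torch.sin(v[1:])])   # linear head + sine tail
```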
On the other hand, due to data sparsity and noise, the paper concatenates the category embedding with the time embedding to explore the temporal patterns of POI categories rather than of individual POIs.
$$\mathbf{e}_c=f_{embed}(c)\in \mathbb{R}^{\Psi}$$
$$\mathbf{e}_{c,t}=\sigma(\mathbf{w}_{c,t}[\mathbf{e}_t;\mathbf{e}_c]+b_{c,t})\in \mathbb{R}^{\Psi \times 2}$$
After this series of steps, each check-in $q=\langle p,u,t \rangle$ is turned into a vector $\mathbf{e}_q=[\mathbf{e}_{p,u};\mathbf{e}_{c,t}]$, which serves as the input to the Transformer.
Transformer Encoder and MLP Decoders
Transformer Encoder
The backbone network is still a Transformer, so I will not introduce it in detail. For an input sequence $S_u=(q_u^1,q_u^2,\cdots,q_u^k)$, we need to predict the next activity $q_u^{k+1}$. After the check-in embedding described above, stacking the embeddings of the $q_u^i$ gives $\mathcal{X}^{[0]}\in \mathbb{R}^{k \times d}$ as the Transformer input, where $d$ is the embedding dimension.
Then come the familiar attention operations:
$$S=\mathcal{X}^{[l]}\mathbf{W}_q(\mathcal{X}^{[l]}\mathbf{W}_k)^T\in \mathbb{R}^{k\times k}$$
$$S_{i,j}'=\frac{\exp(S_{i,j})}{\sum_{j=1}^d\exp(S_{i,j})}$$
$$\text{head}_1=S'\mathcal{X}^{[l]}\mathbf{W}_v\in \mathbb{R}^{k\times d/h}$$
$$\text{Multihead}(\mathcal{X}^{[l]})=[\text{head}_1;\cdots;\text{head}_h]\times \mathbf{W}_o\in\mathbb{R}^{k\times d}$$
Then come the residual connection, LayerNorm, and FFN:
$$\mathcal{X}_{\text{attn}}^{[l]}=\text{LayerNorm}\left(\mathcal{X}^{[l]}+\text{Multihead}(\mathcal{X}^{[l]}) \right)$$
$$\mathcal{X}_{FC}^{[l]}=\text{ReLU}(\mathbf{W}_1\mathcal{X}_{\text{attn}}^{[l]}+b_1)\mathbf{W}_2+b_2\in\mathbb{R}^{k\times d}$$
$$\mathcal{X}^{[l+1]}=\text{LayerNorm}(\mathcal{X}_{\text{attn}}^{[l]}+\mathcal{X}_{FC}^{[l]})\in\mathbb{R}^{k\times d}$$
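Since this is a standard Transformer encoder, a minimal sketch using PyTorch's built-in layer is enough to convey the structure; the hyperparameters below are placeholders, not the paper's settings.

```python
# Minimal sketch: a stack of standard Transformer encoder layers over the
# sequence of check-in embeddings e_q (d_model, heads, depth are placeholders).
import torch
import torch.nn as nn

d_model, n_heads, n_layers = 128, 2, 2
encoder_layer = nn.TransformerEncoderLayer(
    d_model=d_model, nhead=n_heads, dim_feedforward=4 * d_model,
    batch_first=True)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)

x = torch.randn(1, 20, d_model)   # (batch, trajectory length k, d)
out = encoder(x)                  # (1, 20, d_model), fed to the MLP heads below
```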
MLP Decoders
The Transformer encoder produces the output $\mathcal{X}^{[l^*]}$, which is then mapped by multi-layer perceptrons (MLPs) into three heads:
$$\hat{\mathbf{Y}}_{\text{poi}}=\mathcal{X}^{[l^*]}\mathbf{W}_{\text{poi}}+b_{\text{poi}}$$
$$\hat{\mathbf{Y}}_{\text{time}}=\mathcal{X}^{[l^*]}\mathbf{W}_{\text{time}}+b_{\text{time}}$$
$$\hat{\mathbf{Y}}_{\text{cat}}=\mathcal{X}^{[l^*]}\mathbf{W}_{\text{cat}}+b_{\text{cat}}$$
where $\mathbf{W}_{\text{poi}}\in\mathbb{R}^{d\times N}$, $\mathbf{W}_{\text{time}}\in\mathbb{R}^{d\times 1}$, and $\mathbf{W}_{\text{cat}}\in\mathbb{R}^{d\times \Gamma}$ are learnable weights.
In particular, for $\hat{\mathbf{Y}}_{\text{poi}}$ the transition attention map obtained above is combined with the prediction:
$$\hat{\mathbf{y}}_{\text{poi}}=\hat{\mathbf{Y}}_{\text{poi}}^{(k\cdot)}+\Phi^{p_k\cdot}\in\mathbb{R}^{1\times N}$$
The paper argues that the time interval between two check-ins may fluctuate widely and the corresponding POI categories may also differ, so a user should receive different suggestions for the next 1 hour versus the next 5 hours. Therefore, besides predicting the next POI, the model also predicts the time and category of the next check-in, which is why three MLP heads are used.
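A minimal sketch of the three decoder heads, including adding the transition attention row of the last visited POI to the last step's POI logits; all dimensions and the single-trajectory layout are my assumptions.

```python
# Minimal sketch: three linear heads on top of the encoder output; the
# next-POI scores are the last step's logits plus the transition row of p_k.
import torch
import torch.nn as nn

class MLPDecoders(nn.Module):
    def __init__(self, d, num_pois, num_cats):
        super().__init__()
        self.poi_head = nn.Linear(d, num_pois)
        self.time_head = nn.Linear(d, 1)
        self.cat_head = nn.Linear(d, num_cats)

    def forward(self, enc_out, Phi, last_poi):
        # enc_out: (k, d) encoder output for one trajectory
        y_poi = self.poi_head(enc_out)       # (k, N)
        y_time = self.time_head(enc_out)     # (k, 1)
        y_cat = self.cat_head(enc_out)       # (k, Gamma)
        # next-POI scores: last step's logits + transition attention row of p_k
        y_next = y_poi[-1] + Phi[last_poi]   # (N,)
        return y_next, y_poi, y_time, y_cat
```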
Loss
Since the model produces three prediction results, the final loss is a weighted sum:
$$\mathcal{L}_{\text{final}}=\mathcal{L}_{\text{poi}}+10\times \mathcal{L}_{\text{time}}+\mathcal{L}_{\text{cat}}$$
where $\mathcal{L}_{\text{poi}}$ and $\mathcal{L}_{\text{cat}}$ are computed with the cross-entropy loss, and $\mathcal{L}_{\text{time}}$ with the mean squared error (MSE) loss. Since the normalized time lies in $[0,1]$, the time loss is multiplied by 10 in the final sum.
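A minimal sketch of the combined loss (cross-entropy for POI and category, MSE for time with a ×10 weight); the target and logit layouts are my assumptions.

```python
# Minimal sketch of the weighted loss; target/logit layouts are assumptions.
import torch.nn as nn

ce = nn.CrossEntropyLoss()
mse = nn.MSELoss()

def final_loss(poi_logits, poi_target, time_pred, time_target,
               cat_logits, cat_target):
    l_poi = ce(poi_logits, poi_target)
    l_time = mse(time_pred, time_target)   # time normalized to [0, 1]
    l_cat = ce(cat_logits, cat_target)
    return l_poi + 10.0 * l_time + l_cat
```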
Experiments
Datasets:
- Foursquare: NYC, TKY
- Gowalla: CA
Evaluation metrics:
$$\text{Acc}@k=\frac{1}{m}\sum_{i=1}^m\mathbb{1}(rank \leq k)$$
$$\text{MRR}=\frac{1}{m}\sum_{i=1}^m\frac{1}{rank}$$
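A minimal sketch of these two metrics, assuming `ranks` is the list of 1-based ranks of the ground-truth POI over the $m$ test check-ins.

```python
# Minimal sketch: Acc@k and MRR from the 1-based rank of the true POI.
def acc_at_k(ranks, k):
    return sum(1 for r in ranks if r <= k) / len(ranks)

def mrr(ranks):
    return sum(1.0 / r for r in ranks) / len(ranks)

# example: ranks = [1, 3, 7] -> acc_at_k(ranks, 5) == 2/3, mrr(ranks) ≈ 0.49
```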
Results
Inactive users and active users
The paper sorts users by their number of check-ins and analyzes how users of different activity levels affect the model:
Short trajectories and long trajectories
On the other hand, the paper also experiments with the challenge posed by short trajectories:
Below are the experimental results after removing the trajectory flow map:
Ablation Experiment
Summary
To conclude: the backbone of this paper is still the Transformer. The biggest change is building a POI transition weight graph (trajectory flow map) and using a GCN for POI embedding; finally, the model predicts the POI, time, and category simultaneously, strengthening the loss function.