当前位置:网站首页>[paper reading | depth] role based network embedding via structural features reconstruction with degree regulated
[paper reading | depth] role based network embedding via structural features reconstruction with degree regulated
2022-06-25 10:20:00 【Haihong Pro】
Catalog
Preface
Hello!
Thank you very much for reading Haihong's article , If there are mistakes in the text , You are welcome to point out ~
Self introduction. ଘ(੭ˊᵕˋ)੭
nickname : Sea boom
label : Program the ape |C++ player | Student
brief introduction : because C Language and programming , Then I turned to computer science , Won a National Scholarship , I was lucky to win some national awards in the competition 、 Provincial award … It has been insured .
Learning experience : Solid foundation + Take more notes + Knock more code + Think more + Learn English well !
Only hard work
Know what it is Know why !
This article only records what you are interested in
ABSTRACT
Role-based network embedding aims to map the network into a low dimensional node representation , While maintaining structural similarity .
Adjacency matrix contains local information of the network , It also contains the global information of the network , But it can not directly express the role of nodes .
therefore , Extract high-order structural features from adjacency matrix For role-based network embedding ( Structural equivalence ) crucial .
Generally speaking , In real networks , Features extracted in the same way Sensitive to noise , But it doesn't really represent the role of nodes .
therefore , We propose a deep learning framework RESD
First , We propose to extract the high-order structural features of each node in the network
then , We use variational self coding (VAE) Modeling the nonlinear relationship of features , Reduce noise , Improve the robustness of embedding
Besides , In embedded space , We apply degree regularization constraints to guide representation learning , To retain the key structural information of the node ( Rudu ), This information may be due to VAE The principles of the framework are lost
Last , We construct a unified objective function , By retaining the structural features and node degree, we can learn the role discovery of node embedding
1. Introduction
Community based network embedding ( Preserve node proximity )
- DeepWalk [14] and node2vec [8] , By random walk
- GCN[15] and GraphSAGE[16], Spread features through the Internet
However , Role is another important point in complex networks , It is fundamentally different from the concept of community .
The definition of a role is defined as Classes of equivalent nodes [17], Used to represent the structural similarity between nodes .
let me put it another way , A role is a node partition that satisfies the following conditions : Nodes are structurally more similar to nodes in the same partition than to nodes outside the partition
Intuitively speaking , The role represents the connection mode of the nodes in the neighborhood .
Pictured 1( Upper figure ) Shown , Two red nodes share the same role , Because they can be classified as bridges of two subgraphs , The Yellow node has the same role , Because they are all centers , Even if they are far away from each other or not connected
therefore , Role based network embedding is designed to project the network into the embedding space , In this space , Nodes of the same character are more closely connected .
in summary , Although these two problems ( Based on community or role ) They are all devoted to the clustering of nodes in the network , But they have essential differences and their respective priorities [18].
Community based network embedding methods usually can not solve role-based tasks , Because of them Unable to capture and model the structural features of nodes
lately , Some methods have been proposed to Get structural similarity and Get role based Network embedding .
One of the most representative methods is based on random walk .
The first challenge of character discovery is : Nodes in the same character may not be directly connected or away from each other . To solve this problem , Some methods improve the ordinary random walk .
for example ,role2vec[12] use Random walk based on feature Instead of based on id To generate a sequence as a random walk SkipGram[19] The input of .
However , It is still unable to handle the structural identity extraction of remote nodes , Can not be applied to large-scale networks .
Some ways to try Using adjacency matrix Instead of random walks, it can alleviate the problem of non connectivity between nodes of the same role .
However , The adjacency matrix of the network can contain first-order and higher-order relations , Can not directly express structural similarity .
Therefore, adjacency matrix is not suitable for simply generating role-based embedding .
then , Some methods turn to Structural feature extraction based on adjacency matrix For role discovery . for example ,refex[20] Is a recursive aggregation of neighborhood features and is based on egonet An effective method for characterizing . It has good performance , A relatively compact representation is obtained .
But as the number of iterations increases , Each node will encounter a higher-order neighborhood , this May cause over fitting , As a result, the features can not truly reflect the structural information .
RolX[21] and GLRD[22] utilize REFEX Feature and decompose the feature matrix to obtain a compact node representation .
However , Based on matrix factorization There are ways to High computational complexity The problem of . what's more , The features they use depend on neighborhood information , This makes them useful for Noise is very sensitive , And the embeddedness they generate loses some statistical characteristics of nodes , Such as node degree .
Inspired by deep learning ,DRNE[23] Using the layer normalized by node degree LSTM To recursively aggregate neighbor representations to obtain rule equivalence . This mechanism can effectively capture the structural similarity between remote nodes , But the noise is still ignored .
most important of all , To learn role-based node embedding , We have to The challenges of non connectivity and noise are solved while maintaining the statistical characteristics of nodes .
So , We propose a deep learning framework for structural feature reconstruction based on degree regularity constraints , Called role-based network embedding .
The framework consists of feature extraction 、 Embedding generation and topology feature preservation Three parts .
First, the high-order structural features of each node are extracted . In this paper , We choose REFEX As a sample feature extraction method , To improve its efficiency and effectiveness .
then , Using the variational automatic encoder on the characteristic matrix (VAE) To reduce noise , Get a more compact representation , And it can capture the nonlinear relationship between input and node embedding , It has good robustness , Avoid over fitting .
However ,VAE The coding process may lead to the loss of some key structural features of nodes .
therefore , stay VAE To maintain the degree of nodes , Play an auxiliary role in guiding the optimization direction .
Our main contribution :
- This paper proposes a new framework for role-based network embedding through structural feature reconstruction . This process allows for a more compact representation of nodes , And improve the performance of the elements
- We use the variational auto coder to capture the nonlinear relationship between input and node embedding , While maintaining structural similarity . It can also reduce the noise in the original network , Increase the robustness of the representation .
- The coding process lost its structural features ( For example, degree ), We confirm that node degree is important for role assignment . therefore , We are VAE A degree regularization constraint is added to the , To maintain structural consistency and improve performance .
- We use our model RESD Several extensive experiments have been carried out on real networks , The results are compared with other advanced methods . The experimental results prove the superiority of the model , It is proved that the model can well adapt to the changes of network scale and dimension .
2. Related work
Traditional research is generally Define a character through a graphical representation . One of the most popular methods is the block model , It builds a framework , Nodes represent roles , Edges represent the interaction between characters
Concor[25] Is the first person to recursively calculate the adjacency matrix correlation .
Random block model (SBM)[26] It's a classic way , It introduces random equivalence into role identity , therefore , If the probability of linking to all other nodes is the same , The node is in the same role
Then on SBM Improved , To enable them to better discover roles . for example , Mixed member random block model (MMSB)[27] Using the idea of block model , And assume that a node can belong to multiple roles .
Other traditional methods Using the mechanism of feature engineering , Shift focus from graphical representation to node representation .
They tried different metrics to evaluate some specific structural consistency of nodes , Such as degree centricity 、 Centrality of eigenvectors [28]、PageRank Centrality [29] And structural void value [30], These measures may show good results in some special tasks .
REFEX[20] It is an effective method to obtain structural features . It recursively aggregates neighborhood features and is based on eGonet Characteristics of , To get global structure information . But they are still difficult to apply to discovering node roles and complex tasks , Because the saved structure information is too limited
lately , In order to capture comprehensive role information and apply it to complex machine learning tasks , Researchers are trying to take advantage of network embeddedness . Depending on the embedding mechanism , Most of these role-based network embedding methods can be divided into three categories :
- Language model based approach
- Matrix factorization method
- Deep learning method
1) Language model based on random walk Skip-Gram[19]
Classical network embedding method based on random walk , Such as DeepWalk[14] and node2vec[8], Are bound to the community , Because they're based on id The random walk of , But they Not applicable to role discovery .
Some methods try to improve random walk , To make it suitable for role detection .
- The most effective model for such role discovery is struc2vec[31]. The algorithm uses dynamic time warping (DTW) Distance calculates multiple feature-based similarity between all node pairs , Then construct a complete graph hierarchy , The edge weight is transformed by feature-based similarity and used to calculate the probability of random walk .
- lately ,role2vec[12] Based on Motif Random roaming is applied to the feature of ,
- node2bit[32] The same idea is applied to the user stitching task , Temporary roaming is also used CTDNE[33].
- RiWalk[34] A new random walk method is designed to approximate the graph kernel on a subgraph .
2) Matrix decomposition mechanism
Most of the existing methods construct Characteristic matrix or similarity matrix , And then through Matrix factorization computes compact nodal representations
- RolX[21] Is the first to use this mechanism to generate role-based node embedding . It USES REFEX[20] Extracted structural features , Low dimensional nodal representation is obtained by matrix decomposition . Practice has proved , This model has great potential in role discovery .
- GLRD[22] Taking advantage of these features
- Regal[35] Medium xNetMF An eigenmatrix is also constructed and decomposed
- Hone[36] By decomposing a set of Motif Matrix to generate higher-order network embedding . They can effectively extract the structural similarity between nodes .
- Some methods use classical methods to calculate pairwise similarity .Reaction[37] application RoleSim[38] Matrix to obtain the similarity between each pair of nodes , To get the node representation
- SEGK[39] The similarity matrix of graph kernel computing is decomposed .
- GraphWave[40] The embedding is also generated by applying the graph diffusion kernel and treating it as a probability distribution on the graph .
- Another job [41] Four methods based on nonnegative matrix factorization are designed (NMF) Multi layer network embedding algorithm , While maintaining the structural characteristics . But this method based on matrix decomposition has high computational and spatial complexity , Not applicable except for RolX Large scale network beyond .
3) Deep learning framework based on neural network , To capture the highly nonlinear relationship between network topology and node representation [43].
- DRNE[23] Application layer normalized LSTM[44] To learn regular equivalence . The algorithm samples the neighborhood of each node and sorts by node degree , Then the representation of neighborhood is regarded as the regularization of multi-layer perceptron LSTM The input of , Reconstruct the node degree
- Demo-Net[45] A degree specific GCN, It uses different parameters according to the node degree , But it is aimed at semi supervised conditions
- DMER[46] Make use of adjacency GCN Encoder and REFEX[20] Functional automatic encoder
- GAS[47] It also makes use of the GCN To generate node embeddedness
However , These deep learning methods are still sensitive to noise .
3. Method
3.1. Notions
G = { V , E } G = \{V, E\} G={ V,E}
N ( u ) = { u ∈ V ∣ ( u , v ) ∈ E } N(u)=\{u \in V| (u, v) \in E\} N(u)={ u∈V∣(u,v)∈E} : node u u u The neighborhood of
d ( u ) d(u) d(u): node u u u Degree
the one-hop egonet of node v i v_i vi: G i = { V ( g i ) , E ( g i ) } G_i = \{V(g_i), E(g_i)\} Gi={ V(gi),E(gi)}
- V ( g i ) = { v i } ∪ { v j ∈ V ∣ ( v i , v j ) ∈ E } V(g_i)=\{v_i\} \cup \{v_j \in V| (v_i,v_j)\in E\} V(gi)={ vi}∪{ vj∈V∣(vi,vj)∈E}, And the summit v i v_i vi Set of connected nodes
- E ( g i ) = { ( u , v ) ∈ E ∣ u , v ∈ V ( g i ) } E(g_i)=\{(u,v)\in E| u,v \in V(g_i)\} E(gi)={ (u,v)∈E∣u,v∈V(gi)}: node v i v_i vi The edge set of vertices between its neighborhood set ( That is to say V ( g i ) V(g_i) V(gi) The subgraphs that make up the original graph )
X = R n × m \boldsymbol X= \mathbb{R}^{n \times m} X=Rn×m: Represents the extracted feature space ,m For the dimension of characteristics
Low dimensional vector space ( The goal is ) R k \mathbb{R}^k Rk, among k < < ∣ V ∣ k << |V| k<<∣V∣
3.2. Model
RESD Frame diagram
RESD There are three parts
- The process of character feature extraction
- A variational automatic encoder for combining features into node representations
- Node degree based regularization for enhancing the representation ability of learning embedding .
3.2.1. Feature extraction
If two nodes u u u and v v v Have similar structural identity , Then they tend to share the same roles .
Although the adjacency matrix of a graph contains all the topological information , But because adjacency matrix is usually sparse , The nodes may be far apart , Even disconnected , It is difficult to compare the structural consistency of two nodes .
therefore , Structural similarity cannot be learned directly from adjacency matrix , It is necessary to extract compact high-order structural features from adjacency relations .
Early work has shown that , many High order feature extraction is effective in role discovery , Based on Motif Characteristics of counting [12]、 Multi hop neighbor degrees [31] and REFEX[20]. All of these methods can serve as a framework RESD Feature extraction in
ad locum , We used refex[20]
Because of its efficiency and effectiveness , It aggregates simple local and egonet Features to generate high-order structural features
This method extracts one local feature and five egonet Characteristics of .
For the node v i v_i vi, The local characteristic is nodal degree d ( v i ) d(v_i) d(vi), be based on G i G_i Gi Of egonet The characteristics are defined as follows :
3.2.2. Variational auto-encoder
Because there is a lot of noise in the real network , The extracted high-order structural features are often very sensitive to noise , Can not meet the needs of role discovery .
therefore , We use a variational auto encoder (VAE)[24] The structural features are encoded as nodes
VAE Reference material :
https://zhuanlan.zhihu.com/p/418352070
https://zhuanlan.zhihu.com/p/260627899#:~:text=%E5%8F%AF%E5%8F%98%E8%87%AA%E5%8A%A8%E7%BC%96%E7%A0%81%E5%99%A8%E8%83%BD,%E7%90%86%E6%83%B3%E8%BE%93%E5%87%BA%E7%9A%84%E9%97%B4%E9%9A%99%E3%80%82
VAE It can reduce noise , Capture the nonlinear relationship between input features . And ordinary automatic encoder (AE) comparison ,VAE Effectively The over fitting problem is avoided , Improved robustness of output embedding
We use a multilayer perceptron model instead of a graphical neural network (GNN) As an encoder
because GNN Always Smooth embedding The fact that the extracted feature is negated .
In our proposal RESD Model VAE part , Given the extracted node v j v_j vj Eigenvector of X j X_j Xj, Encoder The definition is as follows :
- W l W^l Wl and b l b^l bl It's No l Layer weights and offsets bias
- L: Number of hidden layers in the encoder
- tanh(·) Is a hyperbolic tangent activation function
- h j 0 = X j h^0_j=X_j hj0=Xj: The input of the encoder
Let's assume that nodes v j v_j vj It means Z j Z_j Zj Obey Gauss distribution P j ( Z j ) \mathbb{P_j}(Z_j) Pj(Zj), among Z j Z_j Zj All elements of are independent , also P j ( Z j ) = N j ( μ j , d i a g ( σ j 2 ) ) \mathbb{P_j}(Z_j)={N_j}(μ_j, diag(σ^2_j)) Pj(Zj)=Nj(μj,diag(σj2))
The distribution can be learned by the following formula P j ( Z j ) \mathbb{P_j}(Z_j) Pj(Zj) Parameters in μ j μ_j μj and σ j σ_j σj
- among W μ 、 b μ 、 W σ W^μ、b^μ、W^σ Wμ、bμ、Wσ and b σ b^σ bσ The weights and offsets of the last layer of the encoder , The previous parameters are shared
We are right. σ σ σ Take the logarithm , To ensure that the standard deviation is always positive .
then , We can use the reparameterization technique [24] Get the node representation Zj, This allows us to calculate μ j μ_j μj and σ j σ_j σj Gradient of
among , ⊙ ⊙ ⊙ Is an element level product , ε ε ε Obey standard Gaussian distribution
about RESD Decoder part of , We also use the multilayer perceptron model :
- W ^ l \hat W^l W^l and b ^ l \hat b^l b^l It is the... In the decoder l l l Weight and deviation of hidden layer
- h ^ j 0 = Z j \hat h^0_j = Z_j h^j0=Zj Input to the decoder
- X ^ \hat X X^ Is the reconstructed characteristic matrix , among X ^ j = h ^ j L \hat X_j =\hat h^L_j X^j=h^jL
Last , We will VAE Loss is defined as :
3.2.3. Regularization
Despite the above VAE The model has the ability to generate representations , But when estimating the distribution of input characteristics , it Some key structural information may be lost
therefore , In embedded information Adding constraints To preserve critical structural information is critical .
From the structure equivalence of traditional graph based role discovery methods 、 Rules such as equivalence can be inferred , Nodes that play the same role have similar degrees .
therefore , Degree is the most important structural feature of a node , And through many methods such as Demo-Net[45] and DRNE[23] It is verified by experiments
Be similar to DRNE[23],RESD A hierarchy based regularizer is defined as follows :
- among MLP(·) Is a modified linear unit active ReLU(·) Multi layer perceptron model .
We train our deep learning framework by jointly minimizing the loss of feature reconstruction and degree regularization constraints RESD, As shown below :
- among β Is the weight of the degree based regularizer
RESD The algorithm pseudo code is as follows :
Personal understanding :
- First, according to the adjacency matrix of the graph Use refex Extract feature matrix
- Then traverse the nodes in the graph in turn
- When traversing to each node :
- First use the equation (3) Gain due to Encoder The resulting vector Z j Z_j Zj
- Then use the equation (5) Calculate due to VAE The loss gained
- Then use the equation (7) Calculate the total loss after adding the regular constraint
- Take advantage of this total loss Back propagation Update parameters
- Apply the trained parameters , Get the node representation matrix Z( Is the last vector space we need )
Z Z Z It is obtained by the encoder Z j Z_j Zj Composed of , That is to say The result of the encoder is the embedding in the low dimensional space we need !
3.3. Computational complexity analysis
Given network G, set up
- n Represents the number of nodes ,
- e Indicates the number of sides ,
- r Express ReFeX The number of characteristic aggregations of ,
- m Represents the extracted feature matrix X Dimension of
- k Indicates that the node is embedded Z Dimension of
ReFeX The whole process of O ( r m ( e + n m ) ) O(rm(e + nm)) O(rm(e+nm))
Because the calculation amount of each feature aggregation is greater than that of the previous aggregation .
Each node uses VAE Express , The calculation complexity is O ( m 2 k ) O(m^2k) O(m2k), Regularization takes time O ( k 2 ) O(k^2) O(k2).
therefore k < m k < m k<m The embedded generation over the entire network is calculated as O ( n m 2 k ) O(nm^2k) O(nm2k).
because r It's always small , And r < k < m ≪ n r < k < m≪n r<k<m≪n, therefore RESD The whole calculation of O ( r e m + n m 2 k ) O(rem+nm^2k) O(rem+nm2k).
therefore , our RESD Framework for spending O ( n m 2 k ) O(nm2k) O(nm2k) The large-scale network of time has great potential .
4. Experiments
4.1. Baselines
- node2vec [8]
- role2vec [12]
- struc2vec [31]
- RolX [21]
- DRNE [23]
- struc2gauss [48]
4.2. Datasets
Data sets ( Undirected power map )
4.3. Experiment settings
In the part of feature extraction , We use ReFeX, Set up
- The characteristic aggregation times are 3 Time
- The number of boxes is set to 5 Time
about VAE The encoder and decoder of , We will
- The number of hidden layers is set to 2
For degree based regularizers , We used a with ReLU Activate the 2 layer MLP Model
We use a learning rate of 0.001 Of Adam SGD Optimizer [52], Set the weight to 0.001 Of L2 Regularization to avoid over fitting .
For most 50 individual epoch, The batch size is set to 32.
In our later experiments , If not specified , The embedded dimension is set to 128,β Set to 0.6.
4.4. Role classification
We randomly input different proportions of data to train an embedded linear logistic regression classifier , The rest is for testing .
We run all baseline methods and RESD 20 Time , And calculate their average on each network F1 score .
To verify the validity of degree preserving constraints , We also ran a Without a degree based regularizer RESD variant , be called RES
The results are as follows :
4.5. Bot detection
In this section , Our goal is to explore our model Can some rare roles in the network be detected .
We chose two Wikipedia chat networks where robots make up only a small part of the total users .
In these networks Detecting robots is very complicated .
- First generate a representation of each user
- For every robot , We Calculate the Euclidean distance between it and other users
- Last , We choose the one closest to each robot K Users , Calculate the average accuracy of the robot
It means something like this :
First of all, for each user( Include bot robot )
And then use the low dimensional representation of bot Find the one closest to it k individual user
Look at this. k individual user in bot What is the proportion of
If this ratio is very high It indicates that the degree of embedding in rare roles is very high ( The better the model )
The results are as follows
- On robot detection tasks K Accuracy of selected users
- OT Indicates that the method cannot be used in 48 Get results within hours .
4.6. Parameter sensitivity and computational complexity
The weight of regularization β And the parameter sensitivity of embedding dimension
Baseline approach and RESD Running time of
- Where the full value represents 48 No results within hours
- The output embedding dimension of all these methods is set to 128
4.7. Visualization
Data set used : Brazil
utilize t-SNE Reduce the embedded dimension to 2 For visualization
- Each node is represented as a point
- The color indicates its role
Ideally , Points of the same color should be very close to each other , And the dots of different colors should be far away .
Visualization results :
5. Conclusion
Insufficient :
- The structural features in the model are too simple
- The encoder ignores the network structure
- This model can only be applied to undirected graphs without authority
In the future work , We plan to use graph neural networks (GNN) To encode features , At the same time, the complex relationships between nodes are extracted .
We also hope to design a novel deep learning framework , Produce more expressive structural features than handmade works .
As far as we know , Most role-based network embedding methods only focus on static networks , The additional attributes on nodes or edges are ignored . In order to better understand the impact of roles , We will further explore dynamic networks , Introduce attributes to learn more efficient representations , This may be an attractive job in the future .
After reading the summary
With the increase of reading papers The speed of reading papers is also faster
Not as strange as before
I still have a little understanding of some of the things mentioned in the article
The main idea of this article is :
- Through the adjacency matrix Use Refex The extracted features
- Re pass VAE Reduce the impact of noise Improve the robustness of the model
- And because of the use of VAE Some structural information will be lost
- Then the regular constraint is introduced : degree
- A combination of the two It is convenient to better extract the high-order features of nodes
Conclusion
The article is only used as a personal study note , Record from 0 To 1 A process of
Hope to help you a little bit , If you have any mistakes, you are welcome to correct them
边栏推荐
- Kotlin Foundation
- Guiding principle - read source code
- Neat Syntax Design of an ETL Language (Part 2)
- Tutorial on installing SSL certificates in Microsoft Exchange Server 2007
- I have summarized the knowledge points of JS [intermediate and advanced] for you
- Android database security: after the user exits, the transaction rollback log still stores relevant data information
- The path of Architects
- WPF Prism框架
- MongoDB的原理、基本使用、集群和分片集群
- Pytorch_ Geometric (pyg) uses dataloader to report an error runtimeerror: sizes of tenants must match except in dimension 0
猜你喜欢
How to develop wechat applet? How to open a wechat store
Free applet making tool, how to make wechat applet
Basic use and cluster construction of consult
Houdini图文笔记:Your driver settings have been set to force 4x Antialiasing in OpenGL applications问题的解决
How to make small programs on wechat? How to make small programs on wechat
Deep understanding of JVM - JVM memory model
链表 删除链表中的节点
Can two Mitsubishi PLC adopt bcnettcp protocol to realize wireless communication of network interface?
如何在Microsoft Exchange 2010中安装SSL证书
Yolov5更换上采样方式
随机推荐
I'm afraid of the goose factory!
ScheduleMaster分布式任务调度中心基本使用和原理
View. post VS Handler. Differences and usage scenarios of post
【RPC】I/O模型——BIO、NIO、AIO及NIO的Rector模式
ShardingSphere-Proxy 4.1 分库分表
(forwarding articles) after skipping multiple pages, shuttle returns to the first page and passes parameters
‘Flutter/Flutter. h‘ file not found
Basic use and principle of Minio
SQL to object thinking vs SQL of object thinking
Flask blog practice - realize personal center and authority management
Processing picture class library
How much does a small program cost? How much does a small program cost? It's clear at a glance
Tiktok brand goes to sea: both exposure and transformation are required. What are the skills of information flow advertising?
Download the arm64 package of Debian on X86 computer
String longest common prefix
Neat Syntax Design of an ETL Language (Part 2)
How to "transform" small and micro businesses (I)?
Redis (II) distributed locks and redis cluster construction
Cocopod error failed: undefined method `map 'for nil:nilclass
WPF Prism框架