当前位置：网站首页>[paper reading ｜ depth] role based network embedding via structural features reconstruction with degree regulated

[paper reading ｜ depth] role based network embedding via structural features reconstruction with degree regulated

2022-06-25 10:20:00 【Haihong Pro】

Preface

Hello！
Thank you very much for reading Haihong's article , If there are mistakes in the text , You are welcome to point out ～

Self introduction. ଘ(੭ˊᵕˋ)੭
nickname ： Sea boom
label ： Program the ape ｜C++ player ｜ Student
brief introduction ： because C Language and programming , Then I turned to computer science , Won a National Scholarship , I was lucky to win some national awards in the competition 、 Provincial award … It has been insured .
Learning experience ： Solid foundation + Take more notes + Knock more code + Think more + Learn English well ！

Only hard work

Know what it is Know why ！

This article only records what you are interested in

ABSTRACT

Role-based network embedding aims to map the network into a low dimensional node representation , While maintaining structural similarity .

Adjacency matrix contains local information of the network , It also contains the global information of the network , But it can not directly express the role of nodes .

therefore , Extract high-order structural features from adjacency matrix For role-based network embedding ( Structural equivalence ) crucial .

Generally speaking , In real networks , Features extracted in the same way Sensitive to noise , But it doesn't really represent the role of nodes .

therefore , We propose a deep learning framework RESD

First , We propose to extract the high-order structural features of each node in the network
then , We use variational self coding (VAE) Modeling the nonlinear relationship of features , Reduce noise , Improve the robustness of embedding
Besides , In embedded space , We apply degree regularization constraints to guide representation learning , To retain the key structural information of the node ( Rudu ), This information may be due to VAE The principles of the framework are lost
Last , We construct a unified objective function , By retaining the structural features and node degree, we can learn the role discovery of node embedding

1. Introduction

Community based network embedding （ Preserve node proximity ）

DeepWalk [14] and node2vec [8] , By random walk
GCN[15] and GraphSAGE[16], Spread features through the Internet

However , Role is another important point in complex networks , It is fundamentally different from the concept of community .

The definition of a role is defined as Classes of equivalent nodes [17], Used to represent the structural similarity between nodes .

let me put it another way , A role is a node partition that satisfies the following conditions : Nodes are structurally more similar to nodes in the same partition than to nodes outside the partition

Intuitively speaking , The role represents the connection mode of the nodes in the neighborhood .

Insert picture description here

Pictured 1（ Upper figure ） Shown , Two red nodes share the same role , Because they can be classified as bridges of two subgraphs , The Yellow node has the same role , Because they are all centers , Even if they are far away from each other or not connected

therefore , Role based network embedding is designed to project the network into the embedding space , In this space , Nodes of the same character are more closely connected .

in summary , Although these two problems （ Based on community or role ） They are all devoted to the clustering of nodes in the network , But they have essential differences and their respective priorities [18].

Community based network embedding methods usually can not solve role-based tasks , Because of them Unable to capture and model the structural features of nodes

lately , Some methods have been proposed to Get structural similarity and Get role based Network embedding .

One of the most representative methods is based on random walk .

The first challenge of character discovery is ： Nodes in the same character may not be directly connected or away from each other . To solve this problem , Some methods improve the ordinary random walk .

for example ,role2vec[12] use Random walk based on feature Instead of based on id To generate a sequence as a random walk SkipGram[19] The input of .
However , It is still unable to handle the structural identity extraction of remote nodes , Can not be applied to large-scale networks .

Some ways to try Using adjacency matrix Instead of random walks, it can alleviate the problem of non connectivity between nodes of the same role .

However , The adjacency matrix of the network can contain first-order and higher-order relations , Can not directly express structural similarity .

Therefore, adjacency matrix is not suitable for simply generating role-based embedding .

then , Some methods turn to Structural feature extraction based on adjacency matrix For role discovery . for example ,refex[20] Is a recursive aggregation of neighborhood features and is based on egonet An effective method for characterizing . It has good performance , A relatively compact representation is obtained .

But as the number of iterations increases , Each node will encounter a higher-order neighborhood , this May cause over fitting , As a result, the features can not truly reflect the structural information .

RolX[21] and GLRD[22] utilize REFEX Feature and decompose the feature matrix to obtain a compact node representation .

However , Based on matrix factorization There are ways to High computational complexity The problem of . what's more , The features they use depend on neighborhood information , This makes them useful for Noise is very sensitive , And the embeddedness they generate loses some statistical characteristics of nodes , Such as node degree .

Inspired by deep learning ,DRNE[23] Using the layer normalized by node degree LSTM To recursively aggregate neighbor representations to obtain rule equivalence . This mechanism can effectively capture the structural similarity between remote nodes , But the noise is still ignored .

most important of all , To learn role-based node embedding , We have to The challenges of non connectivity and noise are solved while maintaining the statistical characteristics of nodes .

So , We propose a deep learning framework for structural feature reconstruction based on degree regularity constraints , Called role-based network embedding .

The framework consists of feature extraction 、 Embedding generation and topology feature preservation Three parts .

First, the high-order structural features of each node are extracted . In this paper , We choose REFEX As a sample feature extraction method , To improve its efficiency and effectiveness .
then , Using the variational automatic encoder on the characteristic matrix (VAE) To reduce noise , Get a more compact representation , And it can capture the nonlinear relationship between input and node embedding , It has good robustness , Avoid over fitting .
- However ,VAE The coding process may lead to the loss of some key structural features of nodes .
- therefore , stay VAE To maintain the degree of nodes , Play an auxiliary role in guiding the optimization direction .

Our main contribution :

This paper proposes a new framework for role-based network embedding through structural feature reconstruction . This process allows for a more compact representation of nodes , And improve the performance of the elements
We use the variational auto coder to capture the nonlinear relationship between input and node embedding , While maintaining structural similarity . It can also reduce the noise in the original network , Increase the robustness of the representation .
The coding process lost its structural features ( For example, degree ), We confirm that node degree is important for role assignment . therefore , We are VAE A degree regularization constraint is added to the , To maintain structural consistency and improve performance .
We use our model RESD Several extensive experiments have been carried out on real networks , The results are compared with other advanced methods . The experimental results prove the superiority of the model , It is proved that the model can well adapt to the changes of network scale and dimension .

2. Related work

Traditional research is generally Define a character through a graphical representation . One of the most popular methods is the block model , It builds a framework , Nodes represent roles , Edges represent the interaction between characters

Concor[25] Is the first person to recursively calculate the adjacency matrix correlation .
Random block model (SBM)[26] It's a classic way , It introduces random equivalence into role identity , therefore , If the probability of linking to all other nodes is the same , The node is in the same role
Then on SBM Improved , To enable them to better discover roles . for example , Mixed member random block model (MMSB)[27] Using the idea of block model , And assume that a node can belong to multiple roles .

Other traditional methods Using the mechanism of feature engineering , Shift focus from graphical representation to node representation .

They tried different metrics to evaluate some specific structural consistency of nodes , Such as degree centricity 、 Centrality of eigenvectors [28]、PageRank Centrality [29] And structural void value [30], These measures may show good results in some special tasks .

REFEX[20] It is an effective method to obtain structural features . It recursively aggregates neighborhood features and is based on eGonet Characteristics of , To get global structure information . But they are still difficult to apply to discovering node roles and complex tasks , Because the saved structure information is too limited

lately , In order to capture comprehensive role information and apply it to complex machine learning tasks , Researchers are trying to take advantage of network embeddedness . Depending on the embedding mechanism , Most of these role-based network embedding methods can be divided into three categories ：

Language model based approach
Matrix factorization method
Deep learning method

1） Language model based on random walk Skip-Gram[19]

Classical network embedding method based on random walk , Such as DeepWalk[14] and node2vec[8], Are bound to the community , Because they're based on id The random walk of , But they Not applicable to role discovery .

Some methods try to improve random walk , To make it suitable for role detection .

The most effective model for such role discovery is struc2vec[31]. The algorithm uses dynamic time warping (DTW) Distance calculates multiple feature-based similarity between all node pairs , Then construct a complete graph hierarchy , The edge weight is transformed by feature-based similarity and used to calculate the probability of random walk .
lately ,role2vec[12] Based on Motif Random roaming is applied to the feature of ,
node2bit[32] The same idea is applied to the user stitching task , Temporary roaming is also used CTDNE[33].
RiWalk[34] A new random walk method is designed to approximate the graph kernel on a subgraph .

2） Matrix decomposition mechanism

Most of the existing methods construct Characteristic matrix or similarity matrix , And then through Matrix factorization computes compact nodal representations

RolX[21] Is the first to use this mechanism to generate role-based node embedding . It USES REFEX[20] Extracted structural features , Low dimensional nodal representation is obtained by matrix decomposition . Practice has proved , This model has great potential in role discovery .
GLRD[22] Taking advantage of these features
Regal[35] Medium xNetMF An eigenmatrix is also constructed and decomposed
Hone[36] By decomposing a set of Motif Matrix to generate higher-order network embedding . They can effectively extract the structural similarity between nodes .
Some methods use classical methods to calculate pairwise similarity .Reaction[37] application RoleSim[38] Matrix to obtain the similarity between each pair of nodes , To get the node representation
SEGK[39] The similarity matrix of graph kernel computing is decomposed .
GraphWave[40] The embedding is also generated by applying the graph diffusion kernel and treating it as a probability distribution on the graph .
Another job [41] Four methods based on nonnegative matrix factorization are designed (NMF) Multi layer network embedding algorithm , While maintaining the structural characteristics . But this method based on matrix decomposition has high computational and spatial complexity , Not applicable except for RolX Large scale network beyond .

3） Deep learning framework based on neural network , To capture the highly nonlinear relationship between network topology and node representation [43].

DRNE[23] Application layer normalized LSTM[44] To learn regular equivalence . The algorithm samples the neighborhood of each node and sorts by node degree , Then the representation of neighborhood is regarded as the regularization of multi-layer perceptron LSTM The input of , Reconstruct the node degree
Demo-Net[45] A degree specific GCN, It uses different parameters according to the node degree , But it is aimed at semi supervised conditions
DMER[46] Make use of adjacency GCN Encoder and REFEX[20] Functional automatic encoder
GAS[47] It also makes use of the GCN To generate node embeddedness

However , These deep learning methods are still sensitive to noise .

3. Method

3.1. Notions

$G = \{V, E\}$

$N(u)=\{u \in V| (u, v) \in E\}$ : node $u$ The neighborhood of

$d (u)$ ： node $u$ Degree

the one-hop egonet of node $v_i$ : $G_i = \{V(g_i), E(g_i)\}$

$V(g_i)=\{v_i\} \cup \{v_j \in V| (v_i,v_j)\in E\}$ , And the summit $v_i$ Set of connected nodes
$E(g_i)=\{(u,v)\in E| u,v \in V(g_i)\}$ ： node $v_i$ The edge set of vertices between its neighborhood set （ That is to say $V(g_i)$ The subgraphs that make up the original graph ）

$\boldsymbol X= \mathbb{R}^{n \times m}$ ： Represents the extracted feature space ,m For the dimension of characteristics

Low dimensional vector space （ The goal is ） $\mathbb{R}^k$ , among $k < < ∣ V ∣$

Insert picture description here

3.2. Model

RESD Frame diagram
Insert picture description here
RESD There are three parts

The process of character feature extraction
A variational automatic encoder for combining features into node representations
Node degree based regularization for enhancing the representation ability of learning embedding .

3.2.1. Feature extraction

If two nodes $u$ and $v$ Have similar structural identity , Then they tend to share the same roles .

Although the adjacency matrix of a graph contains all the topological information , But because adjacency matrix is usually sparse , The nodes may be far apart , Even disconnected , It is difficult to compare the structural consistency of two nodes .

therefore , Structural similarity cannot be learned directly from adjacency matrix , It is necessary to extract compact high-order structural features from adjacency relations .

Early work has shown that , many High order feature extraction is effective in role discovery , Based on Motif Characteristics of counting [12]、 Multi hop neighbor degrees [31] and REFEX[20]. All of these methods can serve as a framework RESD Feature extraction in

ad locum , We used refex[20]

Because of its efficiency and effectiveness , It aggregates simple local and egonet Features to generate high-order structural features

This method extracts one local feature and five egonet Characteristics of .

For the node $v_i$ , The local characteristic is nodal degree $d(v_i)$ , be based on $G_i$ Of egonet The characteristics are defined as follows ：

Insert picture description here

3.2.2. Variational auto-encoder

Because there is a lot of noise in the real network , The extracted high-order structural features are often very sensitive to noise , Can not meet the needs of role discovery .

therefore , We use a variational auto encoder (VAE)[24] The structural features are encoded as nodes

VAE Reference material ：
https://zhuanlan.zhihu.com/p/418352070
https://zhuanlan.zhihu.com/p/260627899#:~:text=%E5%8F%AF%E5%8F%98%E8%87%AA%E5%8A%A8%E7%BC%96%E7%A0%81%E5%99%A8%E8%83%BD,%E7%90%86%E6%83%B3%E8%BE%93%E5%87%BA%E7%9A%84%E9%97%B4%E9%9A%99%E3%80%82

VAE It can reduce noise , Capture the nonlinear relationship between input features . And ordinary automatic encoder (AE) comparison ,VAE Effectively The over fitting problem is avoided , Improved robustness of output embedding

We use a multilayer perceptron model instead of a graphical neural network (GNN) As an encoder

because GNN Always Smooth embedding The fact that the extracted feature is negated .

In our proposal RESD Model VAE part , Given the extracted node $v_j$ Eigenvector of $X_j$ , Encoder The definition is as follows ：

Insert picture description here

$W^l$ and $b^l$ It's No l Layer weights and offsets bias
L： Number of hidden layers in the encoder
tanh(·) Is a hyperbolic tangent activation function
$h^0_j=X_j$ ： The input of the encoder

Let's assume that nodes $v_j$ It means $Z_j$ Obey Gauss distribution $\mathbb{P_j}(Z_j)$ , among $Z_j$ All elements of are independent , also $\mathbb{P_j}(Z_j)={N_j}(μ_j, diag(σ^2_j))$

The distribution can be learned by the following formula $\mathbb{P_j}(Z_j)$ Parameters in $μ_j$ and $σ_j$

Insert picture description here

among $W^μ、b^μ、W^σ$ and $b^σ$ The weights and offsets of the last layer of the encoder , The previous parameters are shared

We are right. $σ$ Take the logarithm , To ensure that the standard deviation is always positive .

then , We can use the reparameterization technique [24] Get the node representation Zj, This allows us to calculate $μ_j$ and $σ_j$ Gradient of

Insert picture description here
among , $⊙$ Is an element level product , $ε$ Obey standard Gaussian distribution

about RESD Decoder part of , We also use the multilayer perceptron model :

Insert picture description here

$\hat W^l$ and $\hat b^l$ It is the... In the decoder $l$ Weight and deviation of hidden layer
$\hat h^0_j = Z_j$ Input to the decoder
$\hat X$ Is the reconstructed characteristic matrix , among $\hat X_j =\hat h^L_j$

Last , We will VAE Loss is defined as :

Insert picture description here

3.2.3. Regularization

Despite the above VAE The model has the ability to generate representations , But when estimating the distribution of input characteristics , it Some key structural information may be lost

therefore , In embedded information Adding constraints To preserve critical structural information is critical .

From the structure equivalence of traditional graph based role discovery methods 、 Rules such as equivalence can be inferred , Nodes that play the same role have similar degrees .

therefore , Degree is the most important structural feature of a node , And through many methods such as Demo-Net[45] and DRNE[23] It is verified by experiments

Be similar to DRNE[23],RESD A hierarchy based regularizer is defined as follows :

Insert picture description here

among MLP(·) Is a modified linear unit active ReLU(·) Multi layer perceptron model .

We train our deep learning framework by jointly minimizing the loss of feature reconstruction and degree regularization constraints RESD, As shown below :

Insert picture description here

among β Is the weight of the degree based regularizer

RESD The algorithm pseudo code is as follows ：

Insert picture description here
Personal understanding ：

First, according to the adjacency matrix of the graph Use refex Extract feature matrix
Then traverse the nodes in the graph in turn
When traversing to each node ：
- First use the equation （3） Gain due to Encoder The resulting vector $Z_j$
- Then use the equation （5） Calculate due to VAE The loss gained
- Then use the equation （7） Calculate the total loss after adding the regular constraint
- Take advantage of this total loss Back propagation Update parameters
Apply the trained parameters , Get the node representation matrix Z（ Is the last vector space we need ）

$Z$ It is obtained by the encoder $Z_j$ Composed of , That is to say The result of the encoder is the embedding in the low dimensional space we need ！

3.3. Computational complexity analysis

Given network G, set up

n Represents the number of nodes ,
e Indicates the number of sides ,
r Express ReFeX The number of characteristic aggregations of ,
m Represents the extracted feature matrix X Dimension of
k Indicates that the node is embedded Z Dimension of

ReFeX The whole process of $O (r m (e + n m))$

Because the calculation amount of each feature aggregation is greater than that of the previous aggregation .

Each node uses VAE Express , The calculation complexity is $O(m^2k)$ , Regularization takes time $O(k^2)$ .

therefore $k < m$ The embedded generation over the entire network is calculated as $O(nm^2k)$ .

because r It's always small , And $r < k < m ≪ n$ , therefore RESD The whole calculation of $O(rem+nm^2k)$ .

therefore , our RESD Framework for spending $O (n m 2 k)$ The large-scale network of time has great potential .

4. Experiments

4.1. Baselines

node2vec [8]
role2vec [12]
struc2vec [31]
RolX [21]
DRNE [23]
struc2gauss [48]

4.2. Datasets

Data sets （ Undirected power map ）

Insert picture description here

4.3. Experiment settings

In the part of feature extraction , We use ReFeX, Set up

The characteristic aggregation times are 3 Time
The number of boxes is set to 5 Time

about VAE The encoder and decoder of , We will

The number of hidden layers is set to 2

For degree based regularizers , We used a with ReLU Activate the 2 layer MLP Model

We use a learning rate of 0.001 Of Adam SGD Optimizer [52], Set the weight to 0.001 Of L2 Regularization to avoid over fitting .

For most 50 individual epoch, The batch size is set to 32.

In our later experiments , If not specified , The embedded dimension is set to 128,β Set to 0.6.

4.4. Role classification

We randomly input different proportions of data to train an embedded linear logistic regression classifier , The rest is for testing .

We run all baseline methods and RESD 20 Time , And calculate their average on each network F1 score .

To verify the validity of degree preserving constraints , We also ran a Without a degree based regularizer RESD variant , be called RES

The results are as follows ：

Insert picture description here

4.5. Bot detection

In this section , Our goal is to explore our model Can some rare roles in the network be detected .

We chose two Wikipedia chat networks where robots make up only a small part of the total users .

In these networks Detecting robots is very complicated .

First generate a representation of each user
For every robot , We Calculate the Euclidean distance between it and other users
Last , We choose the one closest to each robot K Users , Calculate the average accuracy of the robot

It means something like this ：
First of all, for each user（ Include bot robot ）
And then use the low dimensional representation of bot Find the one closest to it k individual user
Look at this. k individual user in bot What is the proportion of
If this ratio is very high It indicates that the degree of embedding in rare roles is very high （ The better the model ）

The results are as follows

Insert picture description here

On robot detection tasks K Accuracy of selected users
OT Indicates that the method cannot be used in 48 Get results within hours .

4.6. Parameter sensitivity and computational complexity

The weight of regularization β And the parameter sensitivity of embedding dimension

Insert picture description here
Baseline approach and RESD Running time of

Insert picture description here

Where the full value represents 48 No results within hours
The output embedding dimension of all these methods is set to 128

4.7. Visualization

Data set used ： Brazil

utilize t-SNE Reduce the embedded dimension to 2 For visualization

Each node is represented as a point
The color indicates its role

Ideally , Points of the same color should be very close to each other , And the dots of different colors should be far away .

Visualization results ：

Insert picture description here

5. Conclusion

Insufficient ：

The structural features in the model are too simple
The encoder ignores the network structure
This model can only be applied to undirected graphs without authority

In the future work , We plan to use graph neural networks (GNN) To encode features , At the same time, the complex relationships between nodes are extracted .

We also hope to design a novel deep learning framework , Produce more expressive structural features than handmade works .

As far as we know , Most role-based network embedding methods only focus on static networks , The additional attributes on nodes or edges are ignored . In order to better understand the impact of roles , We will further explore dynamic networks , Introduce attributes to learn more efficient representations , This may be an attractive job in the future .

After reading the summary

With the increase of reading papers The speed of reading papers is also faster

Not as strange as before

I still have a little understanding of some of the things mentioned in the article

The main idea of this article is ：

Through the adjacency matrix Use Refex The extracted features
Re pass VAE Reduce the impact of noise Improve the robustness of the model
And because of the use of VAE Some structural information will be lost
Then the regular constraint is introduced ： degree
A combination of the two It is convenient to better extract the high-order features of nodes