Graphs are commonly used in different real-world applications, e.g., ocial networks are large graphs of people that follow each other. Graph embeddings are the transformation of property graphs to a vector or a set of vectors. Embedding should capture the graph topology, vertex-to-vertex relationship, and other relevant information about graphs, subgraphs, and vertices. More properties embedder encode better results can be retrieved in later tasks.
Distributed Representations of Words and Phrases and their Compositionality
Word2vec is one of the most commonly used method to convert the text data to feature vector. It utilizes the co-occurences of words (n-gram) in document corprous to compute the embedding. It formluates the embedding as one of the following problems:
- cbow: predict the word in the middle of a sentence;
- skip-gram: predict the words before and after the given words.
DeepWalk: Online Learning of Social Representations proposed in 2014 was one of most important papers for graph embedding idea. The idea is that:
- starting from a node, start a random walk on the graph;
- feed the sequence of nodes visited on the random walk into sequence embedding method, e.g., word2vec;
The most important step is the first one, more specifically, how to define the random walk. In DeepWalk, the transition probability from Node a to Node b is proportional to the the weight between a and b.
Large scale information network embedding
Line considers the similarity of vertices in two aspects:
- 1st order: weights of edges between two vertices
- 2nd order: how many neighbors do the vertices share
Learning Graph Representations with Global Structural
Compared with Line, GraRep considers higher order similarities, where k-order similarity is computed as the probability of traversing from vertex to the other vertex within k-hoops.
Compared with DeepWalk, node2vec: Scalable Feature Learning for Networks improves homophily and structural equivalence by using a different transition probability:
- homophily: the nodes close to each other in the graph should have similar embedding vector;
- structural equivalence: the nodes which has similar neighbor structure should have similar embedding vector;
BFS prefers homophily and DFS prefers structural equivalence, which could be controlled by transition probablity. Especially, the transition probablity is computed as:
where $w_vc$ is the weight of the edge between v and c; $d_tx$ is the distance between t and x. Smaller p encourages BFS; Smaller q encourages DFS.
struc2vec:Learning Node Representations from Structural Identity
struc2vec proposes that embedding should not consider the direct connectivty of two vertices, but the similarity of structure of two vertices in the network. Note in many applications, similar vertices does tend to be connected to each other; but this may not hold in all applications.
Let $f_k(u,v)$ is the similarity of two nodes up to k-distance:
$g$ is function for measuring similarity of two sequence, $R$ is a traverse sequence.
Inductive representation learning on large graphs
All the methods above need to redo all the computation, when the graph changes. GraphSAGE tries to address this problem by learning aggregate functions instead of the embedding itself. It contains following steps:
- for each vertex, sampling fixed number of neighbors;
- aggregate the features from its neighbor and concate the aggregated feature to current feature;
- repeat for k steps.
The aggregate functions should be orderless, where some typically choices are:
- average pooling
- max pooling
- GCN aggregator
Context-Aware Network Embedding for Relation Modeling
CANE considers the similarity not only in the structure but also the in the property of vertices and edges. It utilizes word2vec to convert the text associated with each node to a list of feature vectors then a matrix; CNN is applied to extract feature from the matrix; then row pooling and column pooling is applied to extract attention vector.
Structural Deep Network Embedding
SDNE is based on auto-encoder. It takes the adjacency matrix as input and considers two conditions:
- two conected vertices should have similar embeddings
- the embedding should be able reconstruct the input
GraphGAN: Graph Representation Learning with Generative Adversarial Nets
The basic idea of GraphGAN is generator will generate a new edges in the graph and discriminator will predict whether an edge exists in the graph or not. On the ideal case, generator will be able to generate edges (based on similarity of embeddings of two edges) where discriminator cannot tell whether it exists in the original graph or not. That means, the embedding captures the 1st order information of the graph.
Learning Structural Node Embeddings via Diffusion Wavelets
Enhanced Graph Embedding with Side Information improves DeepWalk by incorporating side information. In the recommendation system, code start is a chanllenging but common problem, where a new vertex has few connections to the existing vertices in the graph. DeepWalk cannot generate meaningful embedding for the new vertex. EGES propose to use side information to address such problem.
The idea of EGES is intuitive: using different side information to generate different embeddings; then applying average pool to combine those embeddings.