LINE: Large-scale Information Network Embedding (Machine Learning with Graphs)

May 15, 2021
#graphs #embeddings #ml

This research paper is one of the early classic papers in the area of Machine Learning with Graphs. It embeds information networks into low-dimensional vector spaces. Information networks such as airline and communication networks can be very large, ranging from hundreds of nodes to millions or even billions of nodes. For this reason, the paper also proposes optimization schemes for efficient computation. Watch the video to know more :)

⏩ Abstract: This paper studies the problem of embedding very large information networks into low-dimensional vector spaces, which is useful in many tasks such as visualization, node classification, and link prediction. Most existing graph embedding methods do not scale for real world information networks which usually contain millions of nodes. In this paper, we propose a novel network embedding method called the "LINE," which is suitable for arbitrary types of information networks: undirected, directed, and/or weighted. The method optimizes a carefully designed objective function that preserves both the local and global network structures. An edge-sampling algorithm is proposed that addresses the limitation of the classical stochastic gradient descent and improves both the effectiveness and the efficiency of the inference. Empirical experiments prove the effectiveness of the LINE on a variety of real-world information networks, including language networks, social networks, and citation networks. The algorithm is very efficient, which is able to learn the embedding of a network with millions of vertices and billions of edges in a few hours on a typical single machine. The source code of the LINE is available online.
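For anyone who prefers symbols, here is a rough sketch of the "carefully designed objective function" the abstract refers to, written from my reading of the paper. The notation is an assumption on my side: u_i is the vector of vertex v_i, u'_j is the "context" vector of v_j, w_ij is the edge weight, W is the sum of all edge weights, and d_i is the weighted out-degree of v_i. Please check it against the paper before relying on it. The first-order objective applies to undirected graphs only, and the paper trains the two objectives separately and concatenates the resulting embeddings.

```latex
% First-order proximity (local structure): model vs. empirical distribution
p_1(v_i, v_j) = \frac{1}{1 + \exp(-\vec{u}_i^{\top} \vec{u}_j)},
\qquad
\hat{p}_1(i, j) = \frac{w_{ij}}{W},
\qquad
W = \sum_{(i,j) \in E} w_{ij}

% KL-divergence to cross entropy: the entropy term of \hat{p}_1 is constant
\mathrm{KL}(\hat{p}_1 \,\|\, p_1)
  = \sum_{(i,j) \in E} \hat{p}_1(i,j) \log \hat{p}_1(i,j)
  - \sum_{(i,j) \in E} \hat{p}_1(i,j) \log p_1(v_i, v_j)

% Dropping the constant term and the factor 1/W:
O_1 = - \sum_{(i,j) \in E} w_{ij} \log p_1(v_i, v_j)

% Second-order proximity (global structure): softmax over context vectors
p_2(v_j \mid v_i)
  = \frac{\exp(\vec{u}_j'^{\top} \vec{u}_i)}{\sum_{k=1}^{|V|} \exp(\vec{u}_k'^{\top} \vec{u}_i)},
\qquad
\hat{p}_2(v_j \mid v_i) = \frac{w_{ij}}{d_i}

% Weighting each vertex by \lambda_i = d_i and dropping constants:
O_2 = - \sum_{(i,j) \in E} w_{ij} \log p_2(v_j \mid v_i)

% Negative sampling replaces \log p_2(v_j \mid v_i) for each edge with
\log \sigma(\vec{u}_j'^{\top} \vec{u}_i)
  + \sum_{n=1}^{K} \mathbb{E}_{v_n \sim P_n(v)}
    \left[ \log \sigma(-\vec{u}_n'^{\top} \vec{u}_i) \right],
\qquad
P_n(v) \propto d_v^{3/4}
```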
Please feel free to share the content and subscribe to my channel :)
⏩ Subscribe - https://youtube.com/channel/UCoz8NrwgL7U9535VNc0mRPA?sub_confirmation=1

⏩ OUTLINE:
0:00 - Abstract
02:47 - First Order and Second Order Proximity Visualization in graphs
04:16 - Problem Definition (Information Network)
04:56 - Problem Definition (First Order Proximity)
05:28 - Problem Definition (Second Order Proximity)
06:03 - Problem Definition (Large-scale information network embedding)
07:25 - Objective for 1st order proximity in LINE
09:03 - KL-Divergence to Cross entropy derivation
10:31 - Objective for 2nd order proximity in LINE
12:42 - Combining first-order and second-order proximities
13:07 - Model Optimization via Negative Sampling
16:11 - Model Optimization via Edge Sampling
17:22 - LINE vs DeepWalk vs Node2Vec

⏩ Paper Title: LINE: Large-scale Information Network Embedding
⏩ Paper: https://arxiv.org/abs/1503.03578
⏩ Code: https://github.com/tangjianpku/LINE
⏩ Authors: Jian Tang, Meng Qu, Mingzhe Wang, Ming Zhang, Jun Yan, Qiaozhu Mei
⏩ Organisation: Microsoft Research Asia, Peking University, University of Michigan

⏩ IMPORTANT LINKS
Full Playlist on Machine Learning with Graphs: https://www.youtube.com/watch?v=-uJL_ANy1jc&list=PLsAqq9lZFOtU7tT6mDXX_fhv1R1-jGiYf

⏩ Youtube - https://www.youtube.com/c/TechVizTheDataScienceGuy
⏩ Blog - https://prakhartechviz.blogspot.com
⏩ LinkedIn - https://linkedin.com/in/prakhar21
⏩ Medium - https://medium.com/@prakhar.mishra
⏩ GitHub - https://github.com/prakhar21
⏩ Twitter - https://twitter.com/rattller

Tools I use for making videos :)
⏩ iPad - https://tinyurl.com/y39p6pwc
⏩ Apple Pencil - https://tinyurl.com/y5rk8txn
⏩ GoodNotes - https://tinyurl.com/y627cfsa

#techviz #datascienceguy
#ml_with_graphs #node #representation #learning
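
To make the "Negative Sampling" and "Edge Sampling" parts of the outline a bit more concrete, here is a toy pure-Python sketch of how I understand the second-order training loop. It is not the authors' implementation (that one is the C++ code linked above). The function name `train_line_second_order`, its parameters and the default values are all made up for illustration. Edges are assumed to be directed `(source, target)` pairs, so an undirected edge has to be passed in both directions. One deliberate simplification: `random.choices` is used for weighted sampling, whereas the paper describes the alias method for constant-time draws.

```python
import math
import random


def sigmoid(x):
    # Numerically safe logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def train_line_second_order(edges, weights, num_nodes, dim=8, num_neg=5,
                            steps=2000, lr=0.025, seed=0):
    """Toy second-order LINE trainer (illustrative names and defaults)."""
    rng = random.Random(seed)

    # Noise distribution for negative sampling: P_n(v) proportional to
    # d_v^(3/4), where d_v is the weighted out-degree of vertex v.
    out_degree = [0.0] * num_nodes
    for (src, _), w in zip(edges, weights):
        out_degree[src] += w
    noise_weights = [d ** 0.75 for d in out_degree]
    nodes = list(range(num_nodes))

    # Vertex vectors start small and random; context vectors start at zero.
    emb = [[(rng.random() - 0.5) / dim for _ in range(dim)]
           for _ in range(num_nodes)]
    ctx = [[0.0] * dim for _ in range(num_nodes)]

    for _ in range(steps):
        # Edge sampling: pick an edge with probability proportional to its
        # weight, then treat it as a unit-weight edge. The weight never
        # multiplies the gradient, so the step size stays well behaved.
        src, dst = rng.choices(edges, weights=weights, k=1)[0]

        # One positive target plus up to num_neg negatives from the noise
        # distribution (negatives that hit the positive are skipped).
        negatives = rng.choices(nodes, weights=noise_weights, k=num_neg)
        targets = [(dst, 1.0)] + [(n, 0.0) for n in negatives if n != dst]

        grad_src = [0.0] * dim
        for node, label in targets:
            score = sigmoid(sum(a * b for a, b in zip(emb[src], ctx[node])))
            g = lr * (label - score)  # gradient of the log-likelihood term
            for k in range(dim):
                grad_src[k] += g * ctx[node][k]
                ctx[node][k] += g * emb[src][k]
        for k in range(dim):
            emb[src][k] += grad_src[k]

    return emb, ctx


# Usage on a tiny weighted graph (each undirected edge listed both ways).
edges = [(0, 1), (1, 0), (1, 2), (2, 1), (3, 4), (4, 3)]
weights = [1.0, 1.0, 2.0, 2.0, 5.0, 5.0]
emb, ctx = train_line_second_order(edges, weights, num_nodes=5, steps=300)
```

The point of sampling edges instead of multiplying gradients by w_ij is that heavily weighted edges are simply visited more often, so one learning rate works for both light and heavy edges.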
