Gaining insight into SARS-CoV-2 infection and COVID-19 severity using self-supervised edge features and Graph Neural Networks

ICML 2020

Graph Neural Networks (GNNs) have been used extensively to extract meaningful representations from graph-structured data and to perform predictive tasks such as node classification and link prediction. In recent years, a growing body of work has incorporated edge features alongside node features for prediction tasks. In this work, we present a framework for creating new edge features through a combination of self-supervised and unsupervised learning, which we then use together with node features for node classification tasks. We validate our work on two biological datasets comprising single-cell RNA sequencing data of in vitro SARS-CoV-2 infection and of human COVID-19 patients. We demonstrate that our method outperforms baseline Graph Attention Network (GAT) and Graph Convolutional Network (GCN) models. Furthermore, given the attention mechanism on edge and node features, we are able to interpret the cell types and genes that determine the course and severity of COVID-19, contributing to a growing list of potential disease biomarkers and therapeutic targets.

Speakers: Arijit Sehanobish, David van Dijk, Neal G. Ravindra