Graph Representation Learning for Interactive Biomolecule Systems

5 Apr 2023 · Xinye Xiong, Bingxin Zhou, Yu Guang Wang

Advances in deep learning models have revolutionized the study of biomolecule systems and their mechanisms. Graph representation learning, in particular, is important for accurately capturing the geometric information of biomolecules at different levels. This paper presents a comprehensive review of the methodologies used to represent biological molecules and systems as computer-recognizable objects, such as sequences, graphs, and surfaces. Moreover, it examines how geometric deep learning models, with an emphasis on graph-based techniques, can analyze biomolecule data to enable drug discovery, protein characterization, and biological system analysis. The study concludes with an overview of the current state of the field, highlighting the challenges that exist and the potential future research directions.

arXiv:2304.02656 [q-bio]

PMC10699434

Graph Representation Learning in Biomedicine and Healthcare

Michelle M. Li

1 Bioinformatics and Integrative Genomics Program, Harvard Medical School, Boston, MA 02115, USA

3 Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA

Kexin Huang

2 Health Data Science Program, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA

Marinka Zitnik

4 Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA

5 Harvard Data Science Initiative, Cambridge, MA 02138, USA


Biomedical networks (or graphs) are universal descriptors for systems of interacting elements, from molecular interactions and disease co-morbidity to healthcare systems and scientific knowledge. Advances in artificial intelligence, specifically deep learning, have enabled us to model, analyze, and learn with such networked data. In this review, we put forward an observation that long-standing principles of systems biology and medicine—while often unspoken in machine learning research—provide the conceptual grounding for representation learning on graphs, explain its current successes and limitations, and even inform future advancements. We synthesize a spectrum of algorithmic approaches that, at their core, leverage graph topology to embed networks into compact vector spaces. We also capture the breadth of ways in which representation learning has dramatically improved the state-of-the-art in biomedical machine learning. Exemplary domains covered include identifying variants underlying complex traits, disentangling behaviors of single cells and their effects on health, assisting in diagnosis and treatment of patients, and developing safe and effective medicines.

1. Introduction

Networks (or graphs) are pervasive in biology and medicine, from molecular interaction maps to population-scale social and health interactions. With the multitude of bioentities and associations that can be described by networks, they are prevailing representations of biological organization and biomedical knowledge. For instance, edges in a regulatory network can indicate causal activating and inhibitory relationships between genes [ 1 ]; edges between genes and diseases can indicate genes that are ‘upregulated by’, ‘downregulated by’, or ‘associated with’ a disease [ 2 ]; and edges in a knowledge network built from electronic health records (EHR) can indicate co-occurrences of medical codes across patients [ 3 , 4 , 5 ]. The ability to model all biomedical discoveries to date—even overlay patient-specific information—in a unified data representation has driven the development of artificial intelligence, specifically deep learning, for networks. In fact, the diversity and multimodality in networks not only boost performance of predictive models, but importantly enable broad generalization to settings not seen during training [ 6 ] and improve model interpretability [ 7 , 8 ]. Nevertheless, interactions in networks give rise to a bewildering degree of complexity that can likely only be fully understood through a holistic and integrated view [ 9 , 10 , 11 ]. As a result, systems biology and medicine—upon which deep learning on graphs is founded—have identified over the last two decades organizing principles that govern networks [ 12 , 13 , 14 , 15 ].

The organizing principles governing networks link network structure to molecular phenotypes, biological roles, disease, and health, thus providing the conceptual grounding that, we argue, can explain the successes (and limitations) of graph representation learning and inform future development of the field. Here, we exemplify how a series of such principles has uncovered disease mechanisms. First, interacting entities are typically more similar than non-interacting entities, as implicated by the local hypothesis [13]. In protein interaction networks, for instance, mutations in interacting proteins often lead to similar diseases [13]. By the shared components and disease module hypotheses [13], cellular components associated with the same phenotype tend to cluster in the same network neighborhood [16]. Further, essential genes are often found in hubs of a molecular network, whereas non-essential genes (e.g., those associated with disease) are located on the periphery [13]. Finally, the network parsimony principle dictates that shortest molecular paths between known disease-associated components tend to correlate with causal molecular pathways [13]. To this day, these hypotheses and principles continue to drive discoveries.

We posit that representation learning can realize network biomedicine principles . Its core idea is to learn how to represent nodes (or larger graph structures) in a network as points in a low-dimensional space, where the geometry of this space is optimized to reflect the structure of interactions between nodes. More concretely, representation learning specifies deep, non-linear transformation functions that map nodes to points in a compact vector space, termed embeddings . Such functions are optimized to embed the input network so that nodes with similar network neighborhoods are embedded close together in the embedding space, and algebraic operations performed in this learned space reflect the network’s topology. To provide concrete connections between graph representation learning and systems biology and medicine: nodes in the same positional regions should have similar embeddings due to the local hypothesis (e.g., highly similar pairs of protein embeddings suggest similar phenotypic consequence); node embeddings can capture whether the nodes lie within a hub based on their degree, an important aspect of local neighborhood (e.g., strongly clustered gene embeddings indicate essential housekeeping roles); and given by the shared components hypothesis, two nodes with significantly overlapping sets of network neighbors should have similar embeddings due to shared message passing (e.g., highly similar disease embeddings imply shared disease-associated cellular components). Hence, artificial intelligence methods that produce representations can be thought of as differentiable engines of key network biomedicine principles.

Our survey provides an exposition of graph artificial intelligence capability and highlights important applications for deep learning on biomedical networks. Given the prominence of graph representation learning, specific aspects of it have been covered extensively. However, existing reviews independently discuss deep learning on structured data [ 17 , 18 ]; graph neural networks [ 19 , 20 , 21 ]; representation learning for homogeneous and heterogeneous graphs [ 22 , 23 , 24 ], solely heterogeneous graphs [ 25 ], and dynamic graphs [ 26 ]; data fusion [ 27 ]; network propagation [ 28 ]; topological data analysis [ 29 ]; and creation of biomedical networks [ 9 , 29 , 30 , 31 , 32 ]. Biomedically-focused reviews survey graph neural networks exclusively on molecular generation [ 33 , 34 ], single-cell biology [ 35 ], drug discovery and repurposing [ 36 , 37 , 38 , 39 , 40 ], or histopathology [ 41 ]. Other reviews tend to focus solely on graph neural networks, excluding other graph representation learning approaches or do not consider patient-centric methods [ 42 ]. In contrast, our survey unifies graph representation learning approaches across molecular, genomic, therapeutic, and precision medicine areas.

2. Graph representation learning

Graph theoretic techniques have fueled discoveries, from uncovering relationships between diseases [ 43 , 44 , 45 , 46 ] to repurposing drugs [ 6 , 47 , 48 ]. Further algorithmic innovations, such as random walks [ 49 , 50 , 51 ], kernels [ 52 ], and network propagation [ 53 ], have also played a role in capturing structural information in networks. Feature engineering, the process of extracting predetermined features from a network to suit a user-specified machine learning method [ 54 ], is a common approach for machine learning on networks, including but not limited to hard-coding network features (e.g., higher-order structures, network motifs, degree counts, and common neighbor statistics) and feeding the engineered feature vectors into a machine learning model. While powerful, it can be challenging to hand engineer optimally-predictive features across diverse types of networks and applications [ 18 ].

For these reasons, graph representation learning, the idea of automatically learning optimal features for networks, has emerged as a leading artificial intelligence approach for networks. Graph representation learning is challenging because graphs contain complex topological structure, have no fixed node ordering or reference points, and comprise many different kinds of entities (nodes) and various types of interactions (edges) relating them to each other. Classic deep learning methods are unable to consider such diverse structural properties and rich interactions, which are the essence of biomedical networks, because they are designed for fixed-size grids (e.g., images and tabular datasets) or optimized for text and sequences. Akin to how deep learning on images and sequences has revolutionized image analysis and natural language processing, graph representation learning is poised to transform the study of complex systems.

Graph representation learning methods generate vector representations for graph elements such that the learned representations, i.e., embeddings, capture the structure and semantics of networks, along with signals from any downstream supervised task (Box 1). Graph representation learning encompasses a wide range of methods, including manifold learning, topological data analysis, graph neural networks, and generative graph models (Figure 2). We next describe graph elements and outline the main artificial intelligence tasks on graphs (Box 1), then outline graph representation learning methods (Sections 2.1-2.3).

Figure 2. (a) Shallow network embedding methods generate a dictionary of representations h_u for every node u that preserves the input graph structure information. This is achieved by learning a mapping function f_z that maps nodes into an embedding space such that nodes with similar graph neighborhoods, as measured by a function f_n, get embedded closer together (Section 2.1). Given the learned embeddings, an independent decoder method can optimize embeddings for downstream tasks, such as node or link property prediction. Method examples include DeepWalk [55], Node2vec [56], LINE [57], and Metapath2vec [58]. (b) In contrast with shallow network embedding methods, graph neural networks can generate representations for any graph element by capturing both network structure and node attributes and metadata. The embeddings are generated through a series of non-linear transformations, i.e., message-passing layers (L_k denotes transformations at layer k), that iteratively aggregate information from neighboring nodes at the target node u. GNN models can be optimized for performance on a variety of downstream tasks (Section 2.2). Method examples include GCN [59], GIN [60], GAT [61], and JK-Net [62]. (c) Generative graph models estimate a distribution landscape Z to characterize a collection of distinct input graphs. They use the optimized distribution to generate novel graphs Ĝ that are predicted to have desirable properties; e.g., a generated graph can represent the molecular graph of a drug candidate. Generative graph models use graph neural networks as encoders and produce graph representations that capture both network structure and attributes (Section 2.3). Method examples include GCPN [63], JT-VAE [64], and GraphRNN [65]. SI Figure 1 and SI Note 3 outline other representation learning techniques.

Fundamentals of graph representation learning

Elements of graphs.

Graph G = (𝒱, ℰ) consists of nodes v ∈ 𝒱 and edges (or relations) e_{u,v}^r ∈ ℰ connecting nodes u and v via a relationship of type r. Subgraph S = (𝒱_S, ℰ_S) is a subset of a graph G, where 𝒱_S ⊆ 𝒱 and ℰ_S ⊆ ℰ. The adjacency matrix A represents a graph, where entry A_{u,v} is 1 if nodes u, v are connected and 0 otherwise; A_{u,v} can also hold the edge weight between nodes u, v. A homogeneous graph has a single node and edge type. In contrast, a heterogeneous graph consists of nodes of different types (node type set 𝒜) connected by diverse kinds of edges (edge type set ℛ). The node attribute vector x_u ∈ ℝ^d describes side information and metadata of node u, and the node attribute matrix X ∈ ℝ^{n×d} brings together the attribute vectors for all nodes in the graph. Similarly, edge attributes x_{u,v}^e ∈ ℝ^c for edge e_{u,v} can be taken together to form an edge attribute matrix X^e ∈ ℝ^{m×c}. A path from node u_1 to node u_k is a sequence of edges u_1 →^{e_{1,2}} u_2 → … → u_{k−1} →^{e_{k−1,k}} u_k. For node u, its neighborhood 𝒩(u) is the set of nodes directly connected to u in G, and the node degree is the size of 𝒩(u). The k-hop neighborhood of node u is the set of nodes exactly k hops away from u, that is, 𝒩_k(u) = { v ∣ d(u, v) = k }, where d denotes the shortest-path distance (SI Note 1).
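To make these definitions concrete, here is a minimal sketch (a toy graph with made-up edges) of the adjacency matrix A, the neighborhood 𝒩(u), and the k-hop neighborhood 𝒩_k(u) computed via breadth-first search:

```python
import numpy as np
from collections import deque

# Toy undirected graph on 5 nodes stored as an adjacency matrix A,
# where A[u, v] = 1 if nodes u and v are connected and 0 otherwise.
A = np.zeros((5, 5), dtype=int)
for u, v in [(0, 1), (1, 2), (2, 3), (3, 4), (0, 2)]:
    A[u, v] = A[v, u] = 1

def neighborhood(A, u):
    """N(u): nodes directly connected to u; the node degree is len(N(u))."""
    return {int(v) for v in np.flatnonzero(A[u])}

def k_hop_neighborhood(A, u, k):
    """N_k(u) = {v : d(u, v) = k}, where d is the shortest-path distance,
    computed here with breadth-first search."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        w = queue.popleft()
        for v in np.flatnonzero(A[w]):
            if v not in dist:
                dist[v] = dist[w] + 1
                queue.append(v)
    return {int(v) for v, dv in dist.items() if dv == k}

print(neighborhood(A, 0))            # -> {1, 2}
print(k_hop_neighborhood(A, 0, 2))   # -> {3}
```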

Artificial intelligence tasks on graphs.

Classic machine learning approaches extract information from networks via summary statistics (e.g., degrees or clustering coefficients) or carefully engineered features that measure network structures (e.g., network motifs). In contrast, representation learning approaches automatically learn to encode networks into low-dimensional representations (i.e., embeddings) using transformation techniques based on deep learning and nonlinear dimensionality reduction. The flexibility of learned representations shows in the myriad of tasks that representations can be used for (SI Note 2):

  • Node, link, and graph property prediction: The objective is to learn representations of graph elements, such as nodes, edges, subgraphs, and entire graphs. Representations are optimized so that performing algebraic operations in the embedding space reflects the graph’s topology. Optimized representations can be fed into models to predict properties of graph elements, such as the function of proteins in an interactome network (i.e., node classification task), the binding affinity of a chemical compound to a target protein (i.e., link prediction task), and the toxicity profile of a candidate drug (i.e., graph classification task).
  • Latent graph learning: Graph representation learning exploits relational inductive biases for data that come in the form of graphs. In some settings, however, the graphs are not readily available for learning. This is typical for many biological problems, where graphs such as gene regulatory networks are only partially known. Latent graph learning is concerned with inferring the graph from the data. The latent graph can be application-specific and optimized for the downstream task. Further, such a graph might be as important as the task itself, as it can convey insights about the data and offer a way to interpret the results.
  • Graph generation: The objective is to generate a graph G representing a biomedical entity that is likely to have a property of interest, such as high druglikeness. The model is given a set of graphs 𝒢 with such a property and is tasked with learning a non-linear mapping function characterizing the distribution of graphs in 𝒢 . The learned distribution is used to optimize a new graph G with the same property as input graphs.

2.1. Shallow graph embedding approaches

Shallow embedding methods optimize a compact vector space such that nodes that are close in the graph are mapped to nearby points in the embedding space, where closeness is measured by a predefined distance function or an outer product. These approaches are transductive embedding methods whose encoder function is a simple embedding lookup (Figure 2). More concretely, the methods have three steps: (1) Mapping to an embedding space. Given a pair of nodes u and v in graph G, we specify an encoder, a learnable function f that maps nodes to embeddings h_u and h_v. (2) Defining graph similarity. We define the graph similarity f_n(u, v), for example, measured by the distance between u and v in the graph, and the embedding similarity f_z(h_u, h_v), for example, a Euclidean distance function or pairwise dot product. (3) Computing loss. We then define the loss ℒ(f_n(u, v), f_z(h_u, h_v)), which quantifies how well the resulting embeddings preserve the desired input graph similarity, and apply an optimization procedure to minimize it. The resulting encoder f is a graph embedding method that serves as a shallow embedding lookup and considers the graph structure only in the loss function.
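The three steps above can be sketched in a few lines. This is a hedged toy illustration (a made-up 4-node graph, dot-product embedding similarity, and adjacency entries as the target graph similarity), not any specific published method:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 4-node graph; the adjacency matrix plays the role of the target
# graph similarity f_n(u, v).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
n, d = A.shape[0], 2

# (1) Mapping to an embedding space: the encoder is a plain lookup table H,
# one d-dimensional embedding h_u per node u.
H = 0.1 * rng.standard_normal((n, d))

# (2) Embedding similarity f_z(h_u, h_v) = dot product.
# (3) Loss = squared error between f_z and f_n, minimized by gradient descent.
lr = 0.05
for _ in range(500):
    diff = H @ H.T - A          # f_z(h_u, h_v) - f_n(u, v) for all node pairs
    H -= lr * 4 * diff @ H      # gradient of ||H H^T - A||_F^2 w.r.t. H

# Connected pairs end up with higher similarity than unconnected ones.
print((H[0] @ H[1]) > (H[0] @ H[3]))
```

After training, H is exactly the "embedding lookup" described above: downstream models consume its rows without ever seeing the graph.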

Shallow embedding methods vary in their definitions of similarity. For example, the shortest path length between nodes is often used as the network similarity and the dot product as the embedding similarity. Perozzi et al. [55] define similarity as co-occurrence in a series of random walks of length k. Unsupervised techniques that predict which nodes co-occur in a walk, such as skip-gram [66], are then applied on the walks to generate embeddings. Grover et al. [56] propose an alternative walk strategy on graphs, using a combination of depth-first and breadth-first search. In heterogeneous graphs, information on the semantic meaning of edges, i.e., relation types, can be important; knowledge graph methods expand similarity measures to consider relation types [58, 67, 68, 69, 70, 71]. Once shallow embedding models are trained, the resulting embeddings can be fed into separate models optimized for a specific classification or regression task.
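A minimal sketch of the random-walk idea on a toy graph: uniform walks in the style of DeepWalk [55], with node pairs that co-occur inside a window serving as skip-gram-style training examples. The function names and the graph are illustrative, not from any library:

```python
import random

# Toy adjacency list; truncated random walks play the role of "sentences"
# and nodes the role of "words" in skip-gram-style training [55, 66].
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

def random_walk(graph, start, length, rng=random):
    """Uniform random walk of a fixed length starting at `start`."""
    walk = [start]
    while len(walk) < length:
        walk.append(rng.choice(graph[walk[-1]]))
    return walk

def cooccurrence_pairs(walk, window=2):
    """(u, v) pairs co-occurring within `window` steps of each other;
    these become positive examples for a skip-gram-style objective."""
    pairs = []
    for i, u in enumerate(walk):
        for v in walk[max(0, i - window): i] + walk[i + 1: i + 1 + window]:
            pairs.append((u, v))
    return pairs

random.seed(0)
walks = [random_walk(graph, u, length=5) for u in graph for _ in range(10)]
print(len(cooccurrence_pairs(walks[0])))  # -> 14 pairs from a length-5 walk
```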

2.2. Graph neural networks

Graph neural networks (GNNs) are a class of neural networks designed for graph-structured datasets (Figure 2). They learn compact representations of graph elements, their attributes, and supervised labels, if any. A typical GNN consists of a series of propagation layers [72], where layer l carries out three operations: (1) Passing neural messages. The GNN computes a message m_{u,v}^{(l)} = MSG(h_u^{(l−1)}, h_v^{(l−1)}) for every pair of linked nodes u, v based on their embeddings from the previous layer, h_u^{(l−1)} and h_v^{(l−1)}. (2) Aggregating neighborhoods. The messages between node u and its neighbors 𝒩(u) are aggregated as m̂_u^{(l)} = AGG({ m_{u,v}^{(l)} ∣ v ∈ 𝒩(u) }). (3) Updating representations. A non-linear transformation updates the node embedding as h_u^{(l)} = UPD(m̂_u^{(l)}, h_u^{(l−1)}), combining the aggregated message with the embedding from the previous layer. In contrast to shallow embeddings, GNNs can capture higher-order and non-linear patterns through multi-hop propagation across several layers of neural message passing. Additionally, GNNs can optimize for supervised signals and graph structure simultaneously, whereas shallow embedding approaches require a two-stage approach to achieve the same.
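The three per-layer operations can be sketched as follows. This toy version assumes a linear message function, mean aggregation, and a ReLU update, and omits details (self-loops, normalization, learned update weights) that real GNN architectures add:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0)

def gnn_layer(A, H, W):
    """One message-passing layer, following the three operations above:
    MSG: the message from v to u is v's previous-layer embedding times W;
    AGG: mean over the neighborhood N(u);
    UPD: a non-linearity applied to the aggregated message."""
    n = A.shape[0]
    H_new = np.zeros((n, W.shape[1]))
    msgs = H @ W                          # MSG, computed for all nodes at once
    for u in range(n):
        nbrs = np.flatnonzero(A[u])
        H_new[u] = relu(msgs[nbrs].mean(axis=0))   # AGG + UPD
    return H_new

rng = np.random.default_rng(0)
A = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=float)  # path 1-0-2
H0 = rng.standard_normal((3, 4))          # initial node attributes x_u
W1, W2 = rng.standard_normal((4, 8)), rng.standard_normal((8, 8))
H2 = gnn_layer(A, gnn_layer(A, H0, W1), W2)  # 2 layers = 2-hop receptive field
print(H2.shape)  # (3, 8)
```

Stacking l such layers is what gives each node an l-hop receptive field, in contrast to the purely lookup-based shallow encoders of Section 2.1.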

A myriad of GNN architectures define different message, aggregation, and update schemes to derive deep graph embeddings [59, 73, 74, 75]. For example, [61, 76, 77, 78, 79] assign importance scores to nodes during neighborhood aggregation such that more important nodes have a larger effect on the embeddings. [80, 81] improve GNNs' ability to capture graph structural information by imposing structural priors, such as a higher-order adjacency matrix. Graph pooling techniques [82] learn abstract topological structures. GNNs designed for molecules [83, 84] inject physics-based scores and domain knowledge into the propagation layers.

As biomedical networks can be multimodal and massive, special consideration is needed to scale GNNs to large and heterogeneous networks. To this end, [ 85 , 86 ] developed sampling strategies to intelligently select small subsets of the whole local network neighborhoods and use them in training GNN models. To tackle heterogeneous relations, [ 76 , 87 , 88 ] designed aggregation transformations to fuse diverse types of relations and attributes. Recent architectures describe dynamic message passing [ 76 , 89 , 90 ] to deal with evolving and time-varying graphs and few-shot learning [ 91 ] or self-supervised strategies [ 92 , 93 ] to deal with graphs that are poorly annotated and have limited label information.

2.3. Generative graph models

Generative graph models generate new node and edge structures, even entire graphs, that are likely to have desired properties, such as novel molecules with acceptable toxicity profiles (Figure 2). Traditionally, network science models generate graphs using deterministic or probabilistic rules. For instance, the Erdős-Rényi model [94] starts from an empty graph and iteratively adds random edges according to a predefined probability; the Barabási-Albert model [95] grows a graph by adding nodes and edges such that the resulting graph has the power-law degree distribution often observed in real-world networks; and the configuration model [96] adds edges based on a predefined node degree sequence to generate graphs with arbitrary degree distributions. While powerful as random graph generators, such models cannot optimize graph structure based on properties of interest.
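As a concrete example of such a rule-based generator, here is a minimal Erdős-Rényi G(n, p) sketch. Note that nothing in it can be steered toward a target property, which is exactly the limitation that the deep generative models below address:

```python
import random

def erdos_renyi(n, p, rng=random):
    """G(n, p): starting from an empty graph on n nodes, include each of the
    n(n-1)/2 possible edges independently with probability p."""
    return [(u, v) for u in range(n) for v in range(u + 1, n)
            if rng.random() < p]

random.seed(0)
g = erdos_renyi(100, 0.05)
# The expected edge count is p * n(n-1)/2 = 247.5 here; the model can match
# such global statistics but cannot be optimized toward, e.g., drug-likeness.
print(len(g))
```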

Deep generative models address this challenge by estimating distributional graph properties from a dataset of graphs 𝒢 and inferring graph structures from the optimized distributions. In particular, a generative graph model first learns a latent distribution P(Z ∣ 𝒢) that characterizes the input graph set 𝒢. Then, conditioned on this distribution, it decodes a new graph, i.e., generates Ĝ. There are different ways to encode the input graphs and learn the latent distribution, such as through variational autoencoders [64, 97, 98] and generative adversarial networks [99]. Decoding to generate a novel graph presents a unique challenge compared to an image or text since a graph is discrete, is unbounded in structure and size, and has no particular ordering of nodes. Common practices to generate new graphs include (1) predicting a probabilistic fully-connected graph and then using graph matching to find the optimal subgraph [100]; (2) decomposing a graph into a tree of subgraphs and generating the tree structure instead, followed by assembly of the subgraphs [64]; and (3) sequentially sampling new nodes and edges [63, 65].

3. Application areas in biology and medicine

Biomedical data involve rich multimodal and heterogeneous interactions that span from molecular to societal levels ( Figure 3 ). Unlike machine learning approaches designed to analyze data modalities like medical images and biological sequences, graph representation learning methods are uniquely able to leverage structural information in multimodal datasets [ 101 ].

Figure 3. Networks are prevalent across biomedical areas, from the molecular level to the healthcare systems level. Protein structures and therapeutic compounds can be modeled as a network where nodes represent atoms and edges indicate a bond between pairs of atoms. Protein interaction networks contain nodes that represent proteins and edges that indicate physical interactions (top left). Drug interaction networks are comprised of drug nodes connected by synergistic or antagonistic relationships (bottom left). Protein- and drug-interaction networks can be combined using an edge type that signifies a protein being a “target” of a drug (left). Disease association networks often contain disease nodes with edges representing co-morbidity (middle). Edges exist between proteins and diseases to indicate proteins (or genes) associated with a disease (top middle). Edges exist between drugs and diseases to signify drugs that are indicated for a disease (bottom middle). Patient-specific data, such as medical images (e.g., spatial networks of cells, tumors, and lymph nodes) and EHRs (e.g., networks of medical codes and concepts generated by co-occurrences in patients’ records), are often integrated into a cross-domain knowledge graph of proteins, drugs, and diseases (right). With such vast and diverse biomedical networks, we can derive fundamental insights about biology and medicine while enabling personalized representations of patients for precision medicine. Note that there are many other types of edge relations; “targets,” “is associated with,” “is indicated for,” and “has phenotype” are a few examples.

Starting at the molecular level (Section 4), molecular structure is translated from atoms and bonds into nodes and edges, respectively. Physical interactions or functional relationships between proteins also naturally form a network. Following the key organizing principles that govern network medicine, for instance, the local hypothesis and the shared components hypothesis [13], whether an unknown protein clusters in a particular neighborhood of, and shares direct neighbors with, known proteins is informative of its binding affinity, function, etc. [102]. Grounded in network medicine principles, graph machine learning is commonly applied to learn molecular representations of proteins and their physical interactions for predicting protein function.

At the genomic level (Section 5), genetic elements are incorporated into networks by extracting coding genes' co-expression information from transcriptomic data. Single-cell and spatial molecular profiling have further enabled the mapping of genetic interactions at the cellular and tissue levels. Investigating the cellular circuitry of molecular functions in the resulting gene co-expression or regulatory networks can uncover disease mechanisms. For instance, as implicated by the network parsimony principle [13], shortest path length in a molecular network between disease-associated components often correlates with causal molecular pathways [16, 45]. Embeddings learned with graph representation learning methods that capture genome-wide interactions have enhanced disease predictions, even at the resolution of tissues and single cells.

At the therapeutics level (Section 6), networks are composed of drugs (e.g., small compounds), proteins, and diseases to allow the modeling of drug-drug interactions, the binding of drugs to target proteins, and the identification of drug repurposing opportunities. For example, by a corollary of the local hypothesis [13], the topology of drug combinations is indicative of synergistic or antagonistic relationships [103]. Learning the topology of graphs containing drug, protein, and disease nodes has improved predictions of candidate drugs for treating a disease, identification of potential off-target effects, and prioritization of novel drug combinations.

Finally, at the healthcare level (Section 7), patient records, such as medical images and EHRs, can be represented as networks and incorporated into protein, disease, and drug networks. As an example, following three network medicine principles, the local hypothesis, the shared components hypothesis, and the disease module hypothesis [13], rare disease patients who share network neighbors and local topology likely have similar phenotypes and even disease mechanisms [104, 105]. Graph representation learning methods have been successful in integrating patient records with molecular, genomic, and disease networks to personalize patient predictions.

4. Graph representation learning for molecules

Graph representation learning has been widely used for predicting protein interactions and function [106, 107]. Specifically, the ability of inductive graph convolutional neural networks to generalize to data points unseen during model training (Section 2.2), and even to generate new data points from scratch by decoding latent representations from the embedding space (Section 2.3), has enabled the discovery of new molecules, interactions, and functions [101, 108, 109].

4.1. Modeling protein molecular graphs

Computationally elucidating protein structure has been an ongoing challenge for decades [33]. Since proteins fold into complex 3D structures, it is natural to represent them as graphs. For example, one can construct a contact graph where nodes are individual residues and edges are determined by a physical distance threshold [110]. Edges can also be defined by the ordering of amino acids in the primary sequence [110]. Spatial relationships between residues (e.g., distances, angles) may be used as edge features [111].
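This contact-graph construction can be sketched as follows. Random coordinates stand in for real C-alpha positions, and the 8 Å cutoff is an assumed, commonly used threshold rather than a value prescribed by the cited works:

```python
import numpy as np

def contact_graph(coords, threshold=8.0):
    """Residue contact graph: nodes are residues; an edge connects residues
    i, j whose pairwise distance falls below the threshold (8 Angstroms is
    an assumed, commonly used cutoff). Consecutive residues are also linked
    to encode the ordering of the primary sequence."""
    n = len(coords)
    dists = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
    A = (dists < threshold).astype(int)
    np.fill_diagonal(A, 0)                 # no self-contacts
    for i in range(n - 1):                 # primary-sequence edges
        A[i, i + 1] = A[i + 1, i] = 1
    return A, dists                        # distances can serve as edge features

rng = np.random.default_rng(0)
coords = rng.uniform(0, 30, size=(12, 3))  # toy stand-in for C-alpha coords
A, dists = contact_graph(coords)
print(A.sum() // 2)   # number of edges in the contact graph
```

The returned distance matrix illustrates how spatial relationships can double as edge features for the GNNs described next.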

We then model the 3D protein structures by capturing dependencies in their sequences of amino acids (e.g., applying GNNs ( Section 2.2 ) to learn each node’s local neighborhood structure) in order to generate protein embeddings [ 111 , 112 ]. After learning short- and long-range dependencies across sequences corresponding to their 3D structures to produce embeddings of proteins, we can predict primary sequences from 3D structures [ 112 ]. Alternatively, we can use a hierarchical process of learning atom connectivity motifs to capture molecular structure at varying levels of granularity (e.g., at the motif-, connectivity-, and atomic-levels) in our protein embeddings, with which we can generate new 3D structures, a difficult task because a model must be both generalizable across different classes of molecules and flexible to a wide range of molecule sizes [ 113 ]. As the field of machine learning for molecules is vast, we refer readers to existing reviews on molecular design [ 33 , 114 ], graph generation [ 115 ], and molecular property prediction [ 33 , 34 ], and to Section 6.1 on therapeutic compound design and generation.
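The neighborhood-aggregation idea underlying these GNN encoders can be sketched minimally. The toy graph and one-dimensional features below are hypothetical, and real models apply learned weight matrices and nonlinearities rather than plain averaging:

```python
def message_passing_step(adj, features):
    """One simplified GNN layer: each node's new embedding is the
    average of its own feature vector and its neighbors' vectors."""
    new_features = {}
    for node, feat in features.items():
        neighborhood = [feat] + [features[n] for n in adj.get(node, [])]
        dim = len(feat)
        new_features[node] = tuple(
            sum(vec[d] for vec in neighborhood) / len(neighborhood)
            for d in range(dim)
        )
    return new_features

# Path graph a - b - c with hypothetical 1-D node features
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
features = {"a": (0.0,), "b": (3.0,), "c": (6.0,)}
print(message_passing_step(adj, features))
```

Stacking k such steps lets each node's embedding reflect its k-hop neighborhood, which is the sense in which these encoders capture local structure.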

4.2. Quantifying protein interactions

Many studies have integrated various data modalities, including chemical structure, binding affinities, physical and chemical principles, and amino acid sequences, to improve protein interaction quantification [ 33 ]. GNNs (as described in Section 4.1 ) are commonly used to generate representations of proteins based on chemical (e.g., locations of free electrons and proton donors) and geometric (e.g., distance-dependent curvature) features to predict protein pocket-ligand and protein-protein interaction sites [ 116 ]; intra- and inter-molecule residue contact graphs to predict intra- and inter-molecular energies, binding affinities, and quality measures for a pair of molecular complexes [ 117 ]; and ligand- and receptor-protein graphs to predict whether a pair of residues from the ligand and receptor proteins is part of an interface [ 111 ]. Combining evolutionary, topological, and energetic information about molecules enables the scoring of docked conformations based on the similarity of random walks simulated on a pair of protein graphs (refer to graph kernel metrics in SI Note 3 ) [ 52 ].

Due to experimental and resource constraints, even the most up-to-date PPI networks are limited in their numbers of nodes (proteins) and edges (physical interactions) [ 118 ]. Topology-based methods have been shown to capture and leverage the dynamics of biological systems to enrich existing PPI networks [ 119 ]. Predominant methods first apply graph convolutions ( Section 2.2 ) to aggregate structural information in the graphs of interest (e.g., PPI networks, ligand-receptor networks), then use sequence modeling to learn the dependencies in amino acid sequences, and finally concatenate the two outputs for predicting the presence of physical interactions [ 101 , 120 ]. Interestingly, such concatenated outputs have been treated as “image” inputs to CNNs [ 120 ], demonstrating the synergy of graph- and non-graph-based machine learning methods. Similar graph convolution methods are also used to remove less credible interactions, thereby constructing a more reliable PPI network [ 121 ].

4.3. Interpreting protein functions and cellular phenotypes

Characterizing a protein’s function in specific biological contexts is a challenging and experimentally intensive task [ 122 , 123 ]. However, innovations in graph representation learning techniques for representing protein structures and interactions have facilitated protein function prediction [ 124 ], especially when we leverage existing gene ontologies and transcriptomic data.

Gene Ontology (GO) terms [ 125 ] are a standardized vocabulary for describing molecular functions, biological processes, and cellular locations of gene products [ 126 ]. They have been built as a hierarchical graph that GNNs then leverage to learn dependencies among the terms [ 126 ], or used directly as protein function labels [ 106 , 127 ]. In the latter case, we typically construct sequence similarity networks, combine them with PPI networks, and integrate protein features (e.g., amino acid sequence, protein domains, subcellular location, gene expression profiles) to predict protein function [ 106 , 127 ]. Others have even created gene interaction networks using transcriptomic data [ 108 , 128 ] to capture context-specific interactions between genes, which PPI networks lack.

Alternative graph representation learning methods for predicting protein function include defining diffusion-based distance metrics on PPI networks for predicting protein function [ 129 ]; using the theory of topological persistence to compute signatures of a protein based on its 3D structure [ 130 ]; and applying topological data analysis to extract features from protein contact networks created from 3D coordinates [ 131 ] ( SI Note 3 ). Many have also adopted an attention mechanism for protein sequence embeddings generated by a Bidirectional Encoder Representations from Transformers (BERT) model to enable interpretability [ 132 , 133 ], showcasing the synergy of graph-based and language models.

5. Graph representation learning for genomics

Diseases are classified based on the presenting symptoms of patients, which can be caused by molecular dysfunctions, such as genetic mutations. As a result, diagnosing diseases requires knowledge about alterations in the transcription of coding genes to capture genome-wide associations driving disease acquisition and progression. Graph representation learning methods allow us to analyze heterogeneous networks of multimodal data and make predictions across domains, from genomic level data (e.g., gene expression, copy number information) to clinically relevant data (e.g., pathophysiology, tissue localization).

5.1. Leveraging gene expression measurements

Comparing transcriptomic profiles from healthy individuals to those of patients with a specific disease informs clinicians of its causal genes. As gene expression is the direct readout of perturbation effects, changes in gene expression are often used to model disease-specific co-expression or regulatory interactions between genes. Further, injecting gene expression data into PPI networks has identified disease biomarkers, which are then used to more accurately classify diseases of interest.

Methods that rely solely on gene expression data typically transform the co-expression matrix into a more topologically meaningful form [ 146 , 147 , 148 ]. Gene expression data can be transformed into a colored graph that captures the shape of the data (e.g., using TDA [ 148 ]; refer to SI Note 3 ), which then enables downstream analysis using network science metrics and graph machine learning. Topological landscapes present in gene expression data can be vectorized and fed into a GCN to classify the disease type [ 147 ]. Alternatively, gene expression data can directly be used to construct disease and gene networks that are then input into a joint matrix factorization and GCN method to draw disease-gene associations, akin to a recommendation task ( Section 2.2 ) [ 146 ]. Further, applying a joint GCN, VAE, and GAN framework ( Section 2.3 ) to gene correlation networks—initialized with a subset of gene expression matrices—can generate disease networks with the desired properties [ 149 ].
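A minimal sketch of the first step, turning expression profiles into a co-expression graph by thresholding absolute Pearson correlation, might look like this (gene names, profiles, and the 0.9 cutoff are all illustrative):

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def coexpression_edges(expression, threshold=0.9):
    """Connect gene pairs whose expression profiles correlate strongly
    (absolute Pearson correlation above `threshold`)."""
    genes = list(expression)
    return {
        (g1, g2)
        for i, g1 in enumerate(genes)
        for g2 in genes[i + 1:]
        if abs(pearson(expression[g1], expression[g2])) > threshold
    }

# Hypothetical expression profiles across four samples
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],   # perfectly correlated with geneA
    "geneC": [5.0, 1.0, 4.0, 2.0],   # weakly correlated with both
}
print(coexpression_edges(expression))
```

The resulting edge set is the kind of topological object that TDA, GCNs, or matrix factorization methods cited above can then operate on.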

Because gene expression data can be noisy and variable, recent advances include fusing the co-expression matrices with existing biomedical networks, such as GO annotations and PPI, and feeding the resulting graph into graph convolution layers ( Section 2.2 ) [ 150 , 151 , 152 ]. Doing so has enabled more interpretable disease classification models (e.g., weighting gene interactions based on existing biological knowledge). However, despite the utility of PPI networks, models trained solely on them are limited because PPI networks cannot capture all gene regulatory activities [ 153 ]. To this end, graph representation learning methods, such as GNNs, have been developed to learn robust and meaningful representations of molecules despite the incomplete interactome [ 102 ] and to inductively infer new edges between pairs of nodes [ 154 ].

5.2. Injecting single cell and spatial information into molecular networks

Single-cell RNA sequencing (scRNA-seq) data lend themselves to graph representation learning to model cellular differentiation processes [ 136 , 155 ] and disease states [ 140 ]. In particular, a predominant approach to analyzing scRNA-seq datasets is to transform them into gene similarity networks, such as gene co-expression networks, or cell similarity networks by correlating gene expression readouts across individual cells. Applied to such networks, graph representation learning can impute scRNA-seq data [ 156 , 157 ], predict cell clusters [ 157 , 158 ], and more. Cell similarity graphs have also been created using autoencoders by first embedding gene expression readouts and then connecting cells based on how similar their embeddings are [ 157 ]. Alternatively, variational graph autoencoders produce cell embeddings and interpretable attention weights indicating which genes the model attends to when deriving an embedding for a given cell [ 159 ]. Beyond GNNs and graph autoencoders, learning a manifold over a cell state space can quantify the effects of experimental perturbations [ 155 ]. To this end, cell similarity graphs are constructed for control and treated samples and used to estimate the likelihood of a cell population observed under a given perturbation [ 155 ].
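One common construction of such a cell similarity graph connects each cell to its nearest neighbors in expression space. The sketch below uses Euclidean distance on hypothetical two-gene profiles, whereas real pipelines typically work with correlation measures or reduced-dimension embeddings:

```python
import math

def knn_cell_graph(cells, k=2):
    """Connect each cell to its k nearest cells in expression space
    (Euclidean distance); edges are stored as sorted name pairs."""
    edges = set()
    for name, profile in cells.items():
        others = sorted(
            (o for o in cells if o != name),
            key=lambda o: math.dist(profile, cells[o]),
        )
        for neighbor in others[:k]:
            edges.add(tuple(sorted((name, neighbor))))
    return edges

# Hypothetical expression profiles for four cells (two genes each);
# cells 1-2 and cells 3-4 form two tight clusters.
cells = {
    "cell1": (0.0, 0.0),
    "cell2": (0.1, 0.1),
    "cell3": (5.0, 5.0),
    "cell4": (5.1, 5.0),
}
print(knn_cell_graph(cells, k=1))
```

Clustering or GNN-based imputation can then run directly on this graph.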

Spatial molecular profiling can measure both gene expression at the cellular level and the location of cells in a tissue [ 160 ]. As a result, spatial transcriptomics data can be used to construct cell graphs [ 161 ], spatial gene expression graphs [ 162 ], gene co-expression networks, or molecular similarity graphs [ 35 ]. Creating cell neighborhood and spatial gene expression graphs requires a distance metric, as edges are determined based on spatial proximity, while gene co-expression and molecular similarity graphs need a threshold applied to the gene expression data [ 35 ]. From such networks, graph representation learning methods produce embeddings that capture the network topology and that can be further optimized for downstream tasks. For instance, a cell neighborhood graph and a gene pair expression matrix enable classic GNNs to predict ligand-receptor interactions [ 161 ]. In fact, as ligand-receptor interactions are directed, they could be used to infer causal interactions of previously unknown ligand-receptor pairs [ 161 , 163 ].

6. Graph representation learning for therapeutics

Modern drug discovery requires elucidating a candidate drug’s chemical structure, identifying its drug targets, quantifying its efficacy and toxicity, and detecting its potential side effects [ 13 , 14 , 32 , 164 ]. Because such processes are costly and time-consuming, in silico approaches have been adopted into the drug discovery pipeline. However, cross-domain expertise is necessary to develop a drug with the optimal binding affinity and specificity to biomarkers, maximal therapy efficacy, and minimal adverse effects. As a result, it is critical to integrate chemical structure information, protein interactions, and clinically relevant data (e.g., indications and reported side effects) into predictive models for drug discovery and repurposing. Graph representation learning has been successful in characterizing drugs at the systems level without patient data to make predictions about interactions with other drugs, protein targets, side effects, and diseases [ 6 , 38 , 39 , 40 , 48 , 165 ].

6.1. Modeling compound molecular graphs

Similar to proteins, small compounds are modeled as 2D and 3D molecular graphs such that nodes are atoms and edges are bonds. Each atom and bond may include features, such as atomic mass, atomic number, and bond type, to be included in the model [ 83 , 166 ]. Edges can also be added to indicate pairwise spatial distance between two atoms [ 72 ], or directed with information on bond angles and rotations incorporated into the molecular graph [ 84 ].
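A minimal sketch of this representation follows, with formaldehyde as a toy molecule and illustrative feature choices; real featurizations include many more atom and bond descriptors:

```python
# Nodes carry atom features; edges carry bond features. The feature
# choices here (element symbol, atomic number, bond order) are
# examples only, not a complete chemistry featurization.
atoms = {
    0: {"element": "C", "atomic_number": 6},
    1: {"element": "O", "atomic_number": 8},
    2: {"element": "H", "atomic_number": 1},
    3: {"element": "H", "atomic_number": 1},
}
bonds = {
    (0, 1): {"order": 2},   # C=O double bond
    (0, 2): {"order": 1},   # C-H single bond
    (0, 3): {"order": 1},   # C-H single bond
}

def degree(node):
    """Number of bond edges incident to an atom."""
    return sum(node in pair for pair in bonds)

print(degree(0))  # the carbon is bonded to three neighboring atoms
```

For 3D-aware models, pairwise distance edges or directed edges carrying bond angles would be added to the same node/edge dictionaries.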

Representing molecules as graphs has improved predictions on various quantum chemistry properties. Intuitively, message passing steps (i.e., in GNNs ( Section 2.2 )) aggregate information from neighboring atoms and bonds to learn the local chemistry of each atom [ 166 ]. For example, generating representations of the atoms, distances, and angles to be propagated along the molecular graph has allowed us to identify the angle and direction of interactions between atoms [ 84 ]. Producing atom-centered representations based on a weighted combination of their neighbors’ features (i.e., using an attention mechanism) is able to model interactions among reactants for predicting organic reaction outcomes [ 167 ]. Alternatively, molecular graphs have been decomposed into a “junction tree,” where each node represents a substructure in the molecule, to learn representations of both the molecular graph and the junction tree for generating new molecules with desirable properties ( Section 2.3 ) [ 64 ]. Due to the major challenge of finding novel and diverse compounds with specific chemical properties, iteratively editing fragments of a molecular graph during training has improved predictions for high-quality candidates targeting our proteins of interest [ 168 ].

6.2. Quantifying drug-drug and drug-target interactions

A drug’s binding affinity and specificity to biomarkers correspond closely to its molecular structure. Such measurements are important for ensuring that a drug is effective in treating its intended disease and does not have significant off-target effects [ 34 ]. However, quantifying these metrics requires labor- and cost-intensive experiments [ 33 , 34 ]. Modeling the molecular structures of small compounds and protein targets, along with their binding affinities and specificities, using graph representation learning has accelerated the investigation of interactions between a given drug and protein target.

First, we learn representations of drugs and targets using graph-based methods, such as TDA [ 169 ] or shallow network embedding approaches [ 56 ]. Concretely, TDA (refer to SI Note 3 ) transforms experimental data into a graph where nodes represent compounds and edges indicate a level of similarity between them [ 169 ]. Shallow network embedding techniques are also used to generate embeddings for drugs and targets by computing drug-drug, drug-target, and target-target similarities ( Section 2.1 ) [ 170 ]. Non-graph-based methods have also been used to construct graphs that are then fed into a graph representation learning model to generate embeddings. K-nearest neighbors, for instance, is a commonly used method to construct drug and target similarity networks [ 171 ]. The resulting embeddings are fed into downstream machine learning models.

Fusing compound sequence, structure, and clinical implications has significantly improved drug-drug and drug-target interaction predictions. For example, attention mechanisms have been applied to drug graphs, with chemical structures and side effects as features, to generate interpretable predictions of drug-drug interactions [ 172 ]. Also, two separate GNNs may be used to learn representations of protein and small molecule graphs for predicting drug-target affinity [ 173 ]. To remain flexible with other graph- and non-graph-based methods, protein structure representations generated by graph convolutions have been combined with protein sequence representations (e.g., from shallow network embedding methods or CNNs) to predict the probability of compound-protein interactions [ 174 , 175 , 176 , 177 ].

6.3. Identifying drug-disease associations and biomarkers for complex disease

Part of the drug discovery pipeline is minimizing adverse drug events [ 33 , 34 ]. But, in addition to high financial cost, the experiments required to measure drug-drug interactions and toxicity face a combinatorial explosion problem [ 33 ]. Graph representation learning methods enable in silico modeling of drug action, which allows for more efficient ranking of candidate drugs for repurposing, for example by considering gene expression data, gene ontologies, drug similarity, and other clinically relevant data regarding side effects and indications.

Drug and disease representations have been learned on homogeneous graphs of drugs, diseases, or targets. For instance, Medical Subject Headings (MeSH) terms may be used to construct a drug-disease graph, from which latent representations of drugs and diseases are learned using various graph embedding algorithms, including DeepWalk and LINE ( Section 2.1 ) [ 178 ]. TDA (refer to SI Note 3 ) has also been applied to construct graphs of drugs, targets, and diseases separately, from which representations of such entities are learned and optimized for downstream prediction [ 179 ].

To emphasize the systems-level complexity of diseases, recent methods fuse multimodal data to generate heterogeneous graphs. For example, neighborhood information is aggregated from heterogeneous networks comprised of drug, target, and disease information to predict drug-target interactions ( Section 2.2 ) [ 180 ]. In other instances, PPI networks are combined with genomic features to predict drug sensitivity using GNNs [ 181 ]. As a result, approaches that integrate cross-domain knowledge as a vast heterogeneous network and/or into the model’s architecture seem better equipped to elucidate drug action.

7. Graph representation learning for healthcare

Graph representation learning has been used to fuse multimodal knowledge with patient records to better enable precision medicine. Two modes of patient data successfully integrated using deep graph learning are histopathological images [ 8 , 185 ] and EHRs [ 186 , 187 ].

7.1. Leveraging networks for diagnostic imaging

Medical images of patients, including histopathology slides, enable clinicians to comprehensively observe the effects of a disease on the patient’s body [ 188 ]. Medical images, such as large whole-slide histopathology images, are typically converted into cell spatial graphs, where nodes represent cells in the image and edges indicate that a pair of cells is adjacent in space. Deep graph learning has been shown to detect subtle signs of disease progression in the images while integrating other modalities (e.g., tissue localization [ 189 ] and genomic features [ 8 ]) to improve medical image processing.

Cell-tissue graphs generated from histopathological images are able to encode the spatial context of cells and tissues for a given patient. Cell morphology and tissue micro-architecture information can be aggregated from cell graphs to grade cancer histology images (e.g., using GNNs ( Section 2.2 )) [ 8 , 190 , 191 , 192 ]. An example aggregation method is pooling with an attention mechanism to infer relevant patches in the image [ 190 ]. Further, cell morphology and interactions, tissue morphology and spatial distribution, cell-to-tissue hierarchies, and the spatial distribution of cells with respect to tissues can be captured in a cell-to-tissue graph, upon which a hierarchical GNN can learn representations using these different data modalities [ 189 ]. Because interpretability is critical for models aimed at generating patient-specific predictions, post-hoc graph pruning optimization may be performed on a cell graph generated from a histopathology image to define subgraphs explaining the original cell graph analysis [ 193 ].

Graph representation learning methods have also proven successful for classifying other types of medical images. GNNs are able to model relationships between lymph nodes to compute the spread of lymph node gross tumor volume based on radiotherapy CT images ( Section 2.2 ) [ 194 ]. MRI images can be converted into graphs to which GNNs are applied to classify the progression of Alzheimer’s Disease [ 195 , 196 , 197 ]. GNNs have also been shown to leverage relational structures, such as similarities among chest X-rays, to improve downstream tasks such as disease diagnosis and localization [ 198 ]. Alternatively, TDA can generate graphs of whole-slide images, which include tissues from various patient sources ( SI Note 3 ), and GNNs are then used to classify the stage of colon cancer [ 199 ].

Further, spatial molecular profiling benefits from methodological advancements made for medical images. With spatial gene expression graphs (weighted and undirected) and corresponding histopathology images, gene expression information is aggregated to generate embeddings of genes that can then be used to investigate spatial domains (e.g., differentiate between cancer and noncancer regions in tissues) [ 200 ]. Since multimodal data enable more robust predictions, GNNs ( Section 2.2 ) have been applied to generate cell spatial graphs from histopathology images and then fuse genomic and transcriptomic data for predicting treatment response and resistance, histopathology grading, and patient survival [ 8 ].

7.2. Personalizing medical knowledge networks with patient records

Electronic health records are typically represented by ICD (International Classification of Disease) codes [ 186 , 187 ]. The hierarchical information inherent to ICD codes (medical ontologies) naturally lends itself to creating a rich network of medical knowledge. In addition to ICD codes, medical knowledge can take the form of other data types, including presenting symptoms, molecular data, drug interactions, and side effects. By integrating patient records into our networks, graph representation learning is well-equipped to advance precision medicine by generating predictions tailored to individual patients.

Methods that embed medical entities, including EHRs and medical ontologies, leverage the inherently hierarchical structure in the medical concepts KG [ 201 ]. Low dimensional embeddings of EHR data can be generated by separately considering medical services, doctors, and patients in shallow network embeddings ( Section 2.1 ) and graph neural networks ( Section 2.2 ) [ 202 , 203 ]. Alternatively, attention mechanisms may be applied on EHR data and medical ontologies to capture the parent-child relationships [ 186 , 204 , 205 ]. Rather than assuming a certain structure in the EHRs, a Graph Convolution Transformer can even learn the hidden EHR structure [ 79 ].
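The parent-child idea can be sketched by embedding a code together with its ontology ancestors. Here the ancestor weights are uniform for clarity, whereas methods such as those cited learn attention weights over ancestors; the codes and vectors below are hypothetical:

```python
def ancestors(code, parent):
    """All ancestors of a code in the ontology, including itself."""
    chain = [code]
    while code in parent:
        code = parent[code]
        chain.append(code)
    return chain

def code_embedding(code, parent, base_embeddings):
    """Embed a medical code as the average of its own and its
    ancestors' base embeddings. Attention-based methods learn these
    weights; uniform averaging is used here for clarity."""
    chain = ancestors(code, parent)
    dim = len(next(iter(base_embeddings.values())))
    return tuple(
        sum(base_embeddings[c][d] for c in chain) / len(chain)
        for d in range(dim)
    )

# Hypothetical three-level hierarchy: root -> cardiovascular -> I21
parent = {"I21": "cardiovascular", "cardiovascular": "root"}
base = {"I21": (3.0,), "cardiovascular": (6.0,), "root": (0.0,)}
print(code_embedding("I21", parent, base))
```

Mixing ancestor information in this way lets rare leaf codes borrow statistical strength from their better-observed parents.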

EHRs also have underlying spatial and/or temporal dependencies [ 206 ] that many methods have recently taken advantage of to perform time-dependent prediction tasks. A mixed pooling multi-view self-attention autoencoder has generated patient representations for predicting either a patient’s risk of developing a disease in a future visit, or the diagnostic codes of the next visit [ 207 ]. A combined LSTM and GNN model has also been used to represent patient status sequences and temporal medical event graphs, respectively, to predict future prescriptions or disease codes ( Section 2.2 ) [ 208 , 209 ]. Alternatively, a patient graph may be constructed based on the similarity of patients, and patient embeddings learned by an LSTM-GNN architecture are optimized to predict patient outcomes [ 210 ]. An ST-GCN [ 78 ] is designed to utilize the underlying spatial and temporal dependencies of EHR data for generating patient diagnoses [ 187 ].

EHRs are often supplemented with other modalities, such as diseases, symptoms, molecular data, and drug interactions [ 7 , 206 , 211 , 212 ]. A probabilistic KG of EHR data, which includes medical history, drug prescriptions, and laboratory examination results, has been used to consider the semantic relations between EHR entities in a shallow network embedding method ( Section 2.1 ) [ 213 ]. Meta-paths may alternatively be exploited in an EHR-derived KG to leverage higher-order, semantically important relations for disease classification [ 214 ]. Node features for drugs and diseases can be initialized using Skipgram, and then a GNN leveraging multi-layer message passing can be applied to predict adverse drug events [ 211 ]. Moreover, combined RNN and GNN models have been applied to EHR data integrated with drug and disease interactions to better recommend medication combinations [ 215 ].

As graph representation learning has aided in the mapping of genotypes to phenotypes, leveraging graph representation learning for fine-scale mapping of variants is a promising new direction [ 217 ]. By re-imagining GWAS and expression Quantitative Trait Loci (eQTL) studies [ 218 ] as networks, we can already begin to discover biologically meaningful modules that highlight key genes involved in the underlying mechanisms of a disease [ 219 ]. We can alternatively seed network propagation with QTL candidate genes [ 217 ]. Additionally, because graphs can model long-range dependencies or interactions, we can model chromatin elements and the effects of their binding to regions across the genome as a network [ 220 , 221 ]. We could even reconstruct 3D chromosomal structures by predicting 3D coordinates for nodes derived from a Hi-C contact map [ 222 ]. Further, as spatial molecular profiling has enabled profound discoveries for diseases, the graph representation learning repertoire for analyzing such datasets will continue to expand. For instance, with dynamic GNNs, we may be able to better capture changes in expression levels observed in single-cell RNA sequencing data over time or as a result of a perturbation [ 155 , 223 ].

Effective integration of healthcare data with knowledge about molecular, genomic, disease-level, and drug-level data can help generate more accurate and interpretable predictions about the biological systems of health and disease [ 224 ]. Given the utility of graphs in both the biological and medical domains, there has been a major push to generate knowledge graphs that synthesize and model multi-scaled, multi-modal data, from genotype-phenotype associations to population-scale epidemiological dynamics. In public health, spatial and temporal networks can model space- and time-dependent observations (e.g., disease states, susceptibility to infection [ 225 ]) to spot trends, detect anomalies, and interpret temporal dynamics.

As artificial intelligence tools implementing graph representation learning algorithms are increasingly employed in clinical applications, it is essential to ensure that representations are explainable [ 226 ], fair [ 227 ], and robust [ 228 ], and that existing tools are revisited in light of algorithmic bias and health disparities [ 229 ].

Figure 1. Given a biomedical network, a representation learning method transforms the graph to extract patterns and leverage them to produce compact vector representations that can be optimized for the downstream task. The far right panel shows a local 2-hop neighborhood around node u , illustrating how information (e.g., neural messages) can be propagated along edges in the neighborhood, transformed, and finally aggregated at node u to arrive at u ’s embedding.

Figure 4. We present a case study on (a) cell-type aware protein representation learning via multilabel node classification (details in Box 2 ), (b) disease classification using subgraphs (details in Box 3 ), (c) cell-line specific prediction of interacting drug pairs via edge regression with transfer learning across cell lines (details in Box 4 ), and (d) integration of health data into knowledge graphs to predict patient diagnoses or treatments via edge regression (details in Box 5 ).

Learning multi-scale representations of proteins and cell types ( Figure 4a )

Graph dataset.

Activation of gene products can vary considerably across cells. Single-cell transcriptomic and proteomic data capture the heterogeneity of gene expression across diverse types of cells [ 134 , 135 ]. With the help of GNNs, we inject cell-type-specific expression information into our construction of cell-type-specific gene interaction networks [ 136 , 137 , 138 ]. To do so, we need a global protein interaction network [ 118 , 139 ].

Learning task.

On a global gene interaction network, we perform multilabel node classification to predict whether a gene is activated in a specific cell type based on single-cell RNA sequencing (scRNA-seq) experiments. In particular, if there are N cell types identified in a given experiment, each gene is associated with a vector of length N . Given the gene interaction network and label vectors for a select number of genes, the task is to train a model that predicts every element of the vector for a new gene such that predicted values indicate the probabilities of gene activation in various cell types ( Figure 4a ). To enable inductive learning, we split our nodes (i.e., genes) into train, validation, and test sets such that we can generalize to never-before-seen genes.
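The inductive node split described above can be sketched as follows (gene names and split fractions are illustrative):

```python
import random

def split_nodes(genes, fractions=(0.7, 0.15, 0.15), seed=0):
    """Split genes into train/validation/test sets so the model is
    evaluated on never-before-seen genes (inductive setting)."""
    rng = random.Random(seed)
    shuffled = genes[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(fractions[0] * n)
    n_val = int(fractions[1] * n)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )

# Hypothetical gene universe; each gene would also carry a length-N
# label vector of cell-type activation probabilities.
genes = [f"gene{i}" for i in range(20)]
train, val, test = split_nodes(genes)
print(len(train), len(val), len(test))
```

Because test genes never appear during training, good test performance indicates the model has learned transferable structure rather than memorized node identities.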

Generating gene embeddings that consider differential expression at the cell type level enables predictions at a single cell resolution, with considerations for factors including disease/cell states and temporal/spatial dependencies [ 136 , 140 ]. Implications of such cell-type aware gene embeddings extend to cellular function prediction and identification of cell-type-specific disease features [ 138 ]. For example, quantifying ligand-receptor interactions using single cell expression data has elucidated intercellular interactions in tumor microenvironments (e.g., via CellPhoneDB [ 141 ] or NicheNet [ 142 ]). In fact, upon experimental validation of the predicted cell-cell interactions in distinct spatial regions of tissues and/or tumors, these studies have demonstrated the importance of spatial heterogeneity in tumors [ 143 ]. Further, unlike most non-graph based methods, like autoencoders, GNNs are able to model dependencies (e.g., physical interactions) between proteins as well as single-cell expression [ 144 , 145 ].

Learning representations of diseases and phenotypes ( Figure 4b )

Symptoms are observable characteristics that typically result from interactions between genotypes. Physicians utilize a standardized vocabulary of symptoms, i.e., phenotypes, to describe human diseases. Thus, we model diseases as collections of associated phenotypes to diagnose patients based on their presenting symptoms. Consider a graph built from the standardized vocabulary of phenotypes, e.g., the Human Phenotype Ontology [ 3 ] (HPO). The HPO forms a directed acyclic graph with nodes representing phenotypes and edges indicating hierarchical relationships between them; however, it is typically treated as an undirected graph in most implementations of GNNs. A disease described by a set of its phenotypes corresponds to a subset of nodes in the HPO, forming a subgraph of the HPO. Note that a subgraph can contain many disconnected components dispersed across the entire graph [ 104 ].

Given a dataset of HPO subgraphs and disease labels for a select number of them, the task is to generate an embedding for every subgraph and use the learned subgraph embeddings to predict the disease most consistent with the set of phenotypes that the embedding represents [ 104 ] ( Figure 4b ).
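A minimal sketch of embedding such a phenotype subgraph, using mean pooling over hypothetical node embeddings in place of the learned subgraph encoders used in practice:

```python
def subgraph_embedding(phenotypes, node_embeddings):
    """Embed a disease (a set of HPO phenotype nodes) by averaging
    the embeddings of its member nodes; published methods use richer
    pooling, but mean pooling illustrates the idea."""
    vectors = [node_embeddings[p] for p in phenotypes]
    dim = len(vectors[0])
    return tuple(
        sum(v[d] for v in vectors) / len(vectors) for d in range(dim)
    )

# Hypothetical phenotype node embeddings (codes are placeholders)
node_embeddings = {
    "HP:0001": (1.0, 0.0),
    "HP:0002": (0.0, 1.0),
    "HP:0003": (1.0, 1.0),
}
disease = {"HP:0001", "HP:0003"}
print(subgraph_embedding(disease, node_embeddings))
```

The resulting disease vector can then feed a classifier that maps phenotype sets to the most consistent diagnosis.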

Modeling diseases as rich graph structures, such as subgraphs, enables a more flexible representation of diseases than relying on individual nodes or edges. As a result, we can better resolve complex phenotypic relationships and improve differentiation of diseases or disorders.

Learning representations of drugs and drug combinations ( Figure 4c )

Combination therapies are increasingly used to treat complex and chronic diseases. However, it is experimentally intensive and costly to evaluate whether two or more drugs interact with each other and lead to effects that go beyond additive effects of individual drugs in the combination. Graph representation learning is well-equipped to leverage perturbation experiments performed across cell lines to predict the response of never-before-seen cell lines with mutation(s) of interest (e.g., disease-causing) to drug combinations. Consider a multimodal network of protein-protein, protein-drug, and drug-drug interactions where nodes are proteins and drugs, and edges of different types indicate physical contacts between proteins, the binding of drugs to their target proteins, and interactions between drugs (e.g., synergistic effects, where the effects of the combination are larger than the sum of each drug’s individual effect) [ 182 , 183 ]. Such a multimodal drug-protein network is constructed for every cell line, yielding a collection of cell line specific networks ( Figure 4c ) [ 183 ].
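Such a multimodal network can be represented minimally as a list of typed edges; the entities below are placeholders, not real drugs or proteins:

```python
# A minimal typed-edge representation of the multimodal network:
# each edge records its relation type. Entities and relations are
# illustrative placeholders only.
edges = [
    ("proteinA", "proteinB", "protein-protein"),
    ("drug1", "proteinA", "drug-target"),
    ("drug1", "drug2", "drug-drug"),
]

def neighbors_by_relation(node, relation):
    """All nodes connected to `node` through edges of the given type."""
    found = set()
    for u, v, rel in edges:
        if rel != relation:
            continue
        if u == node:
            found.add(v)
        elif v == node:
            found.add(u)
    return found

print(neighbors_by_relation("drug1", "drug-target"))
```

Per-cell-line variants of this edge list would then be encoded and decoded as described next, with drug-drug edges serving as the prediction targets.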

From a single cell line’s drug-protein network, we predict whether two or more drugs are interacting in the cell line [ 183 ]. Concretely, we embed nodes of a drug-protein network into a compact embedding space such that distances between node embeddings correspond to similarities of nodes’ local neighborhoods in the network. We then use the learned embeddings to decode drug-drug edges, and predict probabilities of two drugs interacting based on their embeddings. Next, we apply transfer learning to leverage the knowledge gained from one cell line specific network to accelerate the training and improve the accuracy across other cell line specific networks ( Figure 4c ) [ 184 ]. Specifically, we develop a model using one cell line’s drug-protein network, “reuse” the model on the next cell line’s drug-protein network, and repeat until we have trained on drug-protein networks from all cell lines.
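The edge-decoding step described above can be sketched as follows. This is a hedged illustration, not the method of [183]: the embeddings are hypothetical placeholders, and the decoder shown is a simple inner product passed through a sigmoid, which is one common choice.

```python
import math

# Hypothetical learned embeddings from one cell line's drug-protein network.
embeddings = {
    "drug_A": [0.9, 0.1, 0.4],
    "drug_B": [0.8, 0.2, 0.5],
    "drug_C": [-0.7, 0.9, -0.3],
}

def interaction_probability(drug1, drug2):
    """Decode a drug-drug edge: sigmoid of the embedding inner product, so
    drugs with similar neighborhood-derived embeddings score higher."""
    score = sum(a * b for a, b in zip(embeddings[drug1], embeddings[drug2]))
    return 1.0 / (1.0 + math.exp(-score))

p_ab = interaction_probability("drug_A", "drug_B")
p_ac = interaction_probability("drug_A", "drug_C")
print(p_ab > p_ac)  # True: A and B have similar embeddings
```

In the transfer-learning setting, the same decoder would be reused while the encoder is fine-tuned on each subsequent cell line's network.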

Not only are non-graph-based methods ill-suited to capturing topological dependencies between drugs and targets; most predictive models for drug combinations also do not consider tissue or cell-line specificity of drugs. Because drugs’ effects on the body are not uniform, it is crucial to account for such anatomical differences. Further, the ability to prioritize candidate drug combinations in silico could reduce the cost of developing and testing them experimentally, thereby enabling robust evaluation of the most promising combinatorial therapies.

Fusing personalized health information with knowledge graphs ( Figure 4d )

To realize precision medicine, we need robust methods that can inject biomedical knowledge into patient-specific information to produce actionable and trustworthy predictions [ 216 ]. Since EHRs can also be represented by networks, we are able to fuse patients’ EHR networks with biomedical networks, thus enabling graph representation learning to make predictions on patient-specific features. Consider a knowledge graph, where nodes and edges represent different types of bioentities and their various relationships, respectively. Examples of relations may include “up-/down-regulate,” “treats,” “binds,” “encodes,” and “localizes” [ 7 ]. To integrate patients into the network, we create a distinct meta node to represent each patient, and add edges between the patient’s meta node and its associated bioentity nodes ( Figure 4d ).
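The patient-fusion step can be sketched with a minimal edge-list view of a knowledge graph. Entity names and the `associated_with` relation are illustrative placeholders; the real knowledge graph in [7] uses its own curated relation vocabulary.

```python
# A minimal edge-list view of a biomedical knowledge graph; relation types
# and bioentity names follow the examples in the text but are illustrative.
knowledge_graph = [
    ("gene_BRCA1", "encodes", "protein_BRCA1"),
    ("drug_olaparib", "binds", "protein_PARP1"),
    ("drug_olaparib", "treats", "disease_breast_cancer"),
]

def add_patient(kg, patient_id, bioentities):
    """Fuse a patient into the knowledge graph: create one meta node and
    link it to each bioentity observed in the patient's record (e.g., EHR)."""
    return kg + [(patient_id, "associated_with", entity) for entity in bioentities]

fused = add_patient(knowledge_graph, "patient_001",
                    ["gene_BRCA1", "disease_breast_cancer"])
print(len(fused))  # 5 edges: 3 original plus 2 patient links
```

Downstream, the patient meta node is embedded like any other node, and candidate patient-disease or patient-drug edges are scored by an edge decoder.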

Learning tasks.

We learn node embeddings for each patient while predicting (via edge regression) the probability of a patient developing a specific disease or of a drug effectively treating the patient ( Figure 4d ) [ 7 ].

Precision medicine requires an understanding of patient-specific data as well as the underlying biological mechanisms of disease and healthy states. Most networks do not consider patient data, which can prevent robust predictions of patients’ conditions and potential responsiveness to drugs. The ability to integrate patient data with biomedical knowledge can address such issues.

Supplementary Material

Supplementary Information

Acknowledgements

We gratefully acknowledge the support of NSF under Nos. IIS-2030459 and IIS-2033384, US Air Force Contract No. FA8702-15-D-0001, Harvard Data Science Initiative, Amazon Research Award, Bayer Early Excellence in Science Award, AstraZeneca Research, and Roche Alliance with Distinguished Scientists Award. M.M.L. is supported by T32HG002295 from the National Human Genome Research Institute and a National Science Foundation Graduate Research Fellowship. Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the funders. The authors declare that there are no conflicts of interests.

Graph Representation Learning for Interactive Biomolecule Systems




Open access | Published: 15 February 2024

Graph neural network recommendation algorithm based on improved dual tower model

  • Qiang He 1 ,
  • Xinkai Li 2 &
  • Biao Cai 2 , 3  

Scientific Reports volume 14, Article number: 3853 (2024)


In this era of information explosion, recommendation systems play a key role in helping users to uncover content of interest among massive amounts of information. Pursuing a breadth of recall while maintaining accuracy is a core challenge for current recommendation systems. In this paper, we propose a new recommendation algorithm model, the interactive higher-order dual tower (IHDT), which improves current models by adding interactivity and higher-order feature learning between the dual tower neural networks. A heterogeneous graph is constructed containing different types of nodes, such as users, items, and attributes, extracting richer feature representations through meta-paths. To achieve feature interaction, an interactive learning mechanism is introduced to inject relevant features between the user and item towers. Additionally, this method utilizes graph convolutional networks for higher-order feature learning, pooling the node embeddings of the two towers to obtain enhanced final user and item representations. IHDT was evaluated on the MovieLens dataset and outperformed multiple baseline methods. Ablation experiments verified the contribution of the interactive learning and higher-order GCN components.

Introduction

Recommendation systems 1 , 2 have an important role in helping users uncover interesting content from massive amounts of information in the context of the current information explosion era. Early methods were based on collaborative filtering 3 and mainly relied on matrix decomposition 4 , 5 , 6 , 7 , 8 , such as hybrid algorithms based on Probs and Heats calculation modes 6 . However, collaborative filtering is based on users’ historical behavioral information and cannot effectively model auxiliary information such as social relationships and product attributes. Subsequently, researchers incorporated content features to overcome the cold start problem 9 , 10 , 11 .

Most early research efforts focused on homogeneous networks composed of nodes and edges of the same type. For example, Perozzi et al. 12 proposed the DeepWalk model, which combines random walks with the skip-gram model 13 . Subsequently, Grover et al. 14 proposed depth-first and breadth-first walk strategies to capture different network structure information by improving the walk strategy of DeepWalk; both strategies are used in the Node2Vec model. The LINE model proposed by Tang et al. 15 defines first-order and second-order similarities to learn node representations of large-scale sparse networks. However, these three models are shallow, and the representations they generate are suboptimal because the captured nodes are too close together and dominated by local information. Graph neural network (GNN) models 16 , 17 , 18 , 19 , 20 , 21 , 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 , 30 , 31 , 32 , 33 , 34 , 35 , 36 , 37 , 38 , 39 have recently brought new opportunities to recommendation systems, as they can explicitly model high-order user–product interactions to enrich representations 16 , 17 . For example, Berg et al. 18 designed a recommendation method based on a graph autoencoder through message passing and aggregation. Wang et al. 19 showed that a spatial GNN can be superior to traditional collaborative filtering such as NCF 20 . Sun et al. 21 argued that a simple aggregation mechanism could not effectively utilize neighbor information, so they designed neighbor-interactive aggregation.

The core challenge that current recommendation systems face is pursuing recall breadth without sacrificing accuracy. Due to its high efficiency in large-scale candidate record screening, the two-tower neural network model 40 , 41 , 42 , 43 , 44 has received widespread attention. However, its user and product tower are trained independently and cannot effectively model the interaction between features, which results in poor recommendation accuracy. Recently, GNN models have been successfully used in recommendation systems, achieving significant performance improvements through high-order feature interaction.

We propose an interactive higher-order dual tower model (IHDT) that combines the speed advantage of the dual tower model with the accuracy advantage of GNNs. The model is built on a heterogeneous user–product graph and uses an interactive learning mechanism to inject product (user) features related to users (products) into the corresponding encoder. It then uses higher-order feature expression based on graph convolutional networks to aggregate multi-order neighbor information of nodes and enhance the vector representations of users and products. Finally, the inner product between the augmented representations is calculated as the predicted value of the recommendation system. The effectiveness of the model was verified on public datasets, and the results show that IHDT achieves state-of-the-art performance comparable to multiple powerful benchmarks. The IHDT model thus offers interactive modeling of users and products together with graph-based higher-order feature learning, and as such provides new ideas for large-scale recommendation systems that consider both accuracy and diversity.

Problem definition

Real-world data often contain multiple types of objects and interactions, which makes them difficult to model with homogeneous information networks, since representation learning on a homogeneous network captures only part of the features. These heterogeneous data can, however, be naturally modeled with heterogeneous information networks, in which multiple types of objects and interactions coexist, carrying rich structural and semantic information. The relevant definitions are given below.

Definition 1

Information Network . An information network is represented as \({\text{G}}=({\text{V}},{\text{E}},\varphi ,\psi )\) , consisting of a set of objects V, a set of links E, a mapping function of object types \(\varphi :{\text{V}}\to {\text{A}}\) , and a mapping function of relationship types \(\psi :{\text{E}}\to {\text{R}}\) . A is the set of object types, and R is the set of relationship types.

Definition 2

Homogeneous/Heterogeneous Information Network . An information network is heterogeneous if the number of object types is \(|{\text{A}}|>1\) or the number of relationship types is \(|{\text{R}}|>1\) . Otherwise, it is called a homogeneous information network.

A simple heterogeneous information network is shown in Fig.  1 . It contains three types of nodes: user, item, and brand, and the relationships between them.

figure 1

Heterogeneous information network.

Definition 3

Network Model . The network model 45 can be represented as \({{\text{T}}}_{{\text{G}}}=({\text{A}},{\text{R}})\) . It is a directed graph with object type A as nodes and relationship type R as edges.

Definition 4

Meta-path . A meta-path is a path defined on the network model \({{\text{T}}}_{{\text{G}}}=({\text{A}},{\text{R}})\) over object types A and relationship types R, expressed as \({{\text{A}}}_{1}{\to }^{{{\text{R}}}_{1}}{{\text{A}}}_{2}{\to }^{{{\text{R}}}_{2}}...{{\to }^{{{\text{R}}}_{{\text{l}}}}{\text{A}}}_{{\text{l}}+1}\) , representing a composite relationship between \({\text{A}}_{1}\) and \({{\text{A}}}_{{\text{l}}+1}\) .

Definition 5

Recommendation Base Graph . Traversing nodes and edges in the i-th meta-path pattern forms a subgraph, \({G}_{i}\) , and finally merging all the obtained subgraphs to form the recommendation base graph G, i.e., \({\text{G}}=\bigcup {G}_{i}\) .

In the heterogeneous information network, as shown in Fig.  2 , different meta-paths are selected for user u 1 to obtain different higher-order connectivity.

figure 2

Two different higher-order connectivity of \({u}_{1}\) . ( a ) Higher-order connectivity based on user–item meta-paths of \({u}_{1}\) ; ( b ) Higher-order connectivity based on user–attribute–item meta-paths of \({u}_{1}\) .

The IHDT model (see Fig.  3 for the architecture diagram) is based on interactive, high-order learning mechanisms to improve the dual tower model and the accuracy of the recommendation system. The IHDT model introduced in this paper learns node representations and applies them to recommendations in the following steps.

Select different-pattern meta-paths \({\Phi }_{i}\) in the data source to form different-pattern subgraphs \({G}_{i}\) and finally merge all subgraphs to form the recommendation base graph \(G\) (i.e., \({\text{G}}=\bigcup {G}_{i}\) ).

Use random initialization or a pre-training model to obtain different node initialization expressions of \(G\) ; the i-th user node is expressed as \({e}_{u}\) , the k-th attribute of this user is expressed as \({e}_{u}^{k}\) , the j-th item is expressed as \({e}_{i}\) , and the h-th attribute of this item is expressed as \({e}_{i}^{h}\) .

Using the interaction learning mechanism, the interaction mechanism expressions \({a}_{u}\) and \({a}_{v}\) of the user and item nodes are derived as the two inputs of the dual tower model.

Higher-order mechanism expression learning based on GCN converges and aggregates the multifaceted representations of nodes to obtain the final representations \({e}_{u}^{*}\) and \({e}_{i}^{*}\) of users and items.

Finally, the inner product of \({e}_{u}^{*}\) and \({e}_{i}^{*}\) is calculated and used as the final prediction value of the model.
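The five steps above can be sketched end-to-end for a single user-item pair. This is a hedged illustration of the data flow, not the trained model: the concatenation-based interaction injection, the single-layer towers, and the one-round mean-style aggregation standing in for the GCN are all simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8

# Step 2: randomly initialized embeddings for one user u and one item i.
e_u, e_i = rng.normal(size=d), rng.normal(size=d)

# Step 3 (assumed form): the interaction mechanism injects the other side's
# features, here by simple concatenation, before each tower.
z_u, z_i = np.concatenate([e_u, e_i]), np.concatenate([e_i, e_u])

# The two towers: one ReLU layer each (random placeholders for trained towers).
W_u, W_i = rng.normal(size=(2 * d, d)), rng.normal(size=(2 * d, d))
t_u, t_i = np.maximum(z_u @ W_u, 0), np.maximum(z_i @ W_i, 0)

# Step 4 (assumed form): one round of neighbor aggregation stands in for the
# stacked GCN layers, mixing each node's representation with its neighbor's.
e_u_star = t_u + 0.5 * t_i
e_i_star = t_i + 0.5 * t_u

# Step 5: the inner product of the final representations is the prediction.
score = float(e_u_star @ e_i_star)
print(score >= 0.0)  # True: both vectors are non-negative after the ReLU
```

In the full model, steps 4 and 5 operate over the whole recommendation base graph rather than a single pair, and the tower weights are learned.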

figure 3

Interactive higher-order dual tower recommendation model (IHDT).

The construction of the IHDT model, including implementing the interactive and high-order learning mechanisms, consists of the following steps.

Interactive learning

The framework of the proposed model is shown in Fig.  3 . User feature vector \({a}_{u}\) and product feature vector \(a_{v}\) are added to the user and product input terminals, respectively. The user and item embedding representations \(z_{u}^{i}\) , \(z_{v}^{j}\) \(\in {\mathbb{R}}^{1 \times d^{\prime}}\) under a specific user–item interaction are learned through the interaction mechanism G. The connection layer passes \({z}_{u}\in {\mathbb{R}}^{{\text{nu}}\times d}\) and \({z}_{v}\) \(\in {\mathbb{R}}^{{\text{nv}}\times d}\) into the user tower and product tower, which output the vector representations of users and products, \(e_{u}\) and \({\text{e}}_{i}\) :

where \(a_{u}\) and \(a_{v}\) are derived from the information captured by the interaction behaviors in the corpus.

High-order learning

The output vectors \({e}_{u}\) and \({e}_{i}\) of the dual tower model represent only first-order information about the current user and item (i.e., their rich feature information and the first-order interaction between them); higher-order information between users and items is not aggregated. Using \({e}_{u}\) and \({e}_{i}\) as the input of a GCN, first-order and higher-order propagation-aggregation rules are designed to obtain the final embedding representations of users and products, \({{\text{e}}}_{{\text{u}}}^{*}\) and \({{\text{e}}}_{{\text{i}}}^{*}\) . GCN propagation achieves this aggregation while preserving the richness of \({e}_{u}\) and \({e}_{i}\) and the resulting prediction accuracy:

where \({\text{W}}_{1} ,{\text{W}}_{2} \in {\mathbb{R}}^{{d^{\prime} \times d}}\) is the trainable weight matrix used to extract useful propagation information, and \(d^{\prime}\) is the transformation dimension, while \(p_{ui}\) is set as the Laplacian matrix of the graph, and \({\mathcal{N}}_{u}\) and \({\mathcal{N}}_{i}\) denote the first-order neighbors of user u and item i , respectively.

On top of first-order propagation and convergence, higher-order propagation and convergence are embedded into the propagation layers by stacking l. Users (and items) can receive messages propagated from their L-Hop neighbors. Then, the final embedding representation of the final l-th layer can be obtained:

where \({\text{E}}^{\left( l \right)} \in {\mathbb{R}}^{{\left( {N + M} \right)\times d_{l} }}\) is the representation of users and items after l layers of propagation and aggregation. \({\text{E}}^{\left( 0 \right)}\) is \({\text{E}} = \left[ {\underbrace {{{\text{e}}_{{{\text{u}}_{1} }} , \ldots ,{\text{e}}_{{{\text{u}}_{{\text{N}}} }} }}_{{\text{users embeddings}}},\;\underbrace {{{\text{e}}_{{{\text{i}}_{1} }} , \ldots ,{\text{e}}_{{{\text{i}}_{{\text{M}}} }} }}_{{\text{item embeddings}}}} \right]\) ; I is the identity matrix, and \(\mathcal{L}\) is the Laplacian matrix of the user–item bipartite graph.
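The layer-wise equation referenced here is rendered as an image in the published version. A plausible reconstruction (an assumption on our part, following the NGCF-style matrix form that this formulation closely mirrors, and reusing the \(W_1\), \(W_2\), \(I\), and \(\mathcal{L}\) defined in the surrounding text) is:

```latex
E^{(l)} = \sigma\!\left( \left(\mathcal{L} + I\right) E^{(l-1)} W_1^{(l)}
          \;+\; \mathcal{L}\, E^{(l-1)} \odot E^{(l-1)}\, W_2^{(l)} \right)
```

where \(\sigma\) is a nonlinear activation (LeakyReLU in NGCF) and \(\odot\) denotes the element-wise product.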

Model prediction

After propagation and convergence through l layers, the embedding representation of each user (item) is obtained at each layer, and the final user \({\mathbf{e}}_{u}^{*}\) and item \({\mathbf{e}}_{i}^{*}\) embedding representations are obtained by a simple join operation:

Then, preference prediction is performed by computing the inner product of the final embedding representations of users and items:

Finally, model optimization is performed based on the Bayesian personalized ranking (BPR) loss:

where \(O=\{(u,i,i^{-})\}\) denotes the training sample set, with \((u,i)\) an observed (positive) interaction and \((u,i^{-})\) a sampled negative interaction. Minimizing the above BPR loss function optimizes the parameters of the whole model end-to-end via backpropagation, and all parameters converge to fixed values as optimization proceeds.
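Since the loss formula itself appears as an image in the published version, here is a sketch of the standard BPR formulation it describes, \(-\sum \ln \sigma(\hat{y}_{ui} - \hat{y}_{ui^{-}})\), in plain Python:

```python
import math

def bpr_loss(scores_pos, scores_neg):
    """Bayesian personalized ranking loss over (u, i, i-) triples:
    -sum ln sigma(y_ui - y_ui-), which pushes each positive item's score
    above the paired negative item's score for the same user."""
    return -sum(
        math.log(1.0 / (1.0 + math.exp(-(p - n))))
        for p, n in zip(scores_pos, scores_neg)
    )

# Well-ranked pairs (positives scored above negatives) give a lower loss.
good = bpr_loss([2.0, 1.5], [0.1, -0.3])
bad = bpr_loss([0.1, -0.3], [2.0, 1.5])
print(good < bad)  # True
```

In practice the scores are the inner products \({\mathbf{e}}_{u}^{*}\cdot{\mathbf{e}}_{i}^{*}\) from the model, and the loss is minimized with a gradient-based optimizer.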

Overall process

According to the above, the overall training process of the IHDT model is as follows.


figure a

Training process of IHDT

Experimental setup

Dataset description.

The effectiveness of the proposed IHDT model was evaluated through experiments on two real-world benchmark datasets, MovieLens-1M and MovieLens-10M, provided by the GroupLens project at the University of Minnesota. Ratings are on a five-point scale, from 1 to 5, representing the user’s interest in a movie. Both datasets are publicly available and desensitized, and they vary in domain, size, and sparsity. The statistical information of the datasets is shown in Table 1 .

For each dataset, 90% of the historical interactions of each user were randomly selected to form the training set, and the rest were used as the test set. Each pair of training and test sets is complementary, and recombining them can yield the initial dataset. From the training set, 10% of the interactions were randomly selected as the validation set to tune the hyperparameters. Each observed user–item interaction was treated as a positive instance. Then, a negative sampling strategy was used to pair it with a negative item the user had not used before.
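The per-user split and negative sampling described above can be sketched as follows (a minimal stand-in with a toy interaction log; the real pipeline operates on the full MovieLens rating files):

```python
import random

random.seed(0)

# Hypothetical interaction log: user -> items the user has rated.
interactions = {
    "u1": ["i1", "i2", "i3", "i4", "i5", "i6", "i7", "i8", "i9", "i10"],
    "u2": ["i2", "i4", "i6", "i8", "i10", "i1", "i3", "i5", "i7", "i9"],
}
all_items = {f"i{k}" for k in range(1, 21)}

# 90% of each user's interactions go to the training set, the rest to test;
# the two sets are complementary per user.
train, test = {}, {}
for user, items in interactions.items():
    shuffled = items[:]
    random.shuffle(shuffled)
    cut = int(0.9 * len(shuffled))
    train[user], test[user] = shuffled[:cut], shuffled[cut:]

def sample_negative(user):
    """Pair an observed (positive) interaction with an unseen item."""
    return random.choice(sorted(all_items - set(interactions[user])))

neg = sample_negative("u1")
print(neg not in interactions["u1"])  # True
```

The 10% validation split for hyperparameter tuning would be drawn from `train` in the same per-user fashion.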

Evaluation metrics

For each user in the test set, we considered all items with which the user had no interaction as negative samples and items that the user had interactions with as positive samples. We utilized four commonly used performance evaluation metrics: precision, recall, normalized discounted cumulative gain (NDCG), and hit rate (HR).

We briefly introduce these metrics as follows.

Precision@K is used to evaluate the proportion of products related to the user among the top K products recommended to the user. It is calculated as follows:

where \({d}_{i}(K)\) is the intersection of the top \(K\) products recommended to the user and the products in the test set that the user interacts with.

Recall@K is used to evaluate the proportion of the products that are related to the user to those that are recommended to the user:

NDCG@K is used to evaluate the accuracy of the ranking results. Assuming that the length of the recommendation list is \(K\) , NDCG@K shows the gap between the ranking list and the real user interaction list. It is calculated as follows:

where \({r}_{k}=1\) indicates that the item at rank \(k\) is one the user favors; otherwise, \({r}_{k}=0\) . \({Z}_{k}\) is a normalization constant.

HR@K is a commonly used indicator to measure recall rate and is calculated as follows:

where \(N\) is the total number of users, and \(hits(i)\) indicates whether an item accessed by the i-th user appears in the recommended list: if yes, \(hits(i)\) equals 1; otherwise, it is 0.
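The four metrics above (whose formulas appear as images in the published version) follow their standard definitions, which can be sketched for a single user as:

```python
import math

def precision_at_k(recommended, relevant, k):
    """Fraction of the top-K recommendations the user interacted with."""
    return len(set(recommended[:k]) & relevant) / k

def recall_at_k(recommended, relevant, k):
    """Fraction of the user's relevant items recovered in the top-K list."""
    return len(set(recommended[:k]) & relevant) / len(relevant)

def ndcg_at_k(recommended, relevant, k):
    """DCG of the top-K list normalized by the ideal DCG (binary relevance)."""
    dcg = sum(1.0 / math.log2(pos + 2)
              for pos, item in enumerate(recommended[:k]) if item in relevant)
    ideal = sum(1.0 / math.log2(pos + 2)
                for pos in range(min(k, len(relevant))))
    return dcg / ideal

def hit(recommended, relevant, k):
    """hits(i) for one user: 1 if any relevant item is in the top-K."""
    return int(bool(set(recommended[:k]) & relevant))

recommended = ["i3", "i7", "i1", "i9"]
relevant = {"i3", "i1"}
print(precision_at_k(recommended, relevant, 4))  # 0.5
print(recall_at_k(recommended, relevant, 4))     # 1.0
```

HR@K then averages `hit` over all \(N\) users, and the other metrics are likewise averaged per user.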

The proposed IHDT was compared with the following top-k recommendation algorithms to demonstrate its effectiveness.

MF 4 . This is matrix factorization optimized by the Bayesian personalized ranking (BPR) loss, which uses only the direct user–item interaction as the objective value of the interaction function.

DMF 46 . This method is a matrix factorization model with a neural network architecture. A user–item matrix with explicit ratings and non-preference implicit feedback is constructed and used as input. A deep structure learning architecture is proposed to learn a generic low-dimensional space representing users and items.

GCMC 18 . This approach considers matrix completion for recommendation systems from a link prediction perspective, represented by a bipartite user–item graph with labeled edges indicating the observed ratings. A graph autoencoder framework is then proposed based on differentiable message passing on the bipartite interaction graph.

NeuMF 47 . This approach is an advanced neural CF model that uses multiple hidden layers above the element level and connections of user and item embeddings to capture their nonlinear feature interactions.

ConvNCF 48 . This method uses outer products to explicitly model pairwise correlations between embedding space dimensions. A more expressive and semantically sound 2-D interaction graph is obtained using the outer product on top of the embedding layer. On top of the interaction graph, a convolutional neural network (CNN) is used to learn higher-order correlations between the embedding dimensions.

NGCF 49 . This approach can explicitly encode the higher-order user–item interactions into the representation vector, effectively injecting collaborative signals into the embedding process in an explicit way to improve the representation and, thus, the overall recommendation.

LightGCN 50 . This method removes the feature transformation and nonlinear activation in GCN, making the model more concise and efficient. LightGCN retains only the neighbor aggregation operation of GCN, that is, layer-by-layer propagation and aggregation through the user–item interaction matrix. At the same time, a regularization term is introduced to penalize overly complex user and item representations and prevent overfitting. For efficiency, a sparse graph is constructed by discarding some connections to speed up training and inference, with two dropping strategies, random dropping and weighted sampling. LightGCN generally achieves a strong balance of accuracy and efficiency through this simple design: it retains the key mechanisms GCN requires while removing unnecessary components, which is its main advantage over other GCN models.

BM3* 51 . Unlike the previous methods, this is a multi-modal recommendation method. It designs a multi-modal contrastive loss that simultaneously optimizes three goals: reconstructing the user–item interaction graph, aligning learned features between different modalities, and reducing the dissimilarity between different augmented view representations from a specific modality. (*Note: the dataset we tested is a traditional recommendation-system dataset that does not provide much additional information for multi-modal learning, so the results may not be ideal. This test is only meant to illustrate that multi-modal recommendation models do not have an advantage on traditional recommendation tasks.)

Parameter setting

This study implemented the IHDT model in TensorFlow and PyTorch. The parameters of the improved dual tower model were randomly initialized in TensorFlow, with the number of connected layers set to two by default. The output vector of the dual tower model was used as the initialization parameter of IHDT in PyTorch. By default, the embedding dimension was set to 64, the normalization coefficient to 1e−5, the learning rate to 1e−4, the node dropout ratio to 0.1, the message dropout ratio to 0.1, and the number of layers to 3. The batch size was set to 1024 with 1000 epochs on the MovieLens-1M dataset, and to 4096 with 400 epochs on the MovieLens-10M dataset. Additionally, an early stopping policy was used: if the recall evaluation indicator did not improve within 50 consecutive epochs, training was stopped early.
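The early stopping rule can be sketched as a stand-alone check (the real implementation would run inside the training loop; `patience=3` below is only for illustration, versus 50 in the paper):

```python
def should_stop(recall_history, patience=50):
    """Stop when validation recall has not improved for `patience`
    consecutive epochs (the early stopping policy described above)."""
    if len(recall_history) <= patience:
        return False
    best_before = max(recall_history[:-patience])
    return all(r <= best_before for r in recall_history[-patience:])

# With patience=3: recall plateaus after epoch 3, so training stops.
history = [0.10, 0.15, 0.18, 0.17, 0.18, 0.16]
print(should_stop(history, patience=3))  # True
```

A late improvement resets the clock: if any of the last `patience` epochs beats the previous best, `should_stop` returns `False`.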

Experimental results and analysis

Comparison of experimental results.

Table 2 shows the results of Top-20 recommendations on the MovieLens-1M and MovieLens-10M datasets for IHDT and the other benchmark algorithms described in this paper. IHDT outperformed the comparison algorithms on all indicators, improving precision, recall, and NDCG alike.

Table 2 also shows that IHDT improved considerably over the benchmark algorithms, with NGCF performing well too. We speculate that this is due to the high sparsity of the MovieLens-10M dataset: both IHDT and NGCF can aggregate feature information from high-order neighbors, and IHDT additionally performs deep semantic interaction in the dual tower model, so it still performed well on sparser data. MF and DMF were less effective on sparse datasets. Meanwhile, although NeuMF can also aggregate neighbor information, the aggregated information from higher-order neighbors is insufficient, resulting in moderate performance.

Analysis of recommendation list length impact

After completing the basic experiments, we investigated the effect of recommendation list length on model performance by comparing the model under different list lengths on the same datasets. The results of this experiment, which increased the recommendation list length from 10 to 100 while keeping other variables constant, are as follows.

Figure  4 shows, as line graphs, the performance trends of the IHDT model on the MovieLens-1M and MovieLens-10M datasets for recommendation list lengths from 10 to 100. As the recommendation list length increased, both the recall and NDCG indicators trended upward, with recall increasing significantly, while the precision indicator decreased smoothly. This indicates that the algorithm continues to perform well as the recommendation list length grows.

figure 4

Performance under different datasets and recommended list lengths.

Parameter sensitivity study

When the IHDT model was applied to a recommendation system, the main parameters included the number of model layers, the dimensionality d of the user and item representation vector (embedding), the learning rate (lr), and the L2 regularization coefficient. Taking the Movielens-1M dataset as an example, this study investigated the parameter variation of the IHDT algorithm with different parameter values.

Number of model layers

The method in this paper fuses higher-order neighbor information from the 1st to the Kth order. To verify the influence of the selected order on the experimental results, different orders were tested on the MovieLens-1M dataset. The experimental results are shown in Table 3 .

Table 3 shows that when the model order is small, increasing it effectively improves performance: IHDT-3 performs much better than IHDT-1 and IHDT-2, because higher-order connections better capture the collaborative relationships between node messages and improve the node feature representations. When the order increases further, performance drops sharply; it is lowest when the order is 4. This may be due to overfitting caused by noisy information introduced into representation learning by an excessively deep architecture. It is therefore necessary to select an appropriate model order to maximize performance.

Dimensionality of user and item representation vectors

For the dimension selection of the final representation vector of users and items, the dimensions of the final representation vector were 32, 64, and 128. Table 4 shows the experimental results.

Based on the experimental data in Table 4 , the IHDT indicators do not trend monotonically upward as the dimensionality of the representation vector increases. The model performed best when the dimensionality was 64, indicating that the model needs sufficient dimensionality to encode the preferences of users and items, while its predictive ability decreases if the dimensionality is too low or too high.

Learning rate

The learning rate is a principal factor in the model’s performance. Too small a learning rate results in excessive convergence time or stalled training due to vanishing gradients; too large a learning rate risks overshooting the minimum and failing to converge. Therefore, this study used three learning rates, 1e−5, 3e−5, and 5e−5, to determine the sensitivity of this parameter. The experimental results are shown in Table 5 .

Table 5 shows that the model performed better when the learning rate was 1e−5. However, when the learning rate was 3e−5 or 5e−5, its performance decreased significantly, and although the training time was significantly reduced, the performance was poor.

L2 regularization factor

Deep learning models usually have superb prediction and fitting capabilities, and thus, they are prone to overfitting. Various regularization techniques can mitigate overfitting to some extent. We used L2 regularization to adjust the overfitting of the model. The experimental results are shown in Fig.  5 .

figure 5

IHDT model performance of Movielens-1M dataset under different regularization coefficients.

Figure  5 shows that the model achieved a better result when the regularization coefficient was 0.1. The model was robust in response to L2 regularization (i.e., the change in L2 regularization within a specific range did not significantly affect the model’s effectiveness), which also indirectly reflects the stability of the model.

Ablation experiments

The proposed model has two improvements over other dual tower recommendation models, based on constructing a heterogeneous recommendation base graph.

First, this paper proposes an interactive dual tower model: when the dual tower model is constructed, items (users) related to the user (item) are injected into the user (item) features for interaction, improving accuracy.

In this paper, we propose to improve the accuracy of the recommendation system through two design improvements. First, we use GNNs with higher-order learning mechanisms on the graph-structured data, which provide powerful feature extraction. Second, we fuse the information features obtained from the dual tower model as the initial state of the GNN for higher-order fusion, enriching the user and item information. To verify the effectiveness and rationality of these two design improvements, we conducted corresponding ablation experiments and compared the performance of the ablated and complete models. The ablation variants are as follows.

\({{\text{IHDT}}}_{noGCN}\): removes the GCN module, verifying the effect of the dual tower alone.

\({{\text{IHDT}}}_{noEV}\): removes the enhancement vector mechanism, verifying the effect of the traditional dual tower model.
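The two ablations can be made concrete with a toy scorer in which each mechanism is a switch. The aggregation below is a deliberate simplification: mean pooling stands in for both the paper's learned enhancement vectors and its GCN propagation, and all names are hypothetical.

```python
def mean_vec(vecs):
    # Element-wise mean of a non-empty list of equal-length vectors.
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def ihdt_score(u, i, user_emb, item_emb, u2i, i2u, use_ev=True, use_gcn=True):
    uv, iv = user_emb[u], item_emb[i]
    if use_ev:
        # Enhancement vectors: inject interacted items into the user side
        # (and interacting users into the item side) before scoring.
        uv = mean_vec([uv] + [item_emb[j] for j in u2i.get(u, [])])
        iv = mean_vec([iv] + [user_emb[v] for v in i2u.get(i, [])])
    if use_gcn:
        # One mean-aggregation propagation step standing in for the GCN.
        uv = mean_vec([uv] + [item_emb[j] for j in u2i.get(u, [])])
        iv = mean_vec([iv] + [user_emb[v] for v in i2u.get(i, [])])
    return sum(a * b for a, b in zip(uv, iv))

# IHDT_noGCN corresponds to use_gcn=False; IHDT_noEV to use_ev=False.
```

Disabling either switch removes one source of neighborhood information from the final score, which is the effect the ablation table quantifies.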

The ablation experiments used the Movielens-1M dataset. Table 6 compares the results of the improved model and the conventional model.

Table 6 shows that both variants of the IHDT model suffer degraded performance to different degrees. The interaction-based dual tower model obtains semantically richer user and item representations; the original dual tower model cannot access users' historical interaction behavior, leading to lower accuracy. In contrast, the higher-order connectivity of GNNs can weigh the importance of higher-order neighbors, propagate historical interaction information to them, and enhance the user and item representations, thus improving the accuracy of the recommendation results.

Results overview

We comprehensively compared the IHDT algorithm with multiple mainstream recommendation algorithms on the public MovieLens dataset. Our conclusions are as follows:

Accuracy and diversity. IHDT's precision and recall are at the first-tier level among all compared state-of-the-art models.

Response to long-tail demand. Judging from the hit-rate indicator, the IHDT algorithm better discovers long-tail items and serves unpopular preferences. This benefits from the modeling of high-order feature relationships and the mining of implicit user–item associations, which yield personalized recommendation results.

Robustness and adaptability. Parameter-sensitivity experiments show that IHDT achieves relatively stable performance, and the algorithm performs well across datasets of different sizes, making it easy to migrate to new applications and highly practical.

In summary, the combination of interactive twin towers with high-order feature representation proposed in this paper yields significant improvements in recommendation performance and application scalability.

Time cost analysis of introducing GCN high-order learning

The typical twin-tower model consists mainly of stacked fully connected layers and, owing to its space-for-time strategy, is very efficient. The time complexity of the DNN towers can be expressed as \(O\big(\sum_{i=1}^{L}(U D_{i,in} D_{i,out} + I D_{i,in} D_{i,out})\big)\), where \(U\) is the number of users, \(I\) is the number of items, \(D_{i,in}\) and \(D_{i,out}\) are the input and output representation dimensions of layer \(i\), and \(L\) is the number of fully connected layers. After introducing interactive learning and the GCN, the time complexity increases to \(O\big(L(U D^{2} + I D^{2} + E D)\big)\), where \(E\) is the number of interaction edges and \(D\) is the representation dimension. Usually, \(U D^{2} + I D^{2}\) and \(E D\) are of the same order of magnitude.
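As a rough sanity check of the same-order claim, one can plug in MovieLens-1M-sized quantities (the public dataset has about 6,040 users, 3,706 rated movies, and 1,000,209 ratings) with \(D = 64\), the best dimension from Table 4. The pairing of these particular numbers is our illustration, not a measurement from the paper.

```python
# Illustrative cost comparison for one propagation layer on
# MovieLens-1M-sized data (counts from the public dataset; D from Table 4).
U, I, E, D = 6040, 3706, 1_000_209, 64

node_term = U * D**2 + I * D**2   # per-node feature transforms: (U + I) * D^2
edge_term = E * D                 # message passing along interaction edges: E * D

ratio = edge_term / node_term     # ~1.6: the two terms are the same order
```

So at this scale the edge term neither dominates nor vanishes relative to the node transforms, consistent with the claim that the GCN's added cost stays within the same order of magnitude.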

This paper proposes a new GNN-based recommendation method, IHDT, which improves on the traditional twin-tower recommendation model through an interactive twin-tower structure and higher-order feature representation learning. Experimental results show that the IHDT algorithm significantly improves recommendation accuracy compared with multiple competitive benchmarks.

Despite these advances, several issues require further study. First, more comprehensive work is needed to compare the computational efficiency of the IHDT method with that of other methods. Second, the method is currently evaluated only on a movie recommendation dataset; applying IHDT to other recommendation domains (such as commodities and music) requires further research. Finally, further optimizing the choice of meta-paths and the user–item feature interaction is also an interesting research direction. Overall, this study provides valuable insights for improving graph-based recommendation systems, but more work is needed to advance the field.

Data availability

The dataset used in this paper is available at https://pan.baidu.com/s/1AEh-XNP_nJhxqrKuK0CgeA (password: a2b4).


Acknowledgements

This work was partially funded by NSFC [Grant 61802034], the Digital Media Science Innovation Team of CDUT [Grant 10912-kytd201510] and the Yibin campus major construction and educational reform of CDUT [22100-000047].

Author information

Authors and Affiliations

School of Mechanical and Electrical Engineering, Chengdu University of Technology, Chengdu, 610059, China

School of Computer Science and Cyber Security, Chengdu University of Technology, Chengdu, 610059, China

Xinkai Li & Biao Cai

College of Industrial Technology, Chengdu University of Technology, Yibin, 644000, China


Contributions

Q.H.: Conceptualization, Methodology, Formal analysis, Investigation, Resources. X.L.: Formal analysis, Investigation, Writing—original draft, Validation. B.C.: Conceptualization, Methodology, Project administration, Supervision, Investigation, Writing—review and editing, Resources.

Corresponding authors

Correspondence to Qiang He or Biao Cai .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article.

He, Q., Li, X. & Cai, B. Graph neural network recommendation algorithm based on improved dual tower model. Sci Rep 14 , 3853 (2024). https://doi.org/10.1038/s41598-024-54376-3


Received: 20 July 2023

Accepted: 12 February 2024

Published: 15 February 2024

DOI: https://doi.org/10.1038/s41598-024-54376-3


  • Recommendation
  • Dual tower model
  • Graph neural network
  • Collaborative filtering

