Google AI Blog: GraphWorld: Advances in Graph Benchmarking


Graphs are very common representations of natural systems that have connected relational components, such as social networks, traffic infrastructure, molecules, and the internet. Graph neural networks (GNNs) are powerful machine learning (ML) models for graphs that leverage their inherent connections to incorporate context into predictions about items within the graph or the graph as a whole. GNNs have been effectively used to discover new drugs, help mathematicians prove theorems, detect misinformation, and improve the accuracy of arrival time predictions in Google Maps.

A surge of interest in GNNs during the last decade has produced thousands of GNN variants, with hundreds introduced each year. In contrast, methods and datasets for evaluating GNNs have received far less attention. Many GNN papers re-use the same 5–10 benchmark datasets, most of which are constructed from easily labeled academic citation networks and molecular datasets. This means that the empirical performance of new GNN variants can be claimed only for a limited class of graphs. Compounding this issue are recently published works with rigorous experimental designs that cast doubt on the performance rankings of popular GNN models reported in seminal papers.

Recent workshops and conference tracks devoted to GNN benchmarking have begun addressing these issues. The recently introduced Open Graph Benchmark (OGB) is an open-source package for benchmarking GNNs on a handful of massive-scale graph datasets across a variety of tasks, facilitating consistent GNN experimental design. However, the OGB datasets are sourced from many of the same domains as existing datasets, such as citation and molecular networks. This means that OGB does not solve the dataset variety problem we mention above. Therefore, we ask: how can the GNN research community keep up with innovation by experimenting on graphs with the large statistical variance seen in the real world?

To match the scale and pace of GNN research, in “GraphWorld: Fake Graphs Bring Real Insights for GNNs”, we introduce a methodology for analyzing the performance of GNN architectures on millions of synthetic benchmark datasets. Whereas GNN benchmark datasets featured in academic literature are just individual “locations” on a fully diverse “world” of potential graphs, GraphWorld directly generates this world using probability models, tests GNN models at every location on it, and extracts generalizable insights from the results. We propose GraphWorld as a complementary GNN benchmark that allows researchers to explore GNN performance on regions of graph space that are not covered by popular academic datasets. Furthermore, GraphWorld is cost-effective, running hundreds of thousands of GNN experiments on synthetic data with less computational cost than one experiment on a large OGB dataset.

Illustration of the GraphWorld pipeline. The user provides configurations for the graph generator and the GNN models to test. GraphWorld spawns workers, each one simulating a new graph with diverse properties and testing all specified GNN models. The test metrics from the workers are then aggregated and stored for the user.

The Limited Variety of GNN Benchmark Datasets
To illustrate the motivation for GraphWorld, we compare OGB graphs to a much larger collection (5,000+) of graphs from the Network Repository. While the vast majority of Network Repository graphs are unlabelled, and therefore cannot be used in common GNN experiments, they represent a large space of graphs that are available in the real world. We computed two properties of the OGB and Network Repository graphs: the clustering coefficient (how interconnected nodes are to nearby neighbors) and the degree distribution Gini coefficient (the inequality among the nodes’ connection counts). We found that OGB datasets exist in a limited and sparsely populated region of this metric space.

The distribution of graphs from the Open Graph Benchmark does not match the larger population of graphs from the Network Repository.
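
The two graph properties described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the code used for the comparison: it assumes an undirected, unweighted graph given as an edge list, it interprets “clustering coefficient” as the average local clustering coefficient, and the function names (`build_adjacency`, `average_clustering`, `degree_gini`) are hypothetical.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs, ignoring self-loops."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def average_clustering(adj):
    """Average local clustering coefficient (assumed definition).

    For each node, this is the fraction of pairs of its neighbors that are
    themselves connected. Nodes with fewer than two neighbors contribute 0.
    """
    if not adj:
        return 0.0
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def degree_gini(adj):
    """Gini coefficient of the degree sequence (0 means all degrees are equal)."""
    degrees = sorted(len(nbrs) for nbrs in adj.values())
    n = len(degrees)
    total = sum(degrees)
    if n == 0 or total == 0:
        return 0.0
    # Standard closed form for sorted values x_1 <= ... <= x_n.
    weighted = sum((i + 1) * d for i, d in enumerate(degrees))
    return 2.0 * weighted / (n * total) - (n + 1) / n


if __name__ == "__main__":
    triangle = build_adjacency([(0, 1), (1, 2), (0, 2)])
    star = build_adjacency([(0, leaf) for leaf in range(1, 5)])
    print(average_clustering(triangle), degree_gini(triangle))
    print(average_clustering(star), degree_gini(star))
```

A triangle is fully clustered with perfectly equal degrees, while a star has no clustering and an unequal degree distribution, so the two toy graphs land in different corners of this two-metric space.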

Dataset Generators in GraphWorld
A researcher using GraphWorld to investigate GNN performance on a given task first chooses a parameterized generator (example below) that can produce graph datasets for stress-testing GNN models on the task. A generator parameter is an input that controls high-level features of the output dataset. GraphWorld uses parameterized generators to produce populations of graph datasets that are varied enough to test the limits of state-of-the-art GNN models.

For instance, a popular task for GNNs is node classification, in which a GNN is trained to infer node labels that represent some unknown property of each node, such as user interests in a social network. In our paper, we chose the well-known stochastic block model (SBM) to generate datasets for this task. The SBM first organizes a pre-set number of nodes into groups or “clusters”, which serve as node labels to be classified. It then generates connections between nodes according to various parameters that each control a different property of the resulting graph.

One SBM parameter that we expose to GraphWorld is the “homophily” of the clusters, which controls the likelihood that two nodes from the same cluster are connected (relative to two nodes from different clusters). Homophily is a common phenomenon in social networks in which users with similar interests (e.g., the SBM clusters) are more likely to connect. However, not all social networks have the same level of homophily. GraphWorld uses the SBM to generate graphs with high homophily (below on the left), graphs with low homophily (below on the right), and millions more graphs with any level of homophily in between. This allows a user to analyze GNN performance on graphs with all levels of homophily without depending on the availability of real-world datasets curated by other researchers.

Examples of graphs produced by GraphWorld using the stochastic block model. The left graph has high homophily among node classes (represented by different colors); the right graph has low homophily.
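
To make the role of the homophily parameter concrete, here is a minimal sketch of a stochastic block model sampler in plain Python. It is an illustration under stated assumptions, not GraphWorld's generator: clusters are assigned round-robin, homophily is expressed as a hypothetical `in_out_ratio` (the within-cluster edge probability divided by the between-cluster probability `p_out`), and `sample_sbm` and `edge_homophily` are made-up names. The parameterization in the paper may differ.

```python
import random


def sample_sbm(num_nodes, num_clusters, p_out, in_out_ratio, seed=0):
    """Sample a simple SBM graph.

    Nodes are assigned to clusters round-robin, and the cluster index doubles
    as the node label. Two nodes in different clusters are connected with
    probability p_out; two nodes in the same cluster are connected with
    probability p_out * in_out_ratio (capped at 1).
    """
    rng = random.Random(seed)
    labels = [i % num_clusters for i in range(num_nodes)]
    p_in = min(1.0, p_out * in_out_ratio)
    edges = []
    for u in range(num_nodes):
        for v in range(u + 1, num_nodes):
            p = p_in if labels[u] == labels[v] else p_out
            if rng.random() < p:
                edges.append((u, v))
    return labels, edges


def edge_homophily(labels, edges):
    """Fraction of edges whose two endpoints share a label."""
    if not edges:
        return 0.0
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


if __name__ == "__main__":
    # Sweep the hypothetical ratio from "no homophily" to "strong homophily".
    for ratio in (1.0, 2.0, 5.0, 10.0):
        labels, edges = sample_sbm(200, 4, p_out=0.02, in_out_ratio=ratio, seed=1)
        print(ratio, len(edges), round(edge_homophily(labels, edges), 3))
```

Sweeping a single parameter like this is what lets a generator cover graphs anywhere between the two extremes, instead of relying on whichever homophily levels happen to occur in curated datasets.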

GraphWorld Experiments and Insights
Given a task and parameterized generator for that task, GraphWorld uses parallel computing (e.g., Google Cloud Platform Dataflow) to produce a world of GNN benchmark datasets by sampling the generator parameter values. Simultaneously, GraphWorld tests an arbitrary list of GNN models (chosen by the user, e.g., GCN, GAT, GraphSAGE) on each dataset, and then outputs a massive tabular dataset joining graph properties with the GNN performance results.

In our paper, we describe GraphWorld pipelines for node classification, link prediction, and graph classification tasks, each featuring different dataset generators. We found that each pipeline took less time and fewer computational resources than state-of-the-art experiments on OGB graphs, which means that GraphWorld is accessible to researchers with low budgets.
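
The overall shape of such a pipeline (sample generator parameters, simulate a graph, score every model on it, emit one table row) can be sketched as follows. This is a toy illustration and not the GraphWorld code or API: it uses a local thread pool instead of Dataflow, a small SBM-style generator with assumed parameter ranges, and two trivial stand-in “models” (`neighbor_vote`, `random_guess`) in place of real GNNs. All names and ranges here are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def sample_params(rng):
    """Draw one generator configuration (ranges are arbitrary assumptions)."""
    return {
        "num_nodes": rng.randint(40, 80),
        "num_clusters": rng.randint(2, 4),
        "p_out": rng.uniform(0.02, 0.05),
        "in_out_ratio": rng.uniform(1.0, 10.0),
    }


def generate_graph(params, seed):
    """Simulate a small SBM-style graph; returns node labels and adjacency sets."""
    rng = random.Random(seed)
    n, k = params["num_nodes"], params["num_clusters"]
    labels = [i % k for i in range(n)]
    p_in = min(1.0, params["p_out"] * params["in_out_ratio"])
    adj = {u: set() for u in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            p = p_in if labels[u] == labels[v] else params["p_out"]
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return labels, adj


def neighbor_vote(labels, adj, rng):
    """Stand-in model: predict the most common true label among neighbors.

    This peeks at neighbor labels, so it is only a toy score, not a GNN.
    """
    correct = 0
    for u, nbrs in adj.items():
        if nbrs:
            votes = [labels[v] for v in nbrs]
            pred = max(set(votes), key=votes.count)
        else:
            pred = rng.choice(labels)
        correct += pred == labels[u]
    return correct / len(labels)


def random_guess(labels, adj, rng):
    """Stand-in baseline: guess a class uniformly at random for every node."""
    classes = sorted(set(labels))
    return sum(rng.choice(classes) == y for y in labels) / len(labels)


MODELS = {"neighbor_vote": neighbor_vote, "random_guess": random_guess}


def run_worker(seed):
    """One worker: sample parameters, simulate a graph, score every model."""
    rng = random.Random(seed)
    params = sample_params(rng)
    labels, adj = generate_graph(params, seed)
    num_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    row = dict(params, seed=seed, avg_degree=2 * num_edges / len(labels))
    for name, model in MODELS.items():
        row["accuracy_" + name] = model(labels, adj, rng)
    return row


def run_pipeline(num_graphs, max_workers=4):
    """Run workers in parallel and collect one table row per simulated graph."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_worker, range(num_graphs)))


if __name__ == "__main__":
    for row in run_pipeline(5):
        print(row)
```

Each row joins the sampled generator parameters and a measured graph property with one score per model, which is the kind of table that downstream analysis can slice by graph property.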

The animation below visualizes GNN performance data from the GraphWorld node classification pipeline (using the SBM as the dataset generator). To illustrate the impact of GraphWorld, we first map classic academic graph datasets to an xy-plane that measures the cluster homophily (x-axis) and the average of the node degrees (y-axis) within each graph (similar to the scatterplot above that includes the OGB datasets, but with different measurements). Then, we map each simulated graph dataset from GraphWorld to the same plane, and add a third z-axis that measures GNN model performance over each dataset. Specifically, for a particular GNN model (like GCN or GAT), the z-axis measures the mean reciprocal rank of the model against the 13 other GNN models evaluated in our paper, where a value closer to 1 means the model is closer to being the top performer in terms of node classification accuracy.

The animation illustrates two related conclusions. First, GraphWorld generates regions of graph datasets that extend well beyond the regions covered by the standard datasets. Second, and most importantly, the rankings of GNN models change when graphs become dissimilar from academic benchmark graphs. Specifically, the homophily of classic datasets like Cora and CiteSeer is high, meaning that nodes are well separated in the graph according to their classes. We find that as GNNs traverse toward the region of less-homophilous graphs, their rankings change quickly. For example, the comparative mean reciprocal rank of GCN moves from higher (green) values in the academic benchmark region to lower (red) values away from that region. This shows that GraphWorld has the potential to reveal critical headroom in GNN architecture development that would be invisible with only the handful of individual datasets that academic benchmarks provide.

Relative performance results of three GNN variants (GCN, APPNP, FiLM) across 50,000 distinct node classification datasets. We find that academic GNN benchmark datasets exist in GraphWorld regions where model rankings do not change. GraphWorld can discover previously unexplored graphs that reveal new insights about GNN architectures.
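
The mean reciprocal rank used for the z-axis can be sketched as follows. This is a minimal illustration, not the paper's evaluation code: it assumes every model has one accuracy per dataset, ranks models within each dataset by accuracy (rank 1 is best), and breaks ties by the order in which models are listed. The function name, model names, and accuracy values are all made up for the example.

```python
def mean_reciprocal_rank(results):
    """Average 1/rank of each model across datasets.

    `results` is a list with one dict per dataset, mapping a model name to its
    accuracy on that dataset. Every dict is assumed to contain the same models.
    A value of 1.0 means the model ranked first on every dataset.
    """
    if not results:
        return {}
    totals = {}
    for scores in results:
        # Highest accuracy first; sorted() is stable, so ties keep listing order.
        ordered = sorted(scores, key=scores.get, reverse=True)
        for rank, name in enumerate(ordered, start=1):
            totals[name] = totals.get(name, 0.0) + 1.0 / rank
    return {name: total / len(results) for name, total in totals.items()}


if __name__ == "__main__":
    # Made-up accuracies for three hypothetical models on two datasets.
    toy_results = [
        {"model_a": 0.9, "model_b": 0.8, "model_c": 0.7},
        {"model_a": 0.5, "model_b": 0.6, "model_c": 0.7},
    ]
    print(mean_reciprocal_rank(toy_results))
```

In the toy data, `model_a` ranks first on one dataset and last on the other, so its score sits well below 1. That is the kind of drop the animation shows for a model whose ranking changes across regions of graph space.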

GraphWorld breaks new ground in GNN experimentation by allowing researchers to scalably test new models on a high-dimensional surface of graph datasets. This allows fine-grained analysis of GNN architectures against graph properties on entire subspaces of graphs that are distal from Cora-like graphs and those in the OGB, which appear only as individual points in a GraphWorld dataset. A key feature of GraphWorld is its low cost, which enables individual researchers without access to institutional resources to quickly understand the empirical performance of new models.

With GraphWorld, researchers can also investigate novel random/generative graph models for more nuanced GNN experimentation, and potentially use GraphWorld datasets for GNN pre-training. We look forward to supporting these lines of inquiry with our open-source GraphWorld repository and follow-up projects.

GraphWorld is joint work with Brandon Mayer and Bryan Perozzi from Google Research. Thanks to Tom Small for visualizations.

