Paper: Benchmarking Graph Neural Networks


Paper -> (Preprint!!)

Probably many of you have read the paper already, but I still thought it might be interesting for people who missed it.
It compares many different Graph Networks and summarizes the effects of recent “advancements”.

I find it particularly interesting that it sort of supports the findings by:

which argue/show that the GCN introduced by Kipf and Welling is limited in its ability to learn sophisticated graph structures and fails to distinguish simple graphs.
While the original GCN was known to not provide the best results, I still think it is a model that remains highly influential and is still being used for graph-like structures.
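
A toy sketch of this kind of limitation, under loud assumptions: plain mean aggregation stands in for the GCN's degree-normalised aggregation (the two are not identical), every node carries the same feature, and the two star graphs and helper names are made up for illustration. It only shows why averaging neighbour features can hide structural differences that summing keeps.

```python
# Toy illustration only: mean aggregation is used as a stand-in for a
# GCN-style normalised aggregation (an assumption, not the exact GCN rule).

def aggregate(features, neighbors, mode):
    """One round of neighbour aggregation on scalar node features."""
    out = []
    for nbrs in neighbors:
        total = sum(features[n] for n in nbrs)
        out.append(total / len(nbrs) if mode == "mean" else total)
    return out


def embed(neighbors, mode):
    """Aggregate once, then pool node values into one graph-level number."""
    features = [1.0] * len(neighbors)  # identical features on every node
    aggregated = aggregate(features, neighbors, mode)
    pooled = sum(aggregated)
    return pooled / len(aggregated) if mode == "mean" else pooled


# Hypothetical example graphs as adjacency lists (node 0 is the centre).
star_two_leaves = [[1, 2], [0], [0]]
star_three_leaves = [[1, 2, 3], [0], [0], [0]]

print(embed(star_two_leaves, "mean"), embed(star_three_leaves, "mean"))  # 1.0 1.0
print(embed(star_two_leaves, "sum"), embed(star_three_leaves, "sum"))    # 4.0 6.0
```

With mean aggregation both graphs collapse to the same value, while sum aggregation separates them. Real models add learned weights and nonlinearities, but those cannot recover information the aggregation step has already discarded.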

The ZINC dataset they used could also be interesting for MoleculeNet/DeepChem. To my knowledge this dataset is not currently included here.

And maybe, if there is an updated MoleculeNet benchmarking planned, models such as the gated GNN or the GIN could be included as well.
Anyway, I think it is a really interesting read, especially for people who do not follow the development of Graph Neural Networks.


Thanks for sharing this paper! This is a really interesting suite of benchmarks. The ZINC benchmark they use seems to be for a constrained solubility task which would be useful to add to MoleculeNet.

For the next major update of MoleculeNet, it would definitely be worthwhile to benchmark the latest Graph Convolutional architectures. I’ve been playing a bit with PyTorch Geometric and have DGL on my list to look at too.

It would be really interesting to see how much progress has been made with newer GNN architectures.

I think the great thing about PyTorch Geometric and DGL is that they provide a consistent way of storing and manipulating graph data. So you do not need to rely on your own wonky ways to deal with graph structures.


While we are on the topic, this paper also makes an attempt to improve the way GNNs are benchmarked.


Is there any overlap between the models they test and the ones in DeepChem? It would be interesting to see how our models do. Especially since MPNNs were designed to be a general framework that includes lots of other published models as special cases.

In any case, I think there’s a lot of room for improvement in DeepChem’s graph models. Most of them don’t even let you control the number of layers. And allowing for residual layers would be very useful.
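
To make the two requests concrete, here is a minimal NumPy sketch of a graph-convolution stack where the number of layers is a parameter and residual connections can be switched on. None of this is DeepChem's API: the function names, the GCN-style normalisation, and the random weights are all assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical sketch, not DeepChem code: all names here are made up.

def normalized_adjacency(adjacency):
    """Add self-loops and apply symmetric degree normalisation."""
    a_hat = adjacency + np.eye(len(adjacency))
    inv_sqrt_deg = a_hat.sum(axis=1) ** -0.5
    return inv_sqrt_deg[:, None] * a_hat * inv_sqrt_deg[None, :]


def init_weights(in_dim, hidden_dim, num_layers, seed=0):
    """One weight matrix per layer; num_layers is fully user-controlled."""
    rng = np.random.default_rng(seed)
    dims = [in_dim] + [hidden_dim] * num_layers
    return [rng.normal(scale=0.1, size=(dims[i], dims[i + 1]))
            for i in range(num_layers)]


def gnn_forward(adjacency, features, weights, residual=True):
    """Apply each layer in turn, adding a skip connection when shapes match."""
    a_norm = normalized_adjacency(np.asarray(adjacency, dtype=float))
    h = np.asarray(features, dtype=float)
    for w in weights:
        update = np.maximum(a_norm @ h @ w, 0.0)  # ReLU(A_norm H W)
        if residual and update.shape == h.shape:
            h = h + update  # residual (skip) connection
        else:
            h = update
    return h


# Example: a 3-node path graph with 4 features per node and 3 layers.
path_graph = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
node_features = np.ones((3, 4))
layer_weights = init_weights(in_dim=4, hidden_dim=4, num_layers=3)
node_embeddings = gnn_forward(path_graph, node_features, layer_weights)
```

The skip connection is only applied when input and output widths agree, so the first layer can still change the feature dimension.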


I checked today, and I could not find any overlap.

They compare, for example, different variations of graph convolutional networks, but none of them is the one proposed by Duvenaud et al., which is the one MoleculeNet is using. The same is true for the MPNN of Gilmer et al.

For the other networks, such as the Weave models, I could not find anything either.

Seconding @peastman’s point, I definitely think that there’s a lot of room for improvement on the graph models. It would be useful to be able to control the number of layers and allow for residual layers.

One other axis for improvement is speed. Here’s a table from DGL’s repo benchmarking against the DeepChem graph convs:

[Screenshot of the benchmark table from the DGL repo]

It claims a 5.5x speed improvement for DGL graph convs (implemented in PyTorch) over the DeepChem graph convs. It would be really useful if we could speed up our graph convs to be as fast as theirs. Unfortunately their codebase is PyTorch so we can’t just call the DGL graph convs under the hood from DeepChem, but maybe we can adapt some of their tricks to Keras to get speedups.


It would not surprise me at all to learn the implementation in DeepChem isn’t done in the most efficient way! We should look at how the DGL version is implemented and whether that gives ideas for how to speed up the one in DeepChem.


Here’s another GNN comparison paper (ICLR 2020) that is somewhat pessimistic about GNNs being a significant improvement over baselines.


Hi, yes, I saw that paper as well.

In the paper I initially posted, they concluded that the TU datasets, which include ENZYMES, DD and PROTEINS, are suitable to evaluate GNNs.

By the way, the paper was updated yesterday, so the analysis of ENZYMES, DD and PROTEINS is now in the supplementary materials. Beyond that, they introduced new methods for improving GNN performance.
I find the usage of Laplacian Eigenvectors as Positional Encodings really interesting. It appears to be a good and easy option to increase expressiveness in GNNs.
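
For anyone who wants to try the idea, here is a rough NumPy sketch of Laplacian eigenvector positional encodings. It is not the paper's code: the function name, the choice of the symmetric normalised Laplacian, and the 4-node cycle example are my assumptions. The idea is to take the eigenvectors with the smallest non-trivial eigenvalues and attach them to the nodes as extra features.

```python
import numpy as np


def laplacian_positional_encoding(adjacency, k):
    """Return the k lowest non-trivial Laplacian eigenvectors, one row per node.

    Hypothetical helper: assumes an undirected, connected graph given as a
    dense adjacency matrix.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    degree = adjacency.sum(axis=1)
    inv_sqrt_deg = np.zeros_like(degree)
    nonzero = degree > 0
    inv_sqrt_deg[nonzero] = degree[nonzero] ** -0.5
    # Symmetric normalised Laplacian: L = I - D^{-1/2} A D^{-1/2}
    laplacian = (np.eye(len(adjacency))
                 - inv_sqrt_deg[:, None] * adjacency * inv_sqrt_deg[None, :])
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    _, eigenvectors = np.linalg.eigh(laplacian)
    # Column 0 belongs to eigenvalue 0 (the trivial eigenvector), so skip it.
    return eigenvectors[:, 1:k + 1]


# Example: a 4-node cycle, keeping 2 encoding dimensions per node.
cycle = np.array([[0, 1, 0, 1],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [1, 0, 1, 0]])
pe = laplacian_positional_encoding(cycle, k=2)
# pe has one row per node; it could be concatenated to the node features.
```

One caveat: each eigenvector is only defined up to sign (and up to rotation when eigenvalues repeat, as in this cycle), so a model using these encodings has to cope with that ambiguity.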


Interesting, another example of the parallel between attention and “soft adjacency matrix”.


While we are on the topic of expressivity, this paper was uploaded yesterday. I have not had time for a good read yet, but it also seems quite interesting.
