Questions about GraphConv layer

I am doing the tutorial “Graph Convolutions For Tox21”, but I’m having a hard time understanding the crucial parts of the algorithm. Here are the two main questions I can’t find answers to:

  1. How does the GraphConv layer handle molecules of different structures as input (say, molecules passed in the same mini-batch)?
  2. What are the trainable parameters of the GraphConv layer, and how can they be the same for two molecules with different structures?

Regarding the first question:
I understand that each atom in a molecule has an assigned feature vector of the same length (at most 75), but the number of atoms differs from molecule to molecule (not to mention that each molecule has its own adjacency graph). So how are the inputs handled?

Regarding the second question, I checked the source code for GraphConv, and it seems there are only two trainable parameters, W_list and b_list (please correct me if I am wrong here). However, their sizes are governed by the parameters in_channels and out_channel, which seem to differ between atoms (please correct me if I am wrong here too). So how does it all work?


Regarding the first question: The input to the GraphConv model is a batch of atoms, together with information about the graph connecting them. Multiple molecules in the same batch are supported as unconnected sub-graphs. The GraphConv layers act only at the atom level: each one updates the features of an atom based on which other atoms it is connected to.
A molecule-level batch is generated by the GraphGather layer, which sums/averages the features of all atoms in each molecule. This outputs a kind of neural fingerprint for each molecule.
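A minimal NumPy sketch of this idea (not DeepChem’s actual implementation; the variable names here are illustrative): atoms from all molecules are concatenated into one atom-level batch, a per-atom membership index records which molecule each atom belongs to, and a gather step sums atom features per molecule.

```python
import numpy as np

n_feat = 4
# Molecule A has 3 atoms, molecule B has 2 atoms -> atom-level batch of 5.
atom_features = np.arange(5 * n_feat, dtype=float).reshape(5, n_feat)
membership = np.array([0, 0, 0, 1, 1])  # atom i belongs to molecule membership[i]

# A GraphConv-style step mixes only the features of connected atoms, so the
# two molecules never interact: adjacency lists stay within each sub-graph.
adjacency = [[1, 2], [0], [0], [4], [3]]  # neighbors of each atom
neighbor_sums = np.stack([atom_features[nbrs].sum(axis=0) for nbrs in adjacency])

# A GraphGather-style step sums atom features per molecule, producing one
# fingerprint per molecule (molecule-level batch of 2).
n_mols = membership.max() + 1
fingerprints = np.zeros((n_mols, n_feat))
np.add.at(fingerprints, membership, atom_features)
print(fingerprints.shape)  # (2, 4)
```

Because the two sub-graphs share no edges, running both molecules through one batch gives the same result as running them separately.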

The batch_size parameter is the number of molecules in the batch, and therefore the batch size of the output. The input batch size (at the atom level) is not fixed, since the number of atoms varies per molecule, but of course it will be larger than the output batch size.


Our apologies for the confusion! We’ve transitioned to a new docs site. The new docs are at https://deepchem.readthedocs.io/en/latest/models.html#graphconvmodel. Could you check whether the new docs answer your second question?

We’ll take down the old docs shortly (have some AWS permissions issues we’re figuring out).


Thank you for the explanation. It clarifies a lot.

Thank you for the docs. They are very detailed, but perhaps I just miss something while reading them.

Do GraphConv and GraphGather layers have any trainable parameters (that are updated during training of a network)?

If I understood correctly, in the original paper by Duvenaud et al., random weight matrices were used to produce neural fingerprints. So the fingerprints themselves were produced by matrices with fixed (non-updatable) values (please correct me if I am wrong here). Is it the same situation with GraphConv and GraphGather (i.e., they have no trainable parameters at all, but together produce a neural fingerprint for each molecule)?

GraphConv has trainable parameters (the weight matrices), but GraphGather does not, if I recall correctly. The Duvenaud paper did experiment with random weight matrices, but I believe their core results trained the parameters in a similar fashion to what we do.
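To sketch why the same weights work across molecules of different structures: the matrices are indexed by atom degree, not by atom or molecule identity, so their shapes depend only on in_channels/out_channels. A simplified NumPy illustration (hedged: this mirrors the spirit of W_list/b_list, not DeepChem’s exact code):

```python
import numpy as np

rng = np.random.default_rng(0)
in_channels, out_channels, max_degree = 4, 8, 3

# One weight matrix and bias per atom degree. Their shapes depend only on
# in_channels/out_channels, so the same parameters apply to every atom of
# that degree, in every molecule in the batch.
W_list = [rng.normal(size=(in_channels, out_channels)) for _ in range(max_degree + 1)]
b_list = [np.zeros(out_channels) for _ in range(max_degree + 1)]

def conv_step(atom_features, adjacency):
    """Update each atom using the weight matrix matching its degree."""
    out = np.empty((len(atom_features), out_channels))
    for i, nbrs in enumerate(adjacency):
        degree = len(nbrs)
        summed = atom_features[i] + atom_features[nbrs].sum(axis=0)
        out[i] = summed @ W_list[degree] + b_list[degree]
    return out

atoms = rng.normal(size=(5, in_channels))
adjacency = [[1, 2], [0], [0], [4], [3]]
print(conv_step(atoms, adjacency).shape)  # (5, 8)
```

Since a degree-2 atom in one molecule uses the same W_list[2] as a degree-2 atom in any other molecule, the layer’s parameter count is independent of molecular structure.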
