Summary of GSoC 2021: Make DeepChem more robust by implementing cutting-edge models and other features

Hey everyone! I am Atreya. I participated in Google Summer of Code 2021 with DeepChem. The primary goal of my project was to implement the Molecular Attention Transformer (MAT).

Main Contributions:

  1. Added a new featurizer: MATFeaturizer
  2. Added a new dataset to MoleculeNet: FreeSolv
  3. Added support for implementing and calling DeepChem layers written in PyTorch (deepchem.models.torch_models.layers).
  4. Implemented the Molecular Attention Transformer as a regression model for the FreeSolv dataset.

A quick summary of my progress can be found on the DeepChem forums.

Here is a list of PRs I sent for the project that have been merged or are currently under review:


There are also a few other utility-fix, documentation, and pre-GSoC PRs that I have not listed above.

Future Work

I plan to continue optimizing my contributions and adding a few more features. Currently, I plan to optimize the MAT model by converting it to a more 2-D-focused implementation to increase training speed.

Adding D-MPNN is also something I would really like to work on.


This was my first proper deep dive into open source, and it would be an understatement to say that I loved it. @bharath @Vignesh @hanig were really helpful and supportive, and I learnt a great deal from them. I have been able to expand my knowledge of both deep learning and chemistry, and my mentors' feedback has helped me become a better and more mindful developer.

I would like to thank everyone involved in the project, and I hope we continue to work together!


Update: I’ve closed the previous PR for the MATModel.

Fresh PR link:
