Attendees: Bharath, Peter, Vignesh
Summary: Bharath started by discussing his ongoing work on the design document for DeepChem 3.0. He is currently considering a two-way split into a MoleculeNet package and a DeepChem models package. The metalearning and reinforcement learning subpackages should likely be split out, and possibly removed if a good alternative system is already available.
Peter talked about his progress updating DeepChem to TensorFlow 2.X. So far, there haven’t been any major surprises. Many of the older Keras models work after the upgrade without modification. However, there are many small changes that need to be handled, and some of them are causing tricky issues that will need to be hunted down. For more details, see the discussion in the PR. There are also a few classes that have not been converted from TensorGraph to Keras. We’ve tentatively decided not to convert these classes: they appear either to be unused or, in some cases, to have open source alternatives already available. If you’d like to see these classes stay, please say so in the PR discussion!
We then started a discussion of relevant open source chemistry and data science tooling which has emerged in the last few years. An incomplete list includes:
- https://github.com/ericmjl/pyjanitor: Clean APIs for data cleaning. Python implementation of the R package janitor
- https://github.com/aiqm/torchani: Accurate Neural Network Potential on PyTorch
- https://github.com/atomistic-machine-learning/schnetpack: SchNetPack - Deep Neural Networks for Atomistic Systems
- https://github.com/wengong-jin/chemprop: Chemical Property Prediction with Graph Convolutional Networks
- https://github.com/MMunibas/PhysNet: PhysNet: A Neural Network for Predicting Energies, Forces, Dipole Moments and Partial Charges
- https://github.com/molecularsets/moses: Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models
The molecular machine learning ecosystem is now considerably more mature than it was when DeepChem first started. As part of writing the DeepChem 3.0 design doc, Bharath is planning to research these packages to understand what functionality they already provide. To the degree possible, we should add new value and avoid reinventing the wheel.
Vignesh has previously suggested that a PyTorch Geometric version of DeepChem could add a lot of value. Bharath asked Vignesh what features would be interesting in a PyTorch DeepChem package. One goal of the DeepChem 3.0 redesign should be to enable new cutting-edge research. Vignesh is currently busy with thesis work but will take a crack at putting together a simple document with specifications for a PyTorch Geometric DeepChem package over the coming weeks. This might double as his Google Summer of Code proposal.
It came up in our discussion that DeepChem hasn’t been used in cutting-edge research for some time. Ideally, DeepChem 3.0 should serve as a platform for useful novel research. Bharath is planning to brainstorm a few potential projects that DeepChem 3.0 could help bootstrap.