DeepChem Minutes 1/28/2021

bharath · February 4, 2021, 12:50am

DeepChem Minutes

India/Asia/Pacific Call

Date: January 28th, 2021
Attendees: Bharath, Alana, Peter
Summary: Bharath has been working on revamping the AtomicConvModel, which has fallen a bit by the wayside. The model appears to have been broken in the TensorFlow 1 -> TensorFlow 2 transition and needs to be fixed. Bharath has been experimenting with simplifying the model architecture to use only fully connected layers at the end.

Peter suggested that it might be worth just looking at SchNet, which seems to have become the standard architecture in the field for 3D models. Bharath agreed and said it might be a good next model to add on.

Peter has been working on updating the book examples to DeepChem 2.4.0 and is finding a lot of little issues along the way. In particular, Peter has found some serious problems in RDkitGridFeaturizer.

Alana asked if we could have a page up that lists which features are WIP and less stable. Perhaps a noticeboard on the documentation page. Peter agreed this might be a useful idea and Alana volunteered to get this started.

Alana has been working on getting the MARIA work moving, which is challenging because it takes a while to understand the layout of the codebase. Alana also reached out to Dr. Emma Pierson about her recent paper; Dr. Pierson was open to adding an MIT license to the codebase. The code has a large image processing module, as well as training and analysis code.

Americas/Europe/Africa/Middle East Call

Date: January 29th, 2021
Attendees: Bharath, Seyone, Stanley, Nathan, David
Summary: David is a 3rd-year PhD student at Harvard/MIT who’s working on computational drug discovery.

Bharath gave the same update as at the previous meeting.

Seyone has been working on updating the ChemBERTa tutorial. Based on feedback from Peter, Seyone is working to merge the two tutorial IPython notebooks in the WIP PR and condense them into one tutorial so it’s easier to read. Seyone has also been working on basic code for adding RDKit descriptor pretraining to ChemBERTa.

Stanley has been working with Keras Tuner for his company’s hyperparameter tuner. This weekend, Stanley might take a crack at Keras Tuner integration with DeepChem. Bharath suggested that this might be a nice tutorial on advanced hyperparameter tuning.

Nathan has just merged in his tutorial on protein-ligand docking (link). Nathan has also been chatting with the developers of gnina and, with their help, has gotten a build of gnina working on Google Colab.

David has been working on two repos, molpal and pyscreener, which respectively support active learning and running virtual screening/docking through Python calls. pyscreener provides a Python wrapper around docking libraries and also builds out integration with distributed computation settings. David is interested in potentially integrating the pyscreener infrastructure into DeepChem.

Stanley asked about the underlying infrastructure for pyscreener. David mentioned that mpi4py currently provides the infrastructure, but he is exploring swapping over to Ray as a more robust option. Stanley mentioned that he has had good experience with Dask and suggested it might also be useful.

Moving to general discussion, Bharath mentioned that he wants to revive the https://github.com/deepchem/deepchem-gui project, which created a GUI frontend for DeepChem, and asked if anyone knew of good JavaScript developers who might be interested in helping. Stanley mentioned that he might know someone who would be interested.

Bharath also mentioned Alana’s new issue, https://github.com/deepchem/deepchem/issues/2376, which tracks known pain points in DeepChem, and asked everyone to contribute to it.

Joining the DeepChem Developer Calls

As a quick reminder to anyone reading along, the DeepChem developer calls are open to the public! If you’re interested in attending either or both of the calls, please send an email to X.Y@gmail.com, where X=bharath, Y=ramsundar.