DeepChem Minutes 4/29/2021

India/Asia/Pacific Call

Date: April 22nd, 2021
Attendees: Vignesh, Samyak, Ashwin, Victor, Prabin
Summary: Victor is a software engineer who has worked on front-end and back-end technologies and is interested in learning more about machine learning.

Prabin is a machine learning engineer at a startup working on computer vision and NLP.

Bharath this week put up an onboarding document.

Vignesh had a slow week but started looking into joblib for parallelizing models. Bharath mentioned that Featurizer had partial joblib support at one point, but it was removed.

Samyak is exploring PyTorch Lightning support for DeepChem.

Ashwin was experimenting with HuggingFace models on the USPTO dataset. He tried to load the USPTO15K dataset, but pandas seemed to skip lines with an incorrect number of entries, so the dataset may need to be cleaned up.
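
One way to see which rows are affected is to count the fields on each line before handing the file to pandas. The following is a minimal sketch using only Python's standard `csv` module. The sample data, the column names, and the helper `find_malformed_rows` are hypothetical stand-ins for illustration; they are not the actual USPTO15K layout or anything from DeepChem.

```python
import csv
import io


def find_malformed_rows(handle, delimiter=","):
    """Return the header width and a list of (line_number, field_count)
    for every row whose field count differs from the header's."""
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    expected = len(header)
    bad = []
    for row in reader:
        if len(row) != expected:
            # reader.line_num is the 1-based number of the line just read.
            bad.append((reader.line_num, len(row)))
    return expected, bad


# Hypothetical stand-in data, not real USPTO records.
sample = io.StringIO(
    "id,reactants,product\n"
    "1,CCO.CC(=O)O,CCOC(C)=O\n"
    "2,CCN,CCNC(C)=O,extra\n"
    "3,c1ccccc1\n"
)

expected, bad_rows = find_malformed_rows(sample)
print(f"expected {expected} fields per row")
for line_number, count in bad_rows:
    print(f"line {line_number}: found {count} fields")
```

For a real file, the same helper could be called on `open(path, newline="")`. Recent pandas versions also accept `on_bad_lines="warn"` in `read_csv`, which reports skipped rows instead of dropping them silently; whether that option is available depends on the installed pandas version.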

Americas/Europe/Africa/Middle East Call

Date: April 23rd, 2021
Attendees: Peter, David, Omer, Nathan, Seyone
Summary: Omer is an engineer from Turkey who encountered the Deep Learning for the Life Sciences book. Douglas is a senior medicinal chemist with the Gates Foundation who works on AI-driven drug design.

Peter is just about finished with the updates to DeepChem. Unfortunately, one of the Chapter 11 examples has been causing issues. The error is documented at this issue, and the book repo is at https://github.com/deepchem/DeepLearningLifeSciences. The actual issue is in https://github.com/deepchem/DeepLearningLifeSciences/blob/master/Chapter11/chapter_11_02_erk2_graph_conv.ipynb.

David is working on a protein melting point tutorial. He has access to a subset of UniProt and will share new changes in the next week.

Nathan has been working on MoleculeNet this week. He made a documentation update (https://github.com/deepchem/deepchem/pull/2503) on contributing datasets using Peter’s new API and wrote a short tutorial for the MoleculeNet repo (https://github.com/deepchem/moleculenet/pull/39). Nathan hopes to write a blog post to accompany the tutorial.

Seyone has been working this week on a description, a couple of pages long, of how to integrate ChemBERTa models into DeepChem. Walid is working on parts of this integration as well, and Seyone is thinking about bringing datasets and other tools into DeepChem. Bharath suggested that we should try to get ChemBERTa into MoleculeNet.

Stanley has started running the first models on the Anyscale cloud. He likes the technology and has really enjoyed the interaction. He’s also gotten the thumbs-up from his team to share some more Ray code with DeepChem.

Bharath has been working on a new DeepChem onboarding doc, which should help new developers get started with DeepChem. Bharath mentioned that the discussion about DeepChem’s future directions might be a good doc for newcomers to read to understand where the project is headed.

Joining the DeepChem Developer Calls

As a quick reminder to anyone reading along, the DeepChem developer calls are open to the public! If you’re interested in attending either or both of the calls, please send an email to X.Y@gmail.com, where X=bharath, Y=ramsundar.