DeepChem Minutes 12/3/2020

bharath · December 10, 2020, 3:18am

DeepChem Minutes

India/Asia/Pacific Call

Date: December 3rd, 2020
Attendees: Bharath, Mufei, Sawood
Summary: Sawood works as a solution architect on enterprise Java and other technologies. He recently learned Python and is interested in learning more about data science. Bharath mentioned that the tutorials are a good resource, and Sawood said he has been looking through them.

Bharath has been dealing with a personal situation this week, so he hasn’t been able to work much on DeepChem. Things are getting better, so Bharath should be back to work next week.

Mufei has opened an initial PR 14 with a minimum viable product of the benchmark solutions for the MoleculeNet repo. Mufei saw Bharath’s comments and will respond. Mufei is also looking into setting up GitHub Pages for the leaderboard. After this first PR is merged, Mufei will open a new PR for a simple graph convolutional network. We can then scale to other datasets, and once we’re ready we can make an announcement online.

Americas/Europe/Africa/Middle East

Date: December 4th, 2020
Attendees: Bharath, Seyone, Vaijeyanthi, Hosein, Vignesh, Peter, Vasileios, James, Nathan
Summary: Bharath gave the same update as on the India/Asia/Pacific call.

Vasileios works at ExScientia AI, a tech company that works on AI methods for drug discovery.

Seyone was busy this week with college applications, but put up a two-part ChemBERTa tutorial PR that uses different tokenizers. Seyone got good feedback from Peter about expanding on certain portions and concepts, such as how the tokenization strategies work, attention, transformers, and masked-language modeling, and has been revamping the first cut. He plans to continue updating the tutorials this weekend.
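To make the masked-language modeling idea concrete, here is a minimal sketch in plain Python. It is not taken from the ChemBERTa tutorial: the character-level tokenizer, the `<mask>` string, and the helper names `char_tokenize` and `mask_tokens` are all assumptions made for illustration, and real BERT-style training uses a more involved masking scheme than this.

```python
import random
from typing import List, Tuple

MASK = "<mask>"  # hypothetical mask token, chosen for illustration


def char_tokenize(smiles: str) -> List[str]:
    """Split a SMILES string into single characters (a deliberately naive tokenizer)."""
    return list(smiles)


def mask_tokens(
    tokens: List[str], mask_prob: float = 0.15, seed: int = 0
) -> Tuple[List[str], List[int]]:
    """Replace each token with MASK with probability mask_prob.

    Returns the masked sequence and the positions that were masked,
    which are the positions a model would be trained to predict.
    """
    rng = random.Random(seed)  # seeded so the example is reproducible
    masked = list(tokens)
    positions = []
    for i in range(len(tokens)):
        if rng.random() < mask_prob:
            masked[i] = MASK
            positions.append(i)
    return masked, positions


tokens = char_tokenize("CC(=O)O")  # acetic acid
masked, positions = mask_tokens(tokens, mask_prob=0.3, seed=1)
print(tokens)
print(masked, positions)
```

A model trained with this objective sees the masked sequence and is asked to recover the original tokens at the masked positions.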

Vaijeyanthi has been learning DeepChem through the tutorials this week and has worked through about four of them.

Hosein has been familiarizing himself with the DeepChem libraries. Hosein wants to contribute generative models to DeepChem. He is thinking of adding functionality from the Moses library and will open an issue to coordinate.

Vignesh has been continuing to work on his PR for LCNNs. He ran into some mypy errors when handling heterogeneous key-value pair types. He’s handled this by using Any for now.
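For readers unfamiliar with this mypy issue, the sketch below shows the general pattern. It is not code from the LCNN PR: the `StructureRecord` type, its fields, and both helper functions are hypothetical. It contrasts annotating a mixed-value dictionary with Any, which silences mypy but gives up checking of the values, against a TypedDict (available in the standard typing module since Python 3.8), which is one possible stricter alternative.

```python
from typing import Any, Dict, List, TypedDict


# Option 1: Any. mypy accepts any value type, so mistakes in how the
# values are used are not caught statically.
def summarize_loose(record: Dict[str, Any]) -> str:
    return f"{record['name']} has {len(record['sites'])} sites"


# Option 2: TypedDict. Each key has its own declared value type, so mypy
# can check heterogeneous values key by key. (Hypothetical fields.)
class StructureRecord(TypedDict):
    name: str
    energy: float
    sites: List[int]


def summarize_typed(record: StructureRecord) -> str:
    return f"{record['name']} has {len(record['sites'])} sites"


example: StructureRecord = {"name": "slab", "energy": -1.5, "sites": [0, 1, 2]}
print(summarize_typed(example))
```

Using Any is a reasonable stopgap while a PR is in progress; a TypedDict only helps when the set of keys is known in advance.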

Peter has been continuing the project of updating the tutorial sequence. He put up a PR updating the reinforcement learning tutorial. Peter also put up a PR that should fix the PyTorch Geometric install commands.

James spent the past week studying graph neural networks and found a bug in the embedding function for the GraphConv model: the output shape is not equal to the input in the first dimension. James also found the output types argument of the prediction function unclear; it looks like the predict function is equivalent to the predict_embedding function if the output types are changed. The implementation is a little complex under the hood, and it’s difficult to figure out what’s happening.

Nathan made a few small fixes to the interaction fingerprint PR. That was the last component needed for the molecular docking tutorial so Nathan should be able to wrap that up soon and add it to the tutorial series. Nathan has been working more with OpenMM, MDTraj, and PDBFixer for handling PDB files. Nathan might make a new PDBLoader class in DeepChem to improve protein support.

Tyler is auditing this week.

James asked whether every graph network in DeepChem has the same API, since the current API is a little confusing.

Hosein mentioned the https://github.com/deepmind/jraph library is interesting to look at.

Joining the DeepChem Developer Calls

As a quick reminder to anyone reading along, the DeepChem developer calls are open to the public! If you’re interested in attending either or both of the calls, please send an email to X.Y@gmail.com, where X=bharath, Y=ramsundar.