DeepChem Minutes 1/21/2021

India/Asia/Pacific Call

Date: January 21st, 2021

Attendees: Bharath, Mufei, Alana, Peter

Summary: This week Bharath wrote the release notes for DeepChem, announced the release publicly, and then worked on post-2.4.0 cleanup and organization. In particular, Bharath opened an issue to refresh the DeepChem logo. Daiki did some excellent work automating our Docker release (PR), so we should be well positioned for the 2.5.0 release.

This week Mufei put together a PR making two submissions for the ClinTox benchmark. For ChEMBL and clearance, we’d need to make fixes first. Michael suggested using the LogTransformer for clearance. For ChEMBL, Peter suggested just using the full dataset, which may avoid the sparsity issues.
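As a rough illustration of why a log transform might help with a target like clearance, here is a minimal sketch in plain Python. It does not call DeepChem's `dc.trans.LogTransformer`. The clearance values are made up for illustration, and the log(1 + y) form is an assumption about the kind of transform meant here, not something confirmed on the call.

```python
import math

# Hypothetical clearance values spanning several orders of magnitude.
# These numbers are invented for illustration, not taken from any dataset.
clearance = [0.5, 3.0, 12.0, 150.0, 2400.0]


def log_transform(values):
    """Compress a skewed, non-negative target using log(1 + y)."""
    return [math.log1p(v) for v in values]


def inverse_log_transform(values):
    """Map transformed values (e.g. model predictions) back to the original scale."""
    return [math.expm1(v) for v in values]


transformed = log_transform(clearance)
restored = inverse_log_transform(transformed)

print("original range:   ", max(clearance) - min(clearance))
print("transformed range:", max(transformed) - min(transformed))
```

The transformed targets occupy a much narrower range, which tends to make regression losses less dominated by a few very large values. Predictions would need to go through the inverse transform before being compared on the original scale.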

Alana has been building her understanding of the MARIA algorithm. She’s uploaded the mass spec datasets and is working on loading them into DeepChem. Some filtering is likely needed, and she still has to figure out how to implement the RNN.

Peter wasn’t able to work much on DeepChem this week.

Americas/Europe/Africa/Middle East Call

Date: January 22nd, 2021

Attendees: Bharath, Nathan, (Seyone)

Summary: Nathan put up a new PR this week demonstrating how to use dc.dock to do protein-ligand docking. Nathan plans to address Bharath and Peter’s feedback this coming week.

Nathan chatted with David from the Coley lab and mentioned that they might be interested in collaborating with us. One idea is to integrate more tightly with https://github.com/coleygroup/pyscreener and https://github.com/coleygroup/molpal.

Seyone was unable to attend today’s call due to scheduling conflicts with school, but sent over his update for the past week. He worked on expanding the explanations in the ChemBERTa Colab tutorial, with LaTeX, covering transformers, attention, BERT, ChemBERTa, transfer learning, etc. He also added explanations of how the tokenizer regex works (PR).
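For readers curious what a regex-based SMILES tokenizer looks like, here is a minimal sketch using only Python's `re` module. The pattern below is a widely circulated SMILES tokenization regex. That it matches the one explained in the tutorial is an assumption, and the helper name `tokenize_smiles` is hypothetical.

```python
import re

# Assumed pattern: a commonly used SMILES tokenization regex. It may differ
# from the exact pattern discussed in the ChemBERTa tutorial.
SMILES_PATTERN = (
    r"(\[[^\]]+]|Br?|Cl?|N|O|S|P|F|I|b|c|n|o|s|p|\(|\)|\.|=|#|-|\+|\\|\/|:|~|@|\?|>>?|\*|\$|\%[0-9]{2}|[0-9])"
)
SMILES_REGEX = re.compile(SMILES_PATTERN)


def tokenize_smiles(smiles):
    """Split a SMILES string into tokens (hypothetical helper).

    Bracket atoms such as [NH4+] stay together as one token, and two-letter
    elements Br and Cl are matched before falling back to single letters.
    """
    return SMILES_REGEX.findall(smiles)


# Aspirin, written as SMILES.
print(tokenize_smiles("CC(=O)Oc1ccccc1C(=O)O"))
```

The alternation order matters: bracket atoms come first so that everything inside `[...]` stays one token, and `Br?` and `Cl?` greedily take the two-letter element when present, so `CCl` is split into `C` and `Cl` rather than three separate letters.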

Besides the tutorial, Seyone has been hacking on an RDKit physicochemical descriptor pre-training task for ChemBERTa. He has started on the base code but needs to figure out how to interface it with Hugging Face properly, and will likely reach out to them soon to discuss this.

Joining the DeepChem Developer Calls

As a quick reminder to anyone reading along, the DeepChem developer calls are open to the public! If you’re interested in attending either or both of the calls, please send an email to X.Y@gmail.com, where X=bharath, Y=ramsundar.