Date: 5/15/2020
Attendees: Bharath, Seyone, Vignesh, Peter
Summary: As a procedural note, Bharath asked if we could move the DeepChem weekly calls to 3pm PST. Our new Google Summer of Code student is based in Japan, so our current 12:45 PST slot is very tricky for him to make. Bharath suggested a new time of 3pm PST, which everyone seems able to make, so we'll shift the calls to that time starting next week. Bharath noted that Daiki has posted a project introduction on the forums and a project roadmap on Github.
Bharath has continued work on the refactoring PR. He merged in the pretty print for Datasets PR that was split out and factored out another small PR of coordinate box utils. He's been working on cleaning up the design of the docking module and has most of it refactored. He's finishing that cleanup and hopes to send in PRs for it over the next week. Bharath noted that the earlier docking module was poorly designed and didn't have a clean API, so this will involve a major breaking change to reach a sensible API. Seyone mentioned he's interested in working with docking, so he'd like to play with the new API once it's designed.
Peter continued working on getting the slow tests passing again in this PR. Almost all the slow tests are now functional, with the exception of those from the docking submodule. Once the docking changes are upstreamed, we should be able to merge in these changes. Peter also spent some time working more seriously on speeding up the graph convolutions. It looks like a lot of our overhead was from handling the ConvMol objects and other pre-TensorFlow data munging. He hopes to have a PR up soon that addresses some of these issues.
Vignesh is currently working on a NeurIPS deadline so will be busy until June, but hopes to have more time to work on DeepChem development afterwards.
Seyone has been working on getting his ChemBERTa implementation working on multitask models. He hopes to benchmark it soon on Tox21 to see whether there are transfer learning boosts. He's also working on pretraining on a larger subset of ZINC to see if that results in improved predictive power for the models.
Bharath mentioned as a parting note that it looks like Amazon made a SageMaker tutorial for using DeepChem. However, it looks like there are a couple of issues with the tutorial, as noted in this issue. He asked whether anyone had used SageMaker before. Unfortunately it looks like no one has experience, but Seyone said he'd be interested in giving it a try to run some experiments. Bharath and Seyone will coordinate offline about experimenting with SageMaker.
As a quick reminder to anyone reading along, the DeepChem developer calls are open to the public! If you’re interested in attending, please send an email to X.Y@gmail.com, where X=bharath, Y=ramsundar.