DeepChem Minutes 4/24/2020

Date: 4/24/2020
Attendees: Bharath, Peter, Vignesh, Seyone
Summary: Bharath is continuing work on the examples PR. This PR has expanded over the last couple of weeks for a number of reasons. The biggest one is that DeepChem’s support for structure-based models and featurizations is much weaker than its support for small-molecule based predictions. Bharath is working to flesh out structure handling alongside the example improvements. Peter suggested that once the PR is ready to merge, it might be useful to break it into smaller sub-PRs for ease of review.

Peter spent some time this week looking at benchmarking for the graph convolutional classes, but this is still very early work. Peter and Bharath discussed some ways to improve the CI support for DeepChem. It looks like the biggest need is Windows installation support for DeepChem. Peter suggested using a Vagrant VM to test the Windows build locally, and then moving to CI once the local Windows build is functional. Azure Pipelines might be one option to handle Windows CI, although Travis CI also has alpha Windows support.

Seyone has put up his PR with a tutorial for using RoBERTa-style pretraining for molecular property prediction tasks (on Tox21). He’s planning to extend this technique to multi-task classification and test it more thoroughly in the coming weeks. Bharath will review the PR over the next couple of days.

Vignesh is planning to resume work on his transfer learning WIP PR. This PR adds a framework for doing Chemception-style pretraining on arbitrary MoleculeNet datasets/models. This could usefully expand DeepChem’s transfer learning support to allow for the use of graph-convolutional pretraining or other interesting combinations. Bharath and Vignesh brainstormed a few ways to improve pretrained model support by storing pretrained models on AWS S3, which could then be pulled down by library functions. Seyone mentioned HuggingFace’s loading API, which allows for similar functionality and might be a useful design to reuse. Bharath and Vignesh also discussed some torchchem improvements which might happen over the coming month or two.
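
To make the S3 idea a bit more concrete, here is a rough sketch of what a "pull down on first use, then cache locally" helper could look like. This is only an illustration of the general pattern discussed on the call, not DeepChem's or HuggingFace's actual API: the function name fetch_pretrained, the registry dictionary, the bucket URL, and the cache directory are all hypothetical placeholders.

```python
import os
import shutil
import urllib.request

# Hypothetical registry mapping model names to download URLs.
# The name and URL are placeholders, not real DeepChem assets.
PRETRAINED_URLS = {
    "example-model": "https://example-bucket.s3.amazonaws.com/example-model.tar.gz",
}


def _download(url, dest):
    """Stream the file at `url` to the local path `dest`."""
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)


def fetch_pretrained(name, cache_dir=None, registry=PRETRAINED_URLS,
                     download=_download):
    """Return a local path for the named model file, downloading it if needed.

    `download` is injectable so the caching logic can be exercised
    without any network access.
    """
    if name not in registry:
        raise KeyError(f"unknown pretrained model: {name!r}")
    if cache_dir is None:
        # Assumed default location; any writable directory would do.
        cache_dir = os.path.join(os.path.expanduser("~"), ".cache",
                                 "pretrained_sketch")
    os.makedirs(cache_dir, exist_ok=True)

    url = registry[name]
    dest = os.path.join(cache_dir, os.path.basename(url))
    if not os.path.exists(dest):
        # Download to a temporary name first so an interrupted transfer
        # never leaves a partial file at the final cached path.
        tmp = dest + ".part"
        download(url, tmp)
        os.replace(tmp, dest)
    return dest
```

A caller would write something like path = fetch_pretrained("example-model") and then hand the returned path to whatever model-loading code is appropriate. The second call with the same name returns the cached file without touching the network.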

As a quick reminder to anyone reading along, the DeepChem developer calls are open to the public! If you’re interested in attending, please send an email to X.Y@gmail.com, where X=bharath, Y=ramsundar.
