Date: July 1st, 2021
Attendees: Bharath, Alana, Arun
Summary: Bharath has been working on GSoC work primarily.
Alana has been getting the FASTALoader changes ready and has been trying to get the PR ready for merge. Alana is currently working on sharding support.
Arun has been working on adding documentation to molecular featurizers but is running into some pymatgen vs tensorflow issues on numpy 1.20.
Bharath worked with Alana and Arun to review their PRs.
Date: June 25th, 2021
Attendees: Bharath, Atreya, Ashwin, Peter, Vignesh, James, Stanley, Seyone
Summary: Bharath gave the same update as at the last meeting.
MATFeaturizer has been merged in, and its contributor is working on a MoleculeNet loader for the dataset he needs.
DummyFeaturizer has been merged in, and its contributor is working to get the USPTOLoader merged in.
Vignesh’s deepchem[torch] build environment has been merged in, and Vignesh is now working on the deepchem[tensorflow] build environment. Vignesh plans to return to JaxModel work.
Seyone got a fix to SmilesTokenizer merged in and is working on getting the RobertaTokenizer merged in.
Peter has been reviewing issues on GitHub but was busy otherwise.
James has been busy with other work.
Stanley was a bit busy this week but had a chance to take a look at the
Bharath raised the question of how we should deal with Trainer-style APIs in DeepChem. HuggingFace uses Trainer (https://huggingface.co/transformers/_modules/transformers/trainer.html#Trainer) to implement a common trainer API. Seyone mentioned that features like data collation (https://github.com/seyonechithrananda/bert-loves-chemistry/blob/master/chemberta/utils/data_collators.py) and FP16 support are new to Trainer. Peter mentioned these could potentially be handled by
Joining the DeepChem Developer Calls
As a quick reminder to anyone reading along, the DeepChem developer calls are open to the public! If you’re interested in attending either or both of the calls, please send an email to X.Y@gmail.com, where X=bharath, Y=ramsundar.