I’m starting this thread to brainstorm ideas for GSoC 2022 projects. (Note that we have not yet been selected and won’t know whether we have been until later!)
Edit: We have been selected for GSoC as part of Open Chemistry! https://summerofcode.withgoogle.com/programs/2022/organizations/open-chemistry
- Layer Documentation: DeepChem is moving towards a concept of first-class layers. Improving the documentation for existing layers will make our current collection of layers more useful for the community. This project should also add a tutorial on using the layers to the DeepChem tutorial series, and should plan to add a few new layers as well.
- PyTorch Porting: DeepChem is shifting towards using PyTorch as its primary backend, but many models are still implemented in TensorFlow. A good project could be to pick a TensorFlow model or two, then port its layers and model into PyTorch along with suitable unit tests. See https://github.com/deepchem/deepchem/issues/2863
- HuggingFace Integration: Last year, we had a few student projects explore HuggingFace/DeepChem integration, but these projects were not able to merge HuggingFace models into DeepChem. This project would create a working HuggingFace model in DeepChem along with tutorials on how to use HuggingFace with DeepChem.
- Implement a Wishlist Model: DeepChem has an extensive wishlist of models (https://github.com/deepchem/deepchem/issues/2680). Pick a model from the wishlist and implement it in DeepChem.
- Improving our PINNs Support: One of the exciting new features in DeepChem 2.6.0 is support for PINNs (physics-informed neural networks), a class of techniques for solving PDEs with neural networks. The API for this class is still rudimentary: it supports only a limited class of models and requires handcoding the loss. Extend the API to allow a broader class of PDEs to be implemented. I’d suggest using the Schrödinger equation as a test, since it can be solved in 1D as a toy problem and extended to arbitrarily high dimensions for larger molecules.
- Improve Equivariant Support: DeepChem has no support for equivariant models. Given the increasing importance of equivariance for scientific machine learning, this is a major oversight. This project would aim to add a tutorial about equivariant modeling and ideally add an equivariant model to DeepChem.
- Improving Antibody Support: DeepChem at present doesn’t have much tooling or support for working with antibodies. This project would add suitable antibody datasets to MoleculeNet and create a tutorial walking users through antibody design and modeling with DeepChem. If necessary, students may add antibody-specific models as well.
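To make the PINNs idea above a bit more concrete, here is a rough sketch of what a handcoded loss for the 1D Schrödinger toy problem could look like. This is not DeepChem's API. The function names are made up for illustration, it assumes a particle in a box on [0, 1] with V = 0 in units where ħ = m = 1, and it uses a central finite difference in place of the automatic differentiation a real PINN would use. The trial function is the known analytic ground state rather than a neural network, so the snippet only shows how the loss is put together.

```python
import math


def schrodinger_residual(psi, energy, x, h=1e-3):
    """Residual of -0.5 * psi''(x) - E * psi(x), i.e. the V = 0 equation.

    Hypothetical helper: a central finite difference stands in for the
    autodiff derivative a real PINN would compute.
    """
    d2psi = (psi(x + h) - 2.0 * psi(x) + psi(x - h)) / (h * h)
    return -0.5 * d2psi - energy * psi(x)


def pinn_style_loss(psi, energy, n_points=50):
    """Mean squared PDE residual on interior points plus a boundary penalty."""
    # Evenly spaced collocation points strictly inside (0, 1).
    xs = [(i + 1) / (n_points + 1) for i in range(n_points)]
    pde_term = sum(schrodinger_residual(psi, energy, x) ** 2 for x in xs) / n_points
    # The wavefunction must vanish at the walls of the box.
    boundary_term = psi(0.0) ** 2 + psi(1.0) ** 2
    return pde_term + boundary_term


def ground_state(x):
    """Analytic ground state of the box, used here instead of a network."""
    return math.sin(math.pi * x)


exact_energy = math.pi ** 2 / 2.0
loss_exact = pinn_style_loss(ground_state, exact_energy)
loss_wrong = pinn_style_loss(ground_state, exact_energy + 1.0)
print(loss_exact, loss_wrong)
```

The loss should be close to zero for the exact energy and clearly larger for the shifted one. A more general API would presumably let users supply only the residual function and boundary conditions, with the collocation sampling and loss assembly handled by the library.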
Community members, please add on more suggestions!