Google Summer of Code Ideas

DeepChem is part of Open Chemistry, which will be applying for Google Summer of Code this summer. We don’t yet know if we will be selected, but I wanted to start a thread to brainstorm ideas for potential student projects:

  • JaxModel: DeepChem currently doesn’t have a good way to build models with JAX. This project would add a JaxModel wrapper, in the style of KerasModel and TorchModel, that allows convenient wrapping of arbitrary JAX models in DeepChem. The project will involve implementing JaxModel, writing a suitable test suite, and putting together a good tutorial, as a Jupyter notebook, on how to use JAX with DeepChem.
  • PyTorch Lightning support: PyTorch Lightning is a popular framework for PyTorch. This project would look into enabling the easy construction of PyTorch Lightning based models for DeepChem.
  • Paper Implementation: This project is more open-ended. Pick a research paper that you like and implement it within DeepChem. For success in this project, you should reach out to me or other DeepChem community members for feedback and help in picking a suitable paper.
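
To make the JaxModel idea a bit more concrete, here is a rough, dependency-free sketch of the wrapper pattern: parameters live outside the model function, and the wrapper supplies `fit` and `predict`. Everything here (`FunctionalModelSketch`, `forward`, `mse_grad`, the plain gradient descent update) is hypothetical and not DeepChem's API. In a real JaxModel the hand-written `mse_grad` would presumably be replaced by `jax.grad` of a loss function.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple

Params = Dict[str, float]
Batch = Tuple[List[float], List[float]]  # (inputs, labels)


@dataclass
class FunctionalModelSketch:
    """Hypothetical wrapper: holds params plus pure functions over them."""
    params: Params
    forward_fn: Callable[[Params, float], float]
    grad_fn: Callable[[Params, Batch], Params]  # stand-in for jax.grad(loss)
    learning_rate: float = 0.1

    def fit(self, batches: Iterable[Batch], nb_epoch: int = 1) -> None:
        batches = list(batches)
        for _ in range(nb_epoch):
            for batch in batches:
                grads = self.grad_fn(self.params, batch)
                # Plain gradient descent; params are replaced, not mutated.
                self.params = {
                    name: value - self.learning_rate * grads[name]
                    for name, value in self.params.items()
                }

    def predict(self, xs: List[float]) -> List[float]:
        return [self.forward_fn(self.params, x) for x in xs]


def forward(params: Params, x: float) -> float:
    """Toy linear model y = w * x + b."""
    return params["w"] * x + params["b"]


def mse_grad(params: Params, batch: Batch) -> Params:
    """Hand-written gradient of mean squared error for the toy model."""
    xs, ys = batch
    errors = [forward(params, x) - y for x, y in zip(xs, ys)]
    n = len(xs)
    return {
        "w": 2.0 * sum(e * x for e, x in zip(errors, xs)) / n,
        "b": 2.0 * sum(errors) / n,
    }


# Toy data generated from y = 2x + 1.
data = [([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])]
model = FunctionalModelSketch({"w": 0.0, "b": 0.0}, forward, mse_grad)
model.fit(data, nb_epoch=500)
prediction = model.predict([4.0])[0]
```

The point of the sketch is only the shape of the interface: the user hands over pure functions and initial parameters, and the wrapper owns the training loop.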

If you have other suggestions for GSoC projects, please post them here!


Perhaps we could use GSoC as the opportunity to dedicate attention to the protein engineering side? Thinking of implementing papers like UniRep, or recent promising proteins + transformer work. Just a shot in the dark idea! :slight_smile:


To add one more idea, I’d love to see better support for semiconductor design in DeepChem. This new paper uses graph neural networks to work towards designing new semiconductors:

I’ve been writing a series of articles about the semiconductor industry which might be a good source of background information if you’re interested in this topic:

I’ve discussed this with a few people, but one idea would be to create a Model Hub like the one Hugging Face provides:
A scientific deep learning model hub would be very powerful, I think. It could be “seeded” with a few of DeepChem’s most popular models, and then users could upload new models and use them for transfer learning. This could be integrated with MoleculeNet later on as well, so models could be automatically evaluated across MoleculeNet tasks. It also leverages DeepChem’s positioning across multiple sciences, since I don’t really know of any other projects that are well-suited for this kind of hub.
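
As a rough illustration of the hub workflow described above (upload, look up, evaluate across benchmark tasks), here is a minimal in-memory sketch. All names (`ModelHub`, `HubEntry`, `upload`, `search`, `evaluate`) are hypothetical, and the "models" are plain Python callables standing in for real DeepChem models. A real hub would need storage, versioning, and actual MoleculeNet loaders.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence, Tuple

Model = Callable[[float], float]
Task = Sequence[Tuple[float, float]]  # (input, expected output) pairs


@dataclass
class HubEntry:
    name: str
    model: Model
    tags: List[str] = field(default_factory=list)


class ModelHub:
    """Hypothetical in-memory registry of named models."""

    def __init__(self) -> None:
        self._entries: Dict[str, HubEntry] = {}

    def upload(self, name: str, model: Model, tags: Sequence[str] = ()) -> None:
        if name in self._entries:
            raise ValueError(f"model {name!r} already exists")
        self._entries[name] = HubEntry(name, model, list(tags))

    def load(self, name: str) -> Model:
        return self._entries[name].model

    def search(self, tag: str) -> List[str]:
        return sorted(n for n, e in self._entries.items() if tag in e.tags)

    def evaluate(self, name: str, benchmarks: Dict[str, Task]) -> Dict[str, float]:
        """Mean absolute error of one model on every benchmark task."""
        model = self.load(name)
        return {
            task_name: sum(abs(model(x) - y) for x, y in task) / len(task)
            for task_name, task in benchmarks.items()
        }


hub = ModelHub()
hub.upload("double", lambda x: 2.0 * x, tags=["toy", "seed"])
hub.upload("identity", lambda x: x, tags=["toy"])

# Stand-in benchmark suite; real tasks would come from MoleculeNet.
benchmarks = {
    "task_a": [(1.0, 2.0), (2.0, 4.0)],
    "task_b": [(1.0, 3.0)],
}
scores = hub.evaluate("double", benchmarks)
```

Here `scores` maps each task name to an error value, which is the kind of table a hub could display automatically for every uploaded model.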

The official wiki is up at I’ve ported over some of these ideas to the wiki!

This is a really cool idea but might be a little too complex to pull off within GSoC. The new GSoC program is shortened (about a month and change for students) and this project is probably too hard. I’m really interested in making this happen though!

