Brainstorming GSoC 2022 Topics

I’m starting up this thread to brainstorm potential ideas for GSoC 2022 projects. (Note that we have not yet been selected and won’t know if we have till later!)

Edit: We have been selected for GSoC as part of Open Chemistry! https://summerofcode.withgoogle.com/programs/2022/organizations/open-chemistry

  • Layer Documentation: DeepChem is moving towards a concept of first-class layers. Improving the documentation for existing layers will make our current collection of layers more useful for the community. This project should also add a tutorial on using the layers to the DeepChem tutorial series, and should plan to add a few new layers as well.
  • PyTorch Porting: DeepChem is shifting towards using PyTorch as its primary backend, but many models are still implemented in TensorFlow. A good project could be to pick a TensorFlow model or two, then port its layers and model into PyTorch along with suitable unit tests. See https://github.com/deepchem/deepchem/issues/2863
  • HuggingFace Integration: Last year, we had a few student projects explore HuggingFace/DeepChem integration, but these projects were not able to merge HuggingFace models into DeepChem. This project would create a working HuggingFace model in DeepChem along with tutorials on how to use HuggingFace with DeepChem.
  • Implement a Wishlist Model: DeepChem has an extensive wishlist of models (https://github.com/deepchem/deepchem/issues/2680). Pick a model from the wishlist and implement it in DeepChem.
  • Improving our PINNs Support: One of the exciting new features in DeepChem 2.6.0 is support for PINNs (physics-informed neural networks), a class of techniques for solving PDEs with neural networks. The API for this class is still rudimentary: it supports only a limited class of models and requires hand-coding the loss. Extend the API to allow a broader class of PDEs to be implemented. I’d suggest using the Schrödinger equation as a test, since it can be solved in 1D as a toy and extended to arbitrarily high dimensions for larger molecules.
  • Improve Equivariant Support: DeepChem has no support for equivariant models. Given the increasing importance of equivariance for scientific machine learning, this is a major oversight. This project would aim to add a tutorial about equivariant modeling and ideally add an equivariant model to DeepChem.
  • Improving Antibody Support: DeepChem at present doesn’t have much tooling or support for working with antibodies. This project would add suitable antibody datasets to MoleculeNet and create a tutorial walking users through antibody design and modeling with DeepChem. If necessary, students may add antibody-specific models as well.
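
For anyone looking at the PINNs idea: the loss that currently has to be hand-coded is essentially the squared residual of the PDE at a set of collocation points. Here is a minimal sketch of that idea in plain Python. It does not use DeepChem's API. It assumes the 1D harmonic oscillator with ħ = m = ω = 1 as the toy Schrödinger problem, uses a finite difference where a real PINN would use automatic differentiation, and plugs in the known ground state instead of a neural network. The function names are made up for illustration.

```python
import math


def psi_exact(x):
    """Unnormalized ground state of the 1D harmonic oscillator (hbar = m = omega = 1)."""
    return math.exp(-0.5 * x * x)


def residual_loss(psi, energy, points, h=1e-3):
    """Mean squared residual of -1/2 psi'' + 1/2 x^2 psi - E psi over the points."""
    total = 0.0
    for x in points:
        # Central finite difference stands in for the autodiff a real PINN would use.
        d2 = (psi(x + h) - 2.0 * psi(x) + psi(x - h)) / (h * h)
        r = -0.5 * d2 + 0.5 * x * x * psi(x) - energy * psi(x)
        total += r * r
    return total / len(points)


# Collocation points on [-2, 2].
points = [-2.0 + 0.1 * i for i in range(41)]

# The exact ground state with E = 0.5 satisfies the equation, so the loss is near zero.
loss_true = residual_loss(psi_exact, 0.5, points)

# The same function with the wrong energy leaves a clearly nonzero residual.
loss_wrong = residual_loss(psi_exact, 1.0, points)

print(loss_true, loss_wrong)
```

In an actual PINN, `psi` would be a trainable network and this residual (plus boundary or normalization terms) would be minimized by gradient descent. A broader API would presumably let users supply the residual function rather than rewrite the whole loss.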

Community members, please add on more suggestions!

Hey!

I am an undergraduate sophomore majoring in Computational Natural Science.

As I was going through DeepChem’s work, I found the project for HuggingFace integration quite interesting. Having used HuggingFace in the past, this project would enable me to add another fantastic model to it!

It would be great if anyone could help me out on how I could get started on this project. Since the integration has already been explored, where can I find the relevant projects?

If this project is currently not being worked on, please let me know if there’s anything else I could do.

P.S. I have seen a couple of models on https://huggingface.co/DeepChem; since DeepChem provides various endpoints, does this project aim to generalize the workflow for adding HuggingFace interfaces for all the endpoints?

The initial goal for this project would be to wrap a HuggingFace model as a DeepChem model. For example, we would love to wrap ChemBERTa (https://github.com/seyonechithrananda/bert-loves-chemistry) as a DeepChem model.

@seyonec Do we have any example code that GSoC students can use to get started?

As a general note, until GSoC starts, we do allow multiple students to explore the same project. Once students are selected, we will work with them to coordinate their project efforts and avoid overlaps. I’d recommend just exploring your interests at this early stage and not worrying about whether someone is already working on it :slight_smile:

Heyy!
Thanks a lot.

I actually was wondering if there’s code that I could see from last year, as you said some students already explored it. That would be very helpful in getting started with the project!

I am an aspiring software engineer with strong problem-solving and strategic planning skills. Together with my classmates, I have completed a variety of projects, which has taught me how to collaborate with others on a project. I’m proficient in programming languages such as C, C++, Python and JavaScript, with a basic understanding of Java. I’ve worked with Django, DRF, and ReactJS, among various other frameworks, and am willing and curious to learn many other technologies.

I’m interested in the PyTorch Porting project and was wondering how I could start working towards a proposal. What would be some good first steps?

@bridyash13 Check out the list I just added at https://github.com/deepchem/deepchem/issues/2863. There are a lot of models left to convert. I’d recommend joining one of our developer calls if you’re interested in discussing (Joining the DeepChem Developer Calls)

Thank you for taking the time to reply. I’ll start getting to know the codebase and begin converting models asap.

Dear Bharath,

My name is Umang Pandey and I am a senior undergraduate student at IIT Kanpur. I have decent research experience in ML/AI, and I would love to be a part of GSoC this year and make some meaningful contributions to your organization. I am especially interested in the PyTorch Porting and Improving our PINNs Support projects. However, I am open and happy to work on other DeepChem projects. Please let me know if there are any opportunities or tasks available that might increase my chances of working with you. Also, what do you look for in an ideal candidate? Any advice on how to get started would be greatly appreciated.

Can you join a developer call to introduce yourself and discuss?

In general, if you are able to start meeting the community and discussing project ideas early, it will help a lot. Merging PRs into DeepChem takes effort, so the more practice the better :slight_smile:

Hi, Mr. Bharath,
I’m a pre-final year student at Government Engineering College, Erode.
I’m passionate about AI. At the same time, I love biology very much, so I’m exploring the intersection of both.
I have experience in ML and am also comfortable building NNs using PyTorch.
So I’m planning to work on the Improving Antibody Support project.

Hi, I am Sherif.

I am a 3rd-year PhD student in biochemistry. I am interested in ML and AI for drug discovery. I am interested in working on the Antibody Support project, and I wonder if someone could help me with this as a mentor.

Could you join the DeepChem office hours to discuss? Announcing the DeepChem Office Hours

It will be difficult for me since I am in Germany and 7 pm is too late for me.