DeepChem GSoC 2025 Project Ideas

Hi DeepChem Team,

I’m Grace Zhou, a grade 12 student from Beijing, China. I’m eager to participate in an open-source project to enhance my coding skills and build a strong foundation for my future college studies.

I have strong experience in programming with Python, C++, Java, and Swift, and I’ve worked on several projects involving neural networks and machine learning. Beyond computer engineering, I’m also passionate about chemistry and have hands-on experience with tools like Raspberry Pi, Arduino, and 3D printing, which have given me a solid foundation in interdisciplinary problem-solving.

I’m particularly interested in the “Layer Tutorials” and “Improving Support for Drug Formulations” projects. I enjoy creating clear and practical tutorials and have experience working with Jupyter/Colab notebooks. I’d love to help improve DeepChem’s documentation and contribute new layers that can benefit the community.

For the Drug Formulations project, I’m fascinated by the intersection of chemistry and computational methods. I’ve already reviewed the Modern Medicinal Chemist’s Guide to Formulations and am eager to explore how DeepChem can support the design of drug formulations.

As a first-time open-source contributor, I’m eager to learn, collaborate, and contribute to a project that bridges chemistry and technology. I look forward to hearing from you and discussing how I can contribute.

Best regards,

Grace Zhou

Dear DeepChem Team,

I’m YESHVANTH RAJU KURAPATI, a Master’s student in Computer Software Engineering at San Jose State University and a Software Development Engineer with over 1.5 years of industry experience in cloud-native AI systems. My background includes building scalable microservices using Kubernetes, Apache Kafka, AWS, and developing robust machine learning solutions with frameworks like TensorFlow and Hugging Face Transformers.

I’m excited to apply for GSoC 2025 under DeepChem and am particularly interested in the “PyTorch Porting and HuggingFace Style Easy Pre-trained Model Loading” project. I believe my hands-on experience with modern ML pipelines and CI/CD automation, along with my strong foundation in distributed systems, uniquely positions me to contribute to enhancing DeepChem’s capabilities.

I would appreciate any guidance on how to get started—whether that means exploring beginner-friendly issues or prepping a small prototype that aligns with the project goals. I’m eager to leverage my full-stack and cloud expertise to make a meaningful impact on DeepChem’s innovative projects.

Thank you for considering my application. I look forward to the possibility of contributing to your team!

Best regards,
YESHVANTH RAJU KURAPATI

Hey DeepChem Team,

I’m Priyanshu Anand, a Computer Science student with Python, ML, and Git experience. I’ve worked on projects like disease prediction and sentiment analysis. I went through the GSoC 2025 projects and am excited about the “Layer Tutorials” (Beginner Friendly) project. I’d love to help create new tutorials and improve DeepChem’s layer documentation. If there are any small tasks or resources you recommend for getting started, please let me know!

Thanks, and looking forward to contributing!

Hi DeepChem Team,

I’m Sahil Khan, and I’m really excited about the opportunity to contribute to DeepChem through GSoC 2025! Here are a couple of projects that caught my interest:

Implementing a Wishlist Model: I’d love to work on integrating a new model like Hamiltonian Neural Networks into DeepChem. My background in deep learning makes this a great fit for me.

Improve Equivariance Support: I’m also really interested in extending DeepChem’s equivariance support.

A bit about me
I’m currently a third-year B.Tech Computer Science student at Amity University, Uttar Pradesh, India, with experience in machine learning, various programming languages (Java, C++, Python), front-end web development, GitHub collaboration, and application design. I’ve contributed to many organizations through internships and love working on challenging problems in AI.

Next steps
To get up to speed, I plan to:

  • Explore the DeepChem codebase and work on small issues.
  • Join the DeepChem Slack/GitHub Discussions to connect with the community.
  • Dive deeper into the theoretical foundations of the models I want to work on.

I’d really appreciate any guidance on how I can best prepare. Looking forward to getting involved and contributing!

Hi Bharath and the DeepChem team,

I hope you’re doing well.

My name is Saad Ali, and I am a final-year Software Engineering student at NED University, Karachi, Pakistan. I am eager to contribute to DeepChem through Google Summer of Code (GSoC) 2025.

I have experience training and using pre-trained models and have worked with various machine learning frameworks. Currently, I am focusing on preserving model integrity.

I am particularly interested in the “HuggingFace-style Easy Pretrained-Model Load” project, as I believe my expertise aligns well with its objectives. I would love the opportunity to contribute and make a meaningful impact.

Here is my GitHub profile: saadkhi

Looking forward to your guidance on how I can get started.

Best regards,
Saad Ali

Hey DeepChem Team,

I’m Priyanshu Anand, a Computer Science student with experience in Python, ML, and Git. I’ve looked through the GSoC 2025 projects and am really excited about the “Layer Tutorials” project. I’d love to help create new tutorials and improve DeepChem’s layer documentation. If there are any small tasks or resources you recommend for getting started, please let me know!

Thanks, and looking forward to contributing!

Dear DeepChem Team,

I hope you’re doing well. My name is Priya Raj, and I am a second-year Computer Science and Engineering student at NIT Jamshedpur. I am highly interested in contributing to DeepChem as part of Google Summer of Code (GSoC) 2025, specifically to the Layer Tutorials (Small - 90 hours) project.

I have a strong background in machine learning and deep learning, with hands-on experience using TensorFlow and Keras. Additionally, I have worked on sentiment analysis and text classification projects, integrating ML models into web applications using Django and the MERN stack. My experience in both ML model development and deployment makes me confident in creating structured, high-quality tutorials that will help users understand and implement DeepChem’s layers effectively.

I am eager to contribute and would love to discuss how I can best align my skills with the project’s requirements. Please let me know the next steps or any prerequisites I should complete to get started.

Looking forward to your guidance!

Best regards,
Priya Raj
NIT Jamshedpur

Hi @bharath ,
I’m interested in the “Model-Parallel DeepChem Model Training” project. I’ve taken some time to get familiar with DeepChem and have submitted a PR to fix a compatibility issue: https://github.com/deepchem/deepchem/pull/4313.

I’ve also made initial efforts to implement a prototype of Distributed Data Parallel (DDP). You can check it out here: https://github.com/Force1ess/deepchem.
It includes:

  • A ParallelWrapper to convert DiskDataset into a PyTorch DataLoader with Distributed Data Sampler support.
  • A non-multi-GPU baseline.py and a multi-GPU test_ddp.py for comparison.
  • Testing on 4 GPUs showed a ~2.7x speedup (though this may vary due to non-exclusive GPU access).
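
For readers unfamiliar with distributed data sampling, here is a minimal pure-Python sketch of the idea behind the sampler mentioned above: each process (rank) reads a disjoint, strided slice of the dataset indices, with wrap-around padding so all ranks get the same number of samples. This is not the code from the linked repository, and it is not the PyTorch implementation itself; `shard_indices` and `parallel_efficiency` are hypothetical helper names chosen for illustration. The second helper just turns the reported speedup into a per-GPU efficiency figure.

```python
# Illustrative sketch only: hypothetical helpers, not DeepChem or PyTorch APIs.
from typing import List


def shard_indices(num_samples: int, world_size: int, rank: int) -> List[int]:
    """Return the sample indices one process (rank) would read, unshuffled."""
    if not 0 <= rank < world_size:
        raise ValueError("rank must be in [0, world_size)")
    if num_samples == 0:
        return []
    # Ceiling division so every rank receives the same number of samples.
    per_rank = -(-num_samples // world_size)
    total = per_rank * world_size
    indices = list(range(num_samples))
    # Pad by wrapping around to the start of the dataset.
    indices += [i % num_samples for i in range(total - num_samples)]
    # Strided slice: rank r takes positions r, r + world_size, ...
    return indices[rank:total:world_size]


def parallel_efficiency(speedup: float, num_devices: int) -> float:
    """Fraction of ideal linear scaling achieved."""
    return speedup / num_devices


# 10 samples split across 4 ranks; the last two ranks receive padded indices.
shards = [shard_indices(10, 4, rank) for rank in range(4)]
efficiency = parallel_efficiency(2.7, 4)
```

With the figures quoted above, a ~2.7x speedup on 4 GPUs corresponds to roughly 67.5% of ideal linear scaling, which is consistent with the caveat about non-exclusive GPU access.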

Moreover, I have a few questions to align with the project:

  • I noticed PyTorch Lightning is included in this project. Does the model-parallel implementation need to be built on it, or can I use native PyTorch (my preference)?
  • Some models use TensorFlow/JAX, but the idea list mentions converting to PyTorch. Does this mean I can focus solely on PyTorch-based model-parallel solutions?
  • What’s the expected scope for training acceleration? Which approaches should I prioritize (e.g., Data Parallel, DDP, FSDP, DeepSpeed)?

Thanks for your time—your answers will help me refine my proposal and better understand the project’s direction!

Dear DeepChem Team @bharath ,

I am Ayush Shaurya Jha, a pre-final year B.Tech student at IIIT Ranchi, specializing in Artificial Intelligence and Data Science. With a strong background in machine learning, deep learning, and large-scale model optimization, I have worked extensively with PyTorch, TensorFlow, JAX, and Hugging Face, developing and fine-tuning complex AI models. My research and projects focus on model parallelism, numerical optimization, and scalable architectures, making me highly interested in contributing to DeepChem’s advanced projects.

I am interested in contributing to any of the advanced projects “Model-Parallel DeepChem Model Training”, “HuggingFace-style easy pretrained-model Load”, “PyTorch Porting”, and “Implement a Wishlist Model”, as all of them align well with my skill set and experience.

Would love to hear from you!
Regards,
Ayush Shaurya Jha,
Email: shauryasphinx@gmail.com
Github: https://github.com/jhaayush2004

Dear DeepChem Team,

I hope you’re doing well. My name is Aditya M Patil, a second-year Computer Science and Engineering student. I am excited about the opportunity to contribute to DeepChem through Google Summer of Code (GSoC) 2025, particularly to the Layer Tutorials and Improving New Drug Modality Support projects.

I have experience in machine learning, deep learning, and data analysis, working with TensorFlow and PyTorch on predictive modeling and visualization projects. My background in technical writing and tutorial development makes me confident in creating structured learning resources to enhance DeepChem’s documentation and usability.

I would love to learn more about how I can contribute and would appreciate any guidance on the next steps or prerequisites. Looking forward to your response!

Best regards,
Aditya M Patil

Subject: Interest in GSoC 2025 – Numpy 2.0 Upgrade Project

Dear DeepChem Team,

I’m a third-year Electronics Engineering student and am excited to apply for GSoC 2025 under DeepChem. The Numpy 2.0 Upgrade project caught my attention as I want to gain hands-on experience with real-world software version migrations and debugging.

I have experience with Python and NumPy and have worked on projects involving data processing and AI applications. I’m eager to contribute, learn from the community, and improve my skills in open-source development.

Could you please guide me on how to get started and make meaningful contributions? Any resources or pointers would be greatly appreciated!

Looking forward to your response.

Best regards,
B Mounika

Hi DeepChem team!

My name is Yashna H, and I’m a student at UC Berkeley studying Chemical Biology and Computer Science. I’ve had extensive experience in molecular chemistry and machine learning, so I’m really excited for the opportunity to work on such interdisciplinary projects!

I’m particularly interested in the “Conversion of SMILES to IUPAC and IUPAC to SMILES” and “PyTorch Porting” projects. My current research is centered on gradient boosting, computer vision, and nanochemistry. I’d love to connect and discuss the projects and how I could contribute! Please let me know what the next steps would entail.

Best regards,
Yashna H

Hi DeepChem team,
I’m interested in applying for GSoC 2025 under DeepChem and have explored the project idea “Improve Equivariance Support”.

  • Why this project: I feel that equivariant neural networks have demonstrated significant improvements in modeling accuracy and generalization across scientific domains by explicitly encoding symmetry transformations (rotations, translations, permutations).
    I am excited about this project and, having some background in the area, would love to learn and contribute.

  • What I aim to do: I aim to build on and improve current research in this field (I would like to discuss my ideas and their technical feasibility with mentors) and to benchmark the resulting implementations against existing DeepChem baselines using standard datasets from MoleculeNet.

  • About me: I am a sophomore at IIT Madras, working in a lab under a professor to discover the underlying equations, symmetries, and invariances of physical systems. I am familiar with the mathematical concepts of equivariance, invariance, and group theory. I have done a literature survey in the relevant domain of this project and implemented multiple ideas.

I am not sure if this is the correct platform to connect.
@mentors (@Aryan, Riya, Nimisha, Bharath, Shreyas), I want to know more about the project, how to contribute, and where to connect.
Thank you, waiting for a reply.

Hi Aryan, Riya, Nimisha, Bharath and Shreyas,

I’m interested in working on improving equivariance support. I have a background in graph and group neural models, so I think I have the basics to start with, and I also expect to learn a lot.

Could you get me started on how to apply, the selection procedure and so on?

Best regards,

Subject: Project Interests – Numpy 2.0 Upgrade and PyTorch Porting

Dear Team,

I am Kushagra Prajapat, currently pursuing a B.E. in Information Technology. After reviewing the project ideas, I believe that I am well-suited for the following two projects:

  1. Numpy 2.0 Upgrade
  2. PyTorch Porting

I have been learning Python since 2021 and have built a solid foundation in the language. I am confident that these projects align perfectly with my skills and knowledge. I am excited about the opportunity to contribute and further enhance my expertise in these areas.

Thank you for considering my application.

Best regards,
Kushagra Prajapat

Hello @bharath, @shreyasvinaya sir,
I came across the SMILES to IUPAC and IUPAC to SMILES conversion projects and found them highly intriguing. Given my background in both chemistry and technology, I believe I can make meaningful contributions to this initiative.

In my past projects, I’ve worked on integrating chemical data with computational tools, which has given me experience in handling molecular structures and optimizing algorithmic workflows. I’ve already begun researching DeepChem’s molecular transformation capabilities, existing solutions for bidirectional conversion, and potential challenges in ensuring accuracy and efficiency.

I’d love to hear your insights on the best approaches or key challenges you foresee in this project. Are there any specific resources or references you’d recommend as I continue exploring this?

Best regards,
Anuj Singh

Hi DeepChem Team,
I am Almouthana Taha Khalfallah, an Applied Mathematics and Modeling Engineering student specializing in Data Science and AI, with a strong passion for coding and mathematical modeling. In addition to my current academic background, I completed two years of preparatory studies where I gained a solid understanding of chemistry and physics, which enhanced my analytical and problem-solving abilities.

I am particularly excited about the opportunity to contribute to the “Improve Equivariance Support” project, as it aligns perfectly with my academic interests and goals. With my knowledge in machine learning, deep learning, and mathematical modeling, as well as a solid foundation in mathematical concepts, I believe I can make meaningful contributions to this project. I am eager to further enhance my skills and knowledge while working on cutting-edge research that can contribute to advancements in AI models and their robustness.

I look forward to the opportunity to collaborate with your team and gain invaluable experience in this area.

Best regards,
Almouthana Taha Khalfallah
My portfolio: https://taha2053.github.io/ATK-LOG.github.io/

Hello DeepChem Team,

I look forward to contributing to these projects as I apply for GSoC’25: Layer Tutorials and Improving New Drug Modality Support. I am a beginner to open source, but I have plenty of experience in Python, Git, and Jupyter Notebooks, as required for these projects.

Please let me know how to contact the mentors so that I can get a deeper understanding of what needs to be done in these projects.

Thank you,
Jisnoo

Hello!
I am very interested in the Improve Equivariance Support project for GSoC 2025. I would love to learn more about the project, specifically about the technical challenges and knowledge required to contribute effectively.

To briefly introduce myself, my name is Matteo Bando, and I am currently pursuing a Master’s degree in Artificial Intelligence. I am highly motivated to apply my knowledge and contribute to the project’s success.

I would be happy to discuss the project further and provide any additional information. I have linked my CV for your reference.

Thank you for your time, and I look forward to hearing from you.

Best regards,
Matteo Bando

CV
LinkedIn Profile - you can even find my website here =)

Hi DeepChem Team,
I am CC, currently pursuing a master’s degree in information systems at Northeastern University, and I am really interested in the project HuggingFace-style Easy Pretrained-Model Load for GSoC 2025.
My experience and skills include:

  • Hugging Face API: I have hands-on experience integrating pretrained models and designing modular APIs. See TaskManager: a React Native mobile application built with Expo and TypeScript, integrated with Gen AI APIs.
  • NLP & Deep Learning: I have built CNN models (Presentation: Surface Crack Detection) and NLP-based text processing (used sentence transformers to curate biomedical terms across datasets).
  • Clinical Research & Survival Analysis: I’ve worked with R and SPSS on SEER datasets, developing predictive models and nomogram visualizations.
  • Git, Python, TensorFlow/PyTorch: Experience in open-source contributions, model fine-tuning, performance optimization, and making models portable across different hardware platforms.

This project will make it easier for researchers to apply models in computational chemistry and drug discovery. I am exploring the DeepChem codebase and will submit my first draft proposal soon. Looking forward to your guidance!

Best regards,
CC